[SERVER-55055] Add cumulative metric for the total refreshed connections Created: 09/Mar/21  Updated: 29/Oct/23  Resolved: 13/Jan/22

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 5.3.0

Type: Improvement Priority: Major - P3
Reporter: Amirsaman Memaripour Assignee: Daniel Morilha (Inactive)
Resolution: Fixed Votes: 0
Labels: servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File always-refresh-after-timeout.png     PNG File checkout-delay-1.png     PNG File checkout-delay-2.png     PNG File checkout-delay-3.png     PNG File connection-refresh.png     PNG File no-refresh-for-active.png     PNG File pending-refresh.png    
Issue Links:
Problem/Incident
Related
Backwards Compatibility: Fully Compatible
Sprint: Service Arch 2021-03-22, Service Arch 2021-04-05, Service Arch 2021-04-19, Service Arch 2021-06-14, Service Arch 2021-06-28, Service Arch 2021-07-12, Service Arch 2022-1-10, Service Arch 2022-1-24
Participants:
Linked BF Score: 173
Story Points: 3

 Description   

One of the metrics reported by the ConnectionPool is totalRefreshing, the number of connections currently scheduled for a refresh. A connection may be scheduled for a refresh after hitting a timeout or after being returned to the pool.

It appears that the connection pool may not accurately report the number of connections scheduled for a refresh. For example, the attached screenshots show an increase in the number of hello commands (due to refreshing connections) that totalRefreshing does not accurately reflect.

This ticket should investigate this possibility and provide fixes if necessary.

This ticket should introduce a new, cumulative metric that tracks the total number of refreshed connections (e.g., totalRefreshed).
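
A minimal sketch of how the two metrics relate, assuming a toy stats object. The names ToyPoolStats, begin_refresh, and finish_refresh are hypothetical and are not the ConnectionPool's actual implementation. The point is only that totalRefreshing rises and falls, while a cumulative totalRefreshed never decreases.

```python
from dataclasses import dataclass


@dataclass
class ToyPoolStats:
    """Hypothetical stand-in for connection pool stats (not the server's type)."""

    total_refreshing: int = 0  # gauge: connections currently pending a refresh
    total_refreshed: int = 0   # counter: refreshes completed so far

    def begin_refresh(self) -> None:
        # A connection is scheduled for a refresh.
        self.total_refreshing += 1

    def finish_refresh(self) -> None:
        # The refresh completed: the gauge drops, the counter grows.
        self.total_refreshing -= 1
        self.total_refreshed += 1


stats = ToyPoolStats()
for _ in range(3):
    stats.begin_refresh()
    stats.finish_refresh()
```

After the three refreshes complete, the gauge is back at zero and only the counter still records that they happened.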



 Comments   
Comment by Githook User [ 13/Jan/22 ]

Author:

{'name': 'Daniel Vitor Morilha', 'email': 'daniel.morilha@mongodb.com', 'username': 'daniel-mdb'}

Message: SERVER-55055 Add cumulative metric for the total refreshed connections
Branch: master
https://github.com/mongodb/mongo/commit/facb8a30715fcb91c73a525aa0f6f5c0e1f83aa1

Comment by Daniel Morilha (Inactive) [ 11/Jan/22 ]

Addressing a potential test failure on the multiversion variant, where the newly introduced connection pool stat totalRefreshed isn't present in versions prior to the unreleased 5.3.

PR: https://github.com/10gen/mongo/pull/2378

Evergreen patch: https://spruce.mongodb.com/version/61ddf28832f4170764bce814/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC
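
A rough sketch of the kind of guard a mixed-version check might need, assuming the stats arrive as a plain dictionary. The helper names and the sample documents are hypothetical and hand-written for illustration; this is not the code from the PR.

```python
from typing import Optional


def read_total_refreshed(pool_stats: dict) -> Optional[int]:
    """Return the cumulative counter, or None when the binary predates it."""
    return pool_stats.get("totalRefreshed")


def refreshed_delta(before: dict, after: dict) -> Optional[int]:
    """Refreshes between two snapshots; None if either side lacks the field."""
    old = read_total_refreshed(before)
    new = read_total_refreshed(after)
    if old is None or new is None:
        return None  # older binary in a mixed-version cluster: skip the check
    return new - old


# Hand-written sample documents, not captured server output.
old_binary = {"totalInUse": 1, "totalRefreshing": 0}
new_before = {"totalInUse": 1, "totalRefreshing": 0, "totalRefreshed": 4}
new_after = {"totalInUse": 1, "totalRefreshing": 1, "totalRefreshed": 7}
```

Returning None rather than 0 for a missing field keeps "no refreshes happened" distinct from "this binary does not report the metric".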

Comment by Githook User [ 10/Jan/22 ]

Author:

{'name': 'Daniel Vitor Morilha', 'email': 'daniel.morilha@mongodb.com', 'username': 'daniel-mdb'}

Message: Revert "SERVER-55055 Add cumulative metric for the total refreshed connections"

This reverts commit 4ae8b3f29485b8a25877b8fd7e67787e9b3996de.
Branch: master
https://github.com/mongodb/mongo/commit/14a754e2511ff668b4d61e7598ec3ce088cdcc9a

Comment by Githook User [ 06/Jan/22 ]

Author:

{'name': 'Daniel Vitor Morilha', 'email': 'daniel.morilha@mongodb.com', 'username': 'daniel-mdb'}

Message: SERVER-55055 Add cumulative metric for the total refreshed connections
Branch: master
https://github.com/mongodb/mongo/commit/4ae8b3f29485b8a25877b8fd7e67787e9b3996de

Comment by Daniel Morilha (Inactive) [ 30/Dec/21 ]

GitHub PR: SERVER-55055 Add cumulative metric for the total refreshed connections (Pull Request #2378, 10gen/mongo)

Comment by Amirsaman Memaripour [ 11/Mar/21 ]

I wasn't able to find any inconsistencies in values reported by totalRefreshing during my investigations. To address possible sampling issues for values reported for connections pending a refresh, my recommendation is to add a new, cumulative metric to connection pools that tracks the total number of refreshed connections (totalRefreshed).

Comment by Bruce Lucas (Inactive) [ 09/Mar/21 ]

It might be useful to add a "totalRefreshed" metric as you did in the associated HELP ticket. Generally cumulative metrics are more useful for performance investigations than instantaneous metrics because they're less subject to sampling error.
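
To illustrate the sampling point, here is a small simulation under assumed numbers: refreshes last 2 ticks and stats are sampled every 10 ticks. The function simulate and all of its parameters are hypothetical. Every refresh starts and finishes between two samples, so the sampled gauge never sees one, while the sampled counter still accounts for all of them.

```python
def simulate(ticks, sample_every, refresh_starts, refresh_duration=2):
    """Return (gauge_samples, counter_samples) for a toy refresh schedule."""
    refreshing = 0   # instantaneous: refreshes in flight right now
    refreshed = 0    # cumulative: refreshes completed so far
    finish_at = {}   # tick -> number of refreshes completing at that tick
    gauge_samples, counter_samples = [], []
    for t in range(ticks):
        # Complete any refreshes that are due at this tick.
        done = finish_at.pop(t, 0)
        refreshing -= done
        refreshed += done
        # Start a new refresh if one is scheduled for this tick.
        if t in refresh_starts:
            refreshing += 1
            end = t + refresh_duration
            finish_at[end] = finish_at.get(end, 0) + 1
        # The monitoring system only observes the values at sample points.
        if t % sample_every == 0:
            gauge_samples.append(refreshing)
            counter_samples.append(refreshed)
    return gauge_samples, counter_samples


gauge, counter = simulate(ticks=31, sample_every=10, refresh_starts={3, 14, 27})
print(gauge)    # [0, 0, 0, 0]
print(counter)  # [0, 1, 2, 3]
```

Differences between consecutive counter samples give the number of refreshes per sampling interval, regardless of how short each refresh was.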
