[SERVER-64965] Count the number of operations that fail due to timing out waiting to acquire a connection Created: 25/Mar/22 Updated: 29/Oct/23 Resolved: 20/Sep/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 6.2.0-rc0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | George Wangensteen | Assignee: | Reo Kimura (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Sprint: | Service Arch 2022-05-16, Service Arch 2022-05-30, Service Arch 2022-06-13, Service Arch 2022-06-27, Service Arch 2022-07-11, Service Arch 2022-07-25, Service Arch 2022-08-08, Service Arch 2022-08-22, Service Arch 2022-09-05, Service Arch 2022-09-19, Service Arch 2022-10-03 | ||||||||
| Participants: | |||||||||
| Description |
|
When bursts of operations all need a connection to perform an RPC, our connection pools don't always hold enough pooled connections to service them. The operations then bottleneck behind connection establishment and, in more extreme cases, fail because they reach their maxTimeMS limit while waiting to acquire a connection. To better understand when our connection pooling infrastructure contributes to user-facing workload degradation, let's add a counter for operations that fail due to timing out while waiting to acquire a connection. This counter should be reported in FTDC. Additionally, let's log how long each operation that fails for this reason spent waiting to acquire a connection, so we can check whether an unreasonable amount of time was spent waiting. |
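To make the request concrete, here is a minimal sketch of the two pieces asked for above: a counter that increments when an operation times out waiting for a connection, and a log line recording how long that operation waited. It is an illustration in Python, not the server's implementation. Every name in it (`TinyPool`, `AcquireTimeout`, `acquireTimeouts`, the logger name) is hypothetical, and a semaphore stands in for the real connection pool.

```python
import logging
import threading
import time

log = logging.getLogger("conn_pool_sketch")  # hypothetical logger name


class AcquireTimeout(Exception):
    """Raised when an operation's deadline expires before a connection is free."""

    def __init__(self, waited_secs):
        super().__init__(f"timed out after waiting {waited_secs:.3f}s for a connection")
        self.waited_secs = waited_secs


class TinyPool:
    """Toy pool: a semaphore stands in for a fixed set of pooled connections."""

    def __init__(self, size):
        self._slots = threading.Semaphore(size)
        self._lock = threading.Lock()
        self._acquire_timeouts = 0  # the counter the ticket asks for

    def acquire(self, deadline_secs):
        """Wait up to deadline_secs for a connection; return the time spent waiting."""
        start = time.monotonic()
        got_slot = self._slots.acquire(timeout=deadline_secs)
        waited = time.monotonic() - start
        if not got_slot:
            with self._lock:
                self._acquire_timeouts += 1
            # Record the wait time so an unreasonable wait is visible afterwards.
            log.warning("operation timed out acquiring a connection; waited %.3fs", waited)
            raise AcquireTimeout(waited)
        return waited

    def release(self):
        self._slots.release()

    def stats(self):
        """Snapshot that a periodic diagnostics collector could sample."""
        with self._lock:
            return {"acquireTimeouts": self._acquire_timeouts}


if __name__ == "__main__":
    pool = TinyPool(size=1)
    pool.acquire(deadline_secs=0.05)      # takes the only connection
    try:
        pool.acquire(deadline_secs=0.05)  # second caller waits, then times out
    except AcquireTimeout as exc:
        print("failed:", exc)
    print(pool.stats())
```

The counter is bumped only on the timeout path, so it isolates failures caused by waiting on the pool from failures that happen after a connection was obtained.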
| Comments |
| Comment by Githook User [ 20/Sep/22 ] |
|
Author: {'name': 'Reo Kimura', 'email': 'reo.kimura@mongodb.com', 'username': 'rkimura21'}
Message: |