[COMPASS-7575] Investigate changes in SERVER-84440: Expose the number of replication waiters in serverStatus Created: 11/Jan/24 Updated: 22/Jan/24 Resolved: 16/Jan/24 |
|
| Status: | Closed |
| Project: | Compass |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | No version |
| Type: | Investigation | Priority: | Major - P3 |
| Reporter: | Backlog - Core Eng Program Management Team | Assignee: | Rhys Howell |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: | Developer Tools |
| Documentation Changes: | Not Needed | ||||||||
| Description |
|
Original Downstream Change Summary
This adds two metrics to the serverStatus.metrics section:
repl.waiters.replication exposes how many threads are waiting for a replicated and/or journaled write concern to resolve.
repl.waiters.opTime exposes how many threads are waiting for a local optime only.

Description of Linked Ticket
The replication waiters list can grow with the number of operations waiting for write concern. Advancing replication timestamps also requires updating all waiters in this list under a mutex. If the list is long, this can take a long time. It would be useful to be able to see how many operations are waiting for replication in this state, which would make it easier to diagnose problems in this area. |
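For anyone who wants to look at the two new counters, here is a minimal sketch of reading them from a serverStatus result that has already been fetched as a plain dictionary. It assumes the fields sit at metrics.repl.waiters.replication and metrics.repl.waiters.opTime, as the summary above describes. The helper name `read_repl_waiters` and the sample values are hypothetical and used only for illustration.

```python
from typing import Any, Mapping, Optional


def read_repl_waiters(server_status: Mapping[str, Any]) -> dict[str, Optional[int]]:
    """Return the two waiter counts from a serverStatus document.

    Assumes the layout described in the ticket summary:
    metrics.repl.waiters.{replication, opTime}. Servers that predate the
    change will not have these fields, so missing values come back as None.
    """
    waiters = (
        server_status.get("metrics", {})
        .get("repl", {})
        .get("waiters", {})
    )
    return {
        "replication": waiters.get("replication"),
        "opTime": waiters.get("opTime"),
    }


# Hypothetical sample documents; the numbers are made up for illustration.
sample_with_waiters = {
    "metrics": {"repl": {"waiters": {"replication": 3, "opTime": 1}}}
}
sample_without_waiters = {"metrics": {"repl": {}}}

print(read_repl_waiters(sample_with_waiters))
print(read_repl_waiters(sample_without_waiters))
```

With a driver such as PyMongo, the input dictionary would typically come from something like `client.admin.command("serverStatus")`. Returning None for absent fields keeps the helper usable against servers that do not include this change.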
| Comments |
| Comment by Rhys Howell [ 16/Jan/24 ] |
|
No devtools product changes are needed. We expose the results of serverStatus in the shell, and Compass uses some of that information on the performance page. We don't do any parsing that this change would affect, and it doesn't sound like we want to add explicit parsing to show these metrics. |
| Comment by PM Bot [ 11/Jan/24 ] |
|
Fix Version updated for upstream |