[SERVER-44522] serverStatus metrics for awaitable isMaster Created: 08/Nov/19 Updated: 08/Jan/24 Resolved: 21/Feb/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.4 |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | Jason Chan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Sprint: | Repl 2020-02-10, Repl 2020-02-24 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 20 | ||||||||||||||||
| Description |
|
Add to the "connections" document in serverStatus:
|
| Comments |
| Comment by Githook User [ 21/Feb/20 ] | ||||||||||||||||||
|
Author: {'name': 'Jason Chan', 'username': 'jasonjhchan', 'email': 'jason.chan@mongodb.com'}Message: | ||||||||||||||||||
| Comment by Githook User [ 21/Feb/20 ] | ||||||||||||||||||
|
Author: {'name': 'Ben Caimano', 'email': 'ben.caimano@10gen.com'}Message: Revert " This reverts commit b4915c29d4848439e23857c45fd3fcf94622b015. | ||||||||||||||||||
| Comment by Githook User [ 20/Feb/20 ] | ||||||||||||||||||
|
Author: {'name': 'Jason Chan', 'username': 'jasonjhchan', 'email': 'jason.chan@mongodb.com'}Message: | ||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 04/Feb/20 ] | ||||||||||||||||||
|
Thank you for the advice! Your suggestion sounds good to me. | ||||||||||||||||||
| Comment by Mira Carey [ 04/Feb/20 ] | ||||||||||||||||||
|
Some thoughts on that design:
So basically, my pitch is to do:
| ||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 03/Feb/20 ] | ||||||||||||||||||
|
The implementation of the exhaustIsMaster metric is in an unfamiliar area of the code for me, so I'd like to post a design here and get feedback before I start typing. mira.carey@mongodb.com, would you be willing to provide feedback on this design? The exhaustIsMaster metric is defined as the number of connections whose last request was an isMaster with exhaustAllowed. I plan to add a class IsMasterStats that contains an AtomicWord<size_t> _exhaustIsMasterConnections (it will also track metrics for awaitingTopologyChanges, but I won't design that here). I will add a new field _exhaustIsMaster to transport::Session. Whenever this field is toggled (including on destruction), I will increment/decrement IsMasterStats::_exhaustIsMasterConnections (alternatively, instead of adding logic to the transport::Session destructor, I can add logic to the ServiceStateMachine cleanup hook). I will also add a field exhaustIsMaster to DbResponse, which will be set in CmdIsMaster::runWithReplyBuilder() on mongod and mongos. ServiceStateMachine::_processMessage will copy the field DbResponse::exhaustIsMaster to transport::Session::_exhaustIsMaster. | ||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 29/Jan/20 ] | ||||||||||||||||||
|
I agree that the two metrics as currently designed will help in diagnosing those issues. Thank you for weighing in. | ||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 29/Jan/20 ] | ||||||||||||||||||
|
tess.avitabile, I don't have any further insights and I'm happy to defer to Jesse's judgement. | ||||||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 29/Jan/20 ] | ||||||||||||||||||
|
Thinking again, perhaps "number of clients awaiting topology changes without exhaust" is not an important metric, so the sentence "To count the number of clients awaiting topology changes without exhaust, subtract the second metric from the first" is not just inaccurate but also irrelevant. The purpose I imagine for these metrics is to help diagnose any issues that arise from the new protocol. E.g., if we saw exhaustIsMaster growing ever larger we'd suspect the server wasn't detecting client disconnects, or if we saw CPU usage rise in tandem with awaitingTopologyChanges we'd suspect a bug in ReplicationCoordinatorImpl::awaitIsMasterResponse. So I propose we keep these two metrics, awaitingTopologyChanges and exhaustIsMaster, but we don't care about which clients are counted by both or neither at any moment. | ||||||||||||||||||
| Comment by Tess Avitabile (Inactive) [ 28/Jan/20 ] | ||||||||||||||||||
|
I'm concerned that with the way the metrics are specified, exhaustIsMaster will not always be smaller than awaitingTopologyChanges, so subtracting them will not make sense. This is because a connection that is being used for exhaust isMaster request will not always be in the state of waiting for topology changes, for example when it is building a response. We could ensure exhaustIsMaster is at most awaitingTopologyChanges by defining it as "number of clients currently waiting in isMaster (with exhaust) for a topology change". However, the downside would be that these connections blip in and out of this metric, which may be undesirable. jesse, did I understand the intention of these metrics? Do you or bruce.lucas have any guidance on them? | ||||||||||||||||||
| Comment by Bruce Lucas (Inactive) [ 23/Jan/20 ] | ||||||||||||||||||
|
Can you summarize on this ticket the proposed metrics? | ||||||||||||||||||
| Comment by A. Jesse Jiryu Davis [ 08/Nov/19 ] | ||||||||||||||||||
|
Once mongod and mongos have exhaust awaitable isMaster, implement metrics to count the clients that use it. |