[SERVER-55638] Diagnostics for time spent sending responses to ingress queries Created: 30/Mar/21 Updated: 20/Sep/22 Resolved: 20/Sep/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Bruce Lucas (Inactive) | Assignee: | Matt Diener (Inactive) |
| Resolution: | Won't Do | Votes: | 3 |
| Labels: | sa-remove-fv-backlog-22 |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Sprint: | Service Arch 2022-06-13, Service Arch 2022-06-27, Service Arch 2022-07-25, Service Arch 2022-08-08, Service Arch 2022-08-22, Service Arch 2022-09-19, Service Arch 2022-10-03 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Sending the response for an operation to the client is done after we finish timing the operation, so if there is a delay due to network congestion it does not show up in logged slow queries or in active connections, making it difficult to diagnose. Some suggested improvements:
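To make the gap concrete, here is a minimal sketch (all names invented for illustration; this is not server code): since the operation's timer has already stopped before the response is sent, only a separate timer around the socket write can surface transmission delay.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Illustrative only: time the network send separately from the operation,
// because the slow-query timer has already stopped by this point.
long long timeSendMillis(const std::function<void()>& sendResponse) {
    using namespace std::chrono;
    auto start = steady_clock::now();
    sendResponse();  // the actual write of the response to the client socket
    return duration_cast<milliseconds>(steady_clock::now() - start).count();
}
```

A "Slow transmission" log line, as later proposed in the comments, would then compare this value against a threshold analogous to slowms.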
|
| Comments |
| Comment by Matt Diener (Inactive) [ 20/Sep/22 ] |
|
See SERVER-63883; the end result is the same, and this change is being made to better reflect the work that's being done. I prefer keeping the original tickets' descriptions as-is for future reference. |
| Comment by Daniel Morilha (Inactive) [ 29/Jun/22 ] |
|
After some more discussion within the team, I was advised to move away from the histograms, as they might bring considerable overhead without good justification, and to instead focus on easy-to-digest FTDC aggregate metrics plus a log line in case a write operation goes beyond a parameterized threshold. I also used the opportunity, following what Sam and Billy recommended, to lay out a new structure for future ServiceEntryPoint / ServiceStateMachine stats. During the meeting, we informally went through performance requirements for reporting metrics and how their added contention might have an undesired impact on other projects. |
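A rough sketch of what such aggregate metrics plus a parameterized threshold log could look like (all identifiers are hypothetical; the actual FTDC and logging plumbing is not shown):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical aggregate stats: cheap relaxed counters suitable for periodic
// FTDC sampling, plus a log line when a single write exceeds a threshold.
struct EgressWriteStats {
    std::atomic<int64_t> writesCompleted{0};
    std::atomic<int64_t> totalWriteMillis{0};
    int64_t slowTransmissionThresholdMillis = 100;  // parameterized threshold

    void onWriteCompleted(int64_t millis) {
        writesCompleted.fetch_add(1, std::memory_order_relaxed);
        totalWriteMillis.fetch_add(millis, std::memory_order_relaxed);
        if (millis >= slowTransmissionThresholdMillis)
            std::printf("Slow transmission: durationMillis=%lld\n",
                        static_cast<long long>(millis));
    }
};
```

The relaxed atomics keep the hot path cheap, which matches the contention concern raised in the meeting.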
| Comment by Bruce Lucas (Inactive) [ 23/Jun/22 ] |
|
daniel.morilha@mongodb.com, provided the histograms are excluded from FTDC as discussed, that sounds fine to me. |
| Comment by Daniel Morilha (Inactive) [ 23/Jun/22 ] |
|
bruce.lucas@mongodb.com after taking into consideration what others have suggested, I decided to go with two histograms: one to break down how long completed write operations took, and one for how long in-progress write operations have taken so far. Besides, we can still keep a log message for connections which go above a certain threshold. Please let me know what you think about this, and if you wish to see anything extra out of this effort. Thanks. Here's the progress so far: https://github.com/10gen/mongo/compare/master...daniel-mdb:daniel.morilha/SERVER-55638?expand=1 |
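A minimal sketch of the kind of fixed-bucket latency histogram under discussion (exponential millisecond buckets; purely illustrative, not the server's implementation). One instance could track completed write durations and a second the elapsed time of in-progress writes:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative lock-free histogram: bucket index is roughly floor(log2(ms)),
// so bucket 0 is <=1ms, bucket 1 is 2-3ms, bucket 2 is 4-7ms, and so on.
class LatencyHistogram {
public:
    void record(int64_t millis) {
        _buckets[bucketFor(millis)].fetch_add(1, std::memory_order_relaxed);
    }
    int64_t countIn(std::size_t bucket) const {
        return _buckets[bucket].load(std::memory_order_relaxed);
    }

private:
    static std::size_t bucketFor(int64_t millis) {
        std::size_t b = 0;
        while (millis > 1 && b + 1 < kNumBuckets) {
            millis >>= 1;
            ++b;
        }
        return b;
    }
    static constexpr std::size_t kNumBuckets = 12;  // ~1ms .. ~2s, then overflow
    std::array<std::atomic<int64_t>, kNumBuckets> _buckets{};
};
```

Per-record cost is a single relaxed increment, which is the overhead the team later weighed against plain aggregate counters.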
| Comment by Daniel Morilha (Inactive) [ 21/Jun/22 ] |
|
Just met with Sam; he recommended looking into ServiceContext decorations and possibly registering the counters there. Will keep this thread posted. |
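For context, the decoration mechanism lets a subsystem attach per-ServiceContext state without modifying the class itself; the server's API for this is ServiceContext::declareDecoration<T>(). Below is a simplified standalone analogue of the pattern (all names invented): a registry hands out a slot per declared type, and each context instance materializes one value per slot.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Simplified decoration registry (illustrative, not the server's code).
struct DecorationRegistry {
    template <typename T>
    std::size_t declare() {
        // Captureless lambda converts to a plain function pointer.
        _makers.push_back([]() -> std::shared_ptr<void> { return std::make_shared<T>(); });
        return _makers.size() - 1;
    }
    std::vector<std::shared_ptr<void> (*)()> _makers;
};

// Each context owns one instance of every declared decoration.
struct Context {
    explicit Context(const DecorationRegistry& reg) {
        for (auto make : reg._makers)
            _slots.push_back(make());
    }
    template <typename T>
    T& decoration(std::size_t slot) {
        return *static_cast<T*>(_slots[slot].get());
    }
    std::vector<std::shared_ptr<void>> _slots;
};
```

Registering the counters this way keeps them owned by the ServiceContext rather than by any one session or the SEP.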
| Comment by Daniel Morilha (Inactive) [ 21/Jun/22 ] |
|
Thanks billy.donahue@mongodb.com for the thoughtful perspective; it makes sense. I guess the justification for the approach I initially took was related to where these metrics should be persisted and how they need to be synchronized. Previously I thought that, in order to guarantee the least amount of contention, storing an AtomicWord in ServiceEntryPointImpl would not only require less synchronization but also centralize a metric that in fact belongs to all Sessions. I definitely agree that tracking both the number of writes started and the number of writes completed gives a better view of the problem, but the question about where to store them is still valid. Could they be declared as statics inside SSM, or would they fit better in ServiceEntryPointImpl::Sessions, and if so, how would the state machine communicate with the Sessions instance? |
| Comment by Billy Donahue [ 17/Jun/22 ] |
|
I'm not revamping the whole transport layer. I'm refactoring ServiceStateMachine, which has some collateral effect on service_entry_point_impl, which creates and destroys ServiceStateMachine objects. Removing the _state data member of SSM doesn't mean it's stateless; it just means we aren't trying to represent its state in an enum. I don't think the SSMs should be trying to keep a counter in SEPImpl up to date. Most of the time nobody will be looking at it, and it will become a contended atomic when the system is running smoothly. The SEP appendStats function should instead pull the information from a
As to what we're trying to measure: from a queueing perspective, just knowing the total number of sessions that are currently writing does not tell you whether the queue is stuck. It's better if metrics don't decrement. You need to know the number of writes started and the number of writes completed. The difference between these numbers is how many are in progress, but if that difference remains steady while the numbers are rising, then you would see that there's perhaps a deep queue but that outgoing traffic is arriving and leaving. If there were several truly stuck operations, you still wouldn't know about it even with two scalar metrics. You'd need something like a histogram. |
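The two-monotonic-counters idea can be sketched as follows (names illustrative): both counters only ever increase, and the in-progress count is derived at sampling time, so two consecutive FTDC samples reveal throughput as well as queue depth.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Monotonic counters: a stuck queue shows up as "started" rising while
// "completed" stalls, which a single decrementing gauge would hide.
struct WriteCounters {
    std::atomic<int64_t> started{0};
    std::atomic<int64_t> completed{0};

    void onWriteStarted()   { started.fetch_add(1, std::memory_order_relaxed); }
    void onWriteCompleted() { completed.fetch_add(1, std::memory_order_relaxed); }

    // Sampled by an appendStats-style reporter; never decrements.
    int64_t inProgress() const { return started.load() - completed.load(); }
};
```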
| Comment by Daniel Morilha (Inactive) [ 17/Jun/22 ] |
|
billy.donahue@mongodb.com amirsaman.memaripour@mongodb.com here's a proposed patch, already on top of the branch of Billy's you've been discussing. We can meet
|
| Comment by Daniel Morilha (Inactive) [ 17/Jun/22 ] |
|
amirsaman.memaripour@mongodb.com I spent a portion of the morning investigating your suggestion, and I think tracking the number of ingress sessions that are actively writing is feasible. I will need some help understanding how to go from ServiceStateMachine to FTDC metrics. I noticed there is a NetworkCounters global, but that reports in serverStatus; would it be an alternative instead? |
| Comment by Daniel Morilha (Inactive) [ 17/Jun/22 ] |
|
Thanks amirsaman.memaripour@mongodb.com, I will consider your suggestion. In the meantime, I am glad to report I ran a modified version of the code overnight to check whether I'm going in the right direction so far. Here are the proposed logs as they appeared from running the full core suite (please be aware slow queries were forced, logs weren't changed, and no lines in between were suppressed):
There were 6150 "Slow transmissions" with "durationMillis" equal to 1, and 90 where "durationMillis" was between 2 and 9. So in general the difference between the two is almost negligible on an idle virtual host. Patch 200 has the same changes. |
| Comment by Bruce Lucas (Inactive) [ 16/Jun/22 ] |
|
The log line looks fine as far as it goes. I assume that the "Slow transmission" message will also have an attr.durationMillis attribute? One shortcoming of the log is that it will presumably have a threshold, I imagine the same as slowms, so might not effectively show all issues caused by slow transmission, just as the log does not show all slow query issues. Would it be easy to also do the third option, add a "transmitting" state alongside the "active" and "inactive" connection states? This way a large number of connections in "transmitting" state indicates a particular kind of problem, just as a large number of connections in "active" state does. What information from the slow query log would you reproduce? Doesn't it require an operation context? Or, if it's possible to reproduce it, wouldn't it be possible to include the transmission time in the slow query log itself? |
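An explicit "transmitting" connection state, the third option discussed here, might look something like the following standalone sketch (illustrative only; the server's actual connection-state reporting differs):

```cpp
#include <cassert>
#include <cstring>

// Per-connection state including "transmitting" alongside active/inactive,
// so a pile-up of transmitting connections is visible in diagnostics.
enum class ConnState { kInactive, kActive, kTransmitting };

// Name used when reporting the connection's state.
const char* stateName(ConnState s) {
    switch (s) {
        case ConnState::kInactive:
            return "inactive";
        case ConnState::kActive:
            return "active";
        case ConnState::kTransmitting:
            return "transmitting";
    }
    return "unknown";
}
```

A count of connections per state, sampled periodically, would then show a transmission bottleneck the same way a spike in "active" connections shows an execution bottleneck.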
| Comment by Daniel Morilha (Inactive) [ 15/Jun/22 ] |
|
Understood bruce.lucas@mongodb.com, I think we should start with logs measuring the time taken from the beginning to the end of a response, possibly breaking it down into parts. Here's a proposal; proposed slow transmission log:
There's also a lot of information currently logged by the slow query mechanism which I could try to replicate, or log the equivalent for slow transmissions. Current slow query log:
|
| Comment by Bruce Lucas (Inactive) [ 10/Jun/22 ] |
Delays in response to external clients are the primary concern (although to the extent that the code doesn't distinguish external clients from clients that happen to be internal cluster nodes, I imagine this would also cover those cases).
Yes, that's correct.
I suggest three possible formats in my initial comment, more-or-less in order of preference.
Understood, and I suspected as much, so my first suggestion might be difficult to implement; is that right? For my second suggestion, would the log entry still identify a specific connection ("conn....")? If not it would probably be of limited value. On the whole, I think my third suggestion above may be the most reliable and would still have high diagnostic value: if there is a large number of connections in "transmitting" state it would indicate a bottleneck somewhere after operation completion. If this could be further subdivided into other substates ("compressing", "writing SSL headers", ...) that could be useful as well. |
| Comment by Daniel Morilha (Inactive) [ 09/Jun/22 ] |
|
bruce.lucas@mongodb.com after spending a day inspecting various parts of the code, it sounds reasonable, as amirsaman.memaripour@mongodb.com privately suggested, to better specify how the outcome of this ticket can assist future latency investigations. Currently, "transmitting" refers to more than writing bytes to network buffers (e.g., compression, writing the SSL header, polling, etc.), so we need to better specify what is expected from the proposed new metrics. Is there a specific use-case you expect to cover: ingress from external clients, connections between cluster nodes, nodes acting in specific roles (sharding, configuration, bridge, replica sets, primaries, secondaries)? Any preferences about the format of those metrics (only a log if the duration exceeds some configurable threshold, or an aggregate metric reported in FTDC)? From reading the help ticket, it looks like that specific incident was related to a replica set taking substantial time writing the result of a completed operation to the network. Is that right? Please keep in mind that due to the current design, transport components are somewhat isolated from the rest of the code, which might present a barrier to more sophisticated ideas. I had to rely on amirsaman.memaripour@mongodb.com's expertise to understand, for example, that the current code disposes of the OperationContext right after an operation completes but before the response is written to the network. In parallel I'm trying to run an experiment to identify how much time opportunistic network reads and writes actually spend reading or writing; inconclusive at the moment. |
| Comment by Daniel Morilha (Inactive) [ 07/Jun/22 ] |
|
After meeting with amirsaman.memaripour@mongodb.com, we discussed that the second suggestion seems the most straightforward to implement and backport, yet the one which might unlock the most value. I will investigate it next. |
| Comment by Daniel Morilha (Inactive) [ 07/Jun/22 ] |
|
From the three suggestions presented in the ticket description:
Tracking a connection state seems the most interesting for later diagnostic purposes. The problem lies in the fact that it isn't clear whether the current code would allow discerning when a connection is sending vs. when it is receiving data. At my current level of understanding, there is also the concern that, due to how the code was initially structured, a connection may happen to be waiting to be processed rather than actively transmitting data, which could frustrate the original goal of the proposed metric. I agree changing a current metric isn't ideal; nonetheless a new metric can be created containing what is currently considered the slow query duration plus its transmission time (given the same aforementioned concerns). I'll have to vet the idea among other folks in Service Architecture and get their blessing. |
| Comment by Amirsaman Memaripour [ 29/Jan/22 ] |
|
shameek.ray, IMO, both PM-2246 and PM-2398 can add visibility into transmission delays. We may add new metrics in PM-2398 for transmission delays and/or augment traces in PM-2246 with the latency of transmitting responses over the wire. |
| Comment by Shameek Ray [ 28/Jan/22 ] |
|
amirsaman.memaripour, how do we see this ticket (and its underlying pains) interacting with PM-2246?
|