[SERVER-47267] Exhaust isMaster replies are logged as slow queries Created: 02/Apr/20 Updated: 29/Oct/23 Resolved: 17/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc4, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v4.4
|
||||||||||||||||||||||||||||||||||||
| Sprint: | Repl 2020-04-20 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||
| Description |
|
With the new streaming isMaster protocol, it seems that isMaster replies are logged as slow queries. Is this intentional or shall we suppress these messages? |
| Comments |
| Comment by Githook User [ 01/May/20 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}Message: (cherry picked from commit 7dcdf00a34e6b030d63001c7586cbc161ef5229a) |
| Comment by Githook User [ 17/Apr/20 ] |
|
Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}Message: |
| Comment by A. Jesse Jiryu Davis [ 06/Apr/20 ] |
|
Waiting for |
| Comment by A. Jesse Jiryu Davis [ 04/Apr/20 ] |
|
There's no existing mechanism either to set an individual command's slow query threshold, nor is there one for suppressing slow query logging. Since I have to build something new anyway, I'll follow the suggestion of specifically logging slow queries for isMaster if it takes 2x the expected max (2 * 10 seconds). |
| Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ] |
|
OK, yeah I guess that's too much. Agreed it's simpler and info would be less useful, but, still, things can go wrong... If there's a mechanism already in place to set the logging threshold individually for this command then it might be useful to set it to say 2x max expected, i.e. 20 seconds. Otherwise I think it would be reasonable to suppress it - is there a mechanism for suppressing slow query logging for individual commands? |
| Comment by A. Jesse Jiryu Davis [ 03/Apr/20 ] |
|
Yes, every 10 seconds for every connection. And when there's a topology change (such as a reconfig or stepdown) then there will be a message for every client monitoring connection that has been waiting longer than the slow query threshold since the last reply, which will be approximately all of them. Compared to tailable cursors, there are a much larger number of monitoring connections that are waiting for isMaster replies. I also think performance information about isMaster is much less useful compared to performance information about tailable cursors - isMaster is much simpler and doesn't involve the query or storage systems. |
| Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ] |
|
I believe tailable cursors, e.g. changestreams, do log the wait time, and this has I believe been useful information. In this case I think it could also be useful - for example, suppose something goes wrong and the isMaster reply takes 20 seconds instead of 10 - that I think would be useful information. However I do see the point about a large number of them - will this be every 10 seconds for every connection, or something else? |
| Comment by A. Jesse Jiryu Davis [ 03/Apr/20 ] |
|
Right, it's the time the server waited before replying to the client's isMaster request. I think it shouldn't be logged as a slow query, however. The server is supposed to wait up to 10 seconds before replying, if there has been no topology change in the meantime, so the long duration doesn't indicate a problem. This will produce a large number of log messages which I don't think are useful to anyone. Do tailable cursors (capped collections or change streams) write slow query messages if they wait for new documents for longer than the slow query threshold? |
| Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ] |
|
What does the time reported represent? If it's duration between some request from a client and a response, i.e. this is a delay that would be experienced by a client, then they would seem to be appropriate. |