[SERVER-47267] Exhaust isMaster replies are logged as slow queries Created: 02/Apr/20  Updated: 29/Oct/23  Resolved: 17/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.4.0-rc4, 4.7.0

Type: Bug Priority: Major - P3
Reporter: A. Jesse Jiryu Davis Assignee: A. Jesse Jiryu Davis
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on SERVER-44954 Use tailable awaitable isMaster Closed
Duplicate
is duplicated by SERVER-47971 Awaitable isMaster commands are logge... Closed
Related
related to SERVER-50242 slow query message seen in mongos for... Closed
related to SERVER-48393 Exclude exhaust isMaster commands fro... Closed
is related to SERVER-53197 Awaitable hello/isMaster appears in s... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.4
Sprint: Repl 2020-04-20
Participants:
Case:

 Description   

With the new streaming isMaster protocol, it seems that isMaster replies are logged as slow queries. Is this intentional or shall we suppress these messages?



 Comments   
Comment by Githook User [ 01/May/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47267 Don't log awaitable isMaster as slow query

(cherry picked from commit 7dcdf00a34e6b030d63001c7586cbc161ef5229a)
Branch: v4.4
https://github.com/mongodb/mongo/commit/3b9eec568a152955bae821a4ff8974891bbb939a

Comment by Githook User [ 17/Apr/20 ]

Author:

{'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}

Message: SERVER-47267 Don't log awaitable isMaster as slow query
Branch: master
https://github.com/mongodb/mongo/commit/7dcdf00a34e6b030d63001c7586cbc161ef5229a

Comment by A. Jesse Jiryu Davis [ 06/Apr/20 ]

Waiting for SERVER-44954 so I can use the constant kMaxAwaitTimeMS in the code for this change.

Comment by A. Jesse Jiryu Davis [ 04/Apr/20 ]

There's no existing mechanism either to set an individual command's slow query threshold, nor is there one for suppressing slow query logging. Since I have to build something new anyway, I'll follow the suggestion of specifically logging slow queries for isMaster if it takes 2x the expected max (2 * 10 seconds).

Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ]

OK, yeah I guess that's too much.

Agreed it's simpler and info would be less useful, but, still, things can go wrong... If there's a mechanism already in place to set the logging threshold individually for this command then it might be useful to set it to say 2x max expected, i.e. 20 seconds. Otherwise I think it would be reasonable to suppress it - is there a mechanism for suppressing slow query logging for individual commands?

Comment by A. Jesse Jiryu Davis [ 03/Apr/20 ]

Yes, every 10 seconds for every connection. And when there's a topology change (such as a reconfig or stepdown) then there will be a message for every client monitoring connection that has been waiting longer than the slow query threshold since the last reply, which will be approximately all of them.

Compared to tailable cursors, there are a much larger number of monitoring connections that are waiting for isMaster replies. I also think performance information about isMaster is much less useful compared to performance information about tailable cursors - isMaster is much simpler and doesn't involve the query or storage systems.

Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ]

I believe tailable cursors, e.g. changestreams, do log the wait time, and this has I believe been useful information.

In this case I think it could also be useful - for example, suppose something goes wrong and the isMaster reply takes 20 seconds instead of 10 - that I think would be useful information.

However I do see the point about a large number of them - will this be every 10 seconds for every connection, or something else?

Comment by A. Jesse Jiryu Davis [ 03/Apr/20 ]

Right, it's the time the server waited before replying to the client's isMaster request. I think it shouldn't be logged as a slow query, however. The server is supposed to wait up to 10 seconds before replying, if there has been no topology change in the meantime, so the long duration doesn't indicate a problem. This will produce a large number of log messages which I don't think are useful to anyone.

Do tailable cursors (capped collections or change streams) write slow query messages if they wait for new documents for longer than the slow query threshold?

Comment by Bruce Lucas (Inactive) [ 03/Apr/20 ]

What does the time reported represent? If it's duration between some request from a client and a response, i.e. this is a delay that would be experienced by a client, then they would seem to be appropriate.

Generated at Thu Feb 08 05:13:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.