[SERVER-50559] Don't log waiting change stream getMores as slow queries Created: 26/Aug/20  Updated: 29/Oct/23  Resolved: 05/Jul/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.19, 4.4.1
Fix Version/s: 5.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Jing Wu Assignee: Denis Grebennicov
Resolution: Fixed Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
causes SERVER-58887 Change stream getMore lasted more tha... Closed
Backwards Compatibility: Minor Change
Sprint: Query 2020-11-30, Query Execution 2021-05-31, Query Execution 2021-06-14, Query Execution 2021-06-28, Query Execution 2021-07-12
Participants:

 Description   

These messages show up every second even when there is no traffic from the application to this mongos:

2020-08-26T04:34:45.045+0000 I COMMAND [conn21283] command altconfig-b2b-perf.oplog command: getMore { getMore: 3513599116181216748, collection: "oplog", $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598416483, 1), signature: { hash: BinData(0, EC3841EB1FB7A34F897688BB5983E32E2ADF6763), keyId: 6855385090000683010 } }, lsid: { id: UUID("72864d11-c0f8-48bb-823e-de5f18a9c409") } } originatingCommand: { find: "oplog", filter: { timestamp: { $gte: new Date(1597895390009) } }, tailable: true, awaitData: true, $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1597895487, 1), signature: { hash: BinData(0, AAAE33D5688F935C70469CBDB8EB1F6882749A40), keyId: 6855385090000683010 } }, lsid: { id: UUID("72864d11-c0f8-48bb-823e-de5f18a9c409") } } nShards:1 cursorid:3513599116181216748 numYields:0 nreturned:0 reslen:237 protocol:op_msg 1000ms
2020-08-26T04:34:46.046+0000 I COMMAND [conn21283] command altconfig-b2b-perf.oplog command: getMore { getMore: 3513599116181216748, collection: "oplog", $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598416483, 1), signature: { hash: BinData(0, EC3841EB1FB7A34F897688BB5983E32E2ADF6763), keyId: 6855385090000683010 } }, lsid: { id: UUID("72864d11-c0f8-48bb-823e-de5f18a9c409") } } originatingCommand: { find: "oplog", filter: { timestamp: { $gte: new Date(1597895390009) } }, tailable: true, awaitData: true, $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1597895487, 1), signature: { hash: BinData(0, AAAE33D5688F935C70469CBDB8EB1F6882749A40), keyId: 6855385090000683010 } }, lsid: { id: UUID("72864d11-c0f8-48bb-823e-de5f18a9c409") } } nShards:1 cursorid:3513599116181216748 numYields:0 nreturned:0 reslen:237 protocol:op_msg 999ms

 
After we made a change to use a change stream instead of a tailable cursor, we still get the messages:

2020-08-26T22:23:09.044+0000 I COMMAND [conn40485] command altconfig-b2b-perf.oplog command: getMore { getMore: 6960529238130512337, collection: "oplog", $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598480586, 1), signature: { hash: BinData(0, 65DB7FE3D57B6415BAEDFBF72E74803E5C7D61C2), keyId: 6855385090000683010 } }, lsid: { id: UUID("30f97283-dbab-46e5-a142-42efb258f9b0") } } originatingCommand: { aggregate: "oplog", pipeline: [ { $changeStream: { fullDocument: "default" } } ], cursor: {}, $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598416557, 4), signature: { hash: BinData(0, A10C3281F6948F187698FD220A217D02409A1912), keyId: 6855385090000683010 } }, lsid: { id: UUID("30f97283-dbab-46e5-a142-42efb258f9b0") } } nShards:1 cursorid:6960529238130512337 numYields:0 nreturned:0 reslen:344 protocol:op_msg 999ms
2020-08-26T22:23:10.044+0000 I COMMAND [conn40485] command altconfig-b2b-perf.oplog command: getMore { getMore: 6960529238130512337, collection: "oplog", $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598480587, 1), signature: { hash: BinData(0, 7226F82A5EFECCCB9F212F9409359FFE1B6C9CD0), keyId: 6855385090000683010 } }, lsid: { id: UUID("30f97283-dbab-46e5-a142-42efb258f9b0") } } originatingCommand: { aggregate: "oplog", pipeline: [ { $changeStream: { fullDocument: "default" } } ], cursor: {}, $db: "altconfig-b2b-perf", $clusterTime: { clusterTime: Timestamp(1598416557, 4), signature: { hash: BinData(0, A10C3281F6948F187698FD220A217D02409A1912), keyId: 6855385090000683010 } }, lsid: { id: UUID("30f97283-dbab-46e5-a142-42efb258f9b0") } } nShards:1 cursorid:6960529238130512337 numYields:0 nreturned:0 reslen:344 protocol:op_msg 999ms

 
Is this a change in version 4.0? How can we stop these messages?

Thanks,

Jing



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fix version since branching activities occurred yesterday. This ticket will be included in rc0 when it is triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 05/Jul/21 ]

Author:

{'name': 'Denis Grebennicov', 'email': 'denis.grebennicov@mongodb.com', 'username': 'denis631'}

Message: SERVER-50559 Don't log waiting change stream getMores as slow queries
Branch: master
https://github.com/mongodb/mongo/commit/b76f72590c66064f3fd69fc73c79636ce95566cf

Comment by Bernard Gorman [ 30/Nov/20 ]

SERVER-19649 implemented the desired logging behaviour for tailable awaitData cursors on mongoD in 3.2. The change that occurred in 4.0 is that we started logging slow queries on mongoS as well, which we hadn't done before (see SERVER-14900). It seems that we overlooked the special handling of awaitData cursors when expanding the logging infrastructure on mongoS, meaning that we still log these operations naively in cases where the timeout expires without seeing any new results.
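To illustrate the behaviour Bernard describes, here is a minimal sketch (in Python, purely illustrative; the actual mongos code is C++ and the function name is hypothetical) of the suppression rule from SERVER-19649 that mongos was missing: an awaitData getMore that returns no results has merely waited out its timeout, so its elapsed time should not trigger slow-query logging.

```python
# Hypothetical sketch of the mongod rule from SERVER-19649 that mongos
# lacked: suppress slow-op logging for awaitData cursors that timed out
# without producing results (their duration is expected, not "slow").
def should_log_as_slow(duration_ms, slow_ms, is_await_data, n_returned):
    if is_await_data and n_returned == 0:
        # The cursor simply waited for new data until the timeout expired.
        return False
    return duration_ms >= slow_ms

# The ~999 ms getMores in the description with nreturned:0 would be
# suppressed, while a genuinely slow non-awaitData op would still log.
print(should_log_as_slow(999, 100, True, 0))   # waiting getMore: not logged
print(should_log_as_slow(999, 100, False, 0))  # slow regular op: logged
```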

I'm going to add this ticket to our upcoming Quick Wins bucket.

Comment by Eric Sedor [ 09/Sep/20 ]

Hi Jing, thanks for your patience.

Since these messages make idle mongos logs very noisy and aren't reporting operations that are slow for a problematic reason, it makes sense to avoid logging them if possible. I'm going to pass this on to an appropriate team for further consideration.

Currently, you can influence these messages if necessary, but only with side effects.

  • When you establish your change streams, provide a maxAwaitTimeMS option that is less than 100ms (the default slow logging threshold). This will cause the client to poll the server more often, however, resulting in more requests from the client to the server.
  • You could also change the default logging threshold, but that would hide other slow operations you may want to see.
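As a sketch of the first workaround: for tailable awaitData cursors, drivers translate the change stream's maxAwaitTimeMS into the maxTimeMS field of each getMore command, which bounds how long the server waits for new results. The helper below is hypothetical (drivers build this document internally); it only shows the shape of the command and the threshold arithmetic.

```python
# Default mongos/mongod slow-op threshold (slowms) is 100 ms; a
# maxAwaitTimeMS safely below it keeps waiting getMores under the
# logging threshold, at the cost of more frequent polling.
SLOW_MS_THRESHOLD = 100

def get_more_command(cursor_id, collection, max_await_time_ms):
    # Raw getMore command document as a driver would send it for an
    # awaitData cursor; maxTimeMS bounds the server-side wait.
    return {
        "getMore": cursor_id,
        "collection": collection,
        "maxTimeMS": max_await_time_ms,
    }

cmd = get_more_command(3513599116181216748, "oplog",
                       max_await_time_ms=SLOW_MS_THRESHOLD // 2)
print(cmd["maxTimeMS"])  # 50, i.e. half the default threshold
```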
Comment by Jing Wu [ 09/Sep/20 ]

Eric,
Is there any update on this issue?
Thanks,
Jing

Comment by Jing Wu [ 07/Sep/20 ]

Hi Eric,
Any update?
Thanks,
Jing

Comment by Jing Wu [ 28/Aug/20 ]

Hi Eric,

Yes, the message is showing up every second in mongos.log. Is this new behaviour in the 4.0 mongos? How can I suppress these messages if it's not a bug?

Thanks
Jing

Comment by Eric Sedor [ 27/Aug/20 ]

jing66wu@yahoo.com on the surface, this looks like the client is polling for new change events and finding none (which is why these log lines show nreturned:0).

Do I understand correctly that you have a client opening a changeStream on the "altconfig-b2b-perf.oplog" collection, and you are asking just about a difference in log behavior between 3.6 and 4.0?

Comment by Jing Wu [ 26/Aug/20 ]

The message is in the mongos log.

Generated at Thu Feb 08 05:22:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.