[SERVER-45367] When the Range Deleter is waiting for the open cursors, list these cursors in the logs. Created: 06/Jan/20  Updated: 29/Oct/23  Resolved: 07/May/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.0.19, 3.6.19

Type: Improvement Priority: Major - P3
Reporter: Linda Qin Assignee: Matthew Saltz (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-50182 Open cursors can block cleanupOrphaned Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6
Sprint: Sharding 2020-05-04, Sharding 2020-05-18
Participants:

 Description   

In 3.4 and earlier, when the Range Deleter is waiting for the open cursors, it lists these cursors in the logs as below. In this case, we can just run the killCursors command to kill them, to unblock the range deleter (and chunk migration to this shard) when needed.

[RangeDeleter] waiting for open cursors before removing range [{ KEY: VALUE1}, { KEY: VALUE2}) in DATABASE.COLLECTION, elapsed secs: N, cursor ids: [XXX, ...]

However, in 3.6 and above, the message is different and doesn't include the cursor id(s):

Deletion of DATABASE.COLLECTION range [{ KEY: VALUE1 }, { KEY: VALUE2 }) will be scheduled after all possibly dependent queries finish

The listCursors command is only added since 4.2, so for 3.6 and 4.0, it's a bit difficult to locate these cursors. We may have to restart the application or step down the primary, in order to kill these cursors/unblock the chunk migration to this shard. This is more destructive than just killing the cursors if we know the cursor ids.

As such, if the Range Deleter is waiting for the open cursors, it would be nice to also list these cursor ids in MongoDB 3.6+.



 Comments   
Comment by Githook User [ 11/May/20 ]

Author:

{'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com', 'username': 'saltzm'}

Message: SERVER-45367 Log open cursors for a namespace when submitting tasks to the range deleter

(cherry picked from commit 2c7a53f74cd6f588cca2ab968a9263d59c56684c)
Branch: v3.6
https://github.com/mongodb/mongo/commit/a434867901150513736daa876b362ab9f7a22195

Comment by Matthew Saltz (Inactive) [ 07/May/20 ]

linda.qin The final result was that a log line was added like

Deletion of test.move_chunk_open_cursors range [{ _id: MinKey }, { _id: 2500.0 }) will be scheduled after all possibly dependent queries finish. All open cursors for namespace: [65980132688,67964091105,8544736894419225988]

The list of open cursors contains all open cursors and not necessarily ones which the range deletion is blocking behind - same as the 3.4 behavior.

Comment by Githook User [ 07/May/20 ]

Author:

{'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com', 'username': 'saltzm'}

Message: SERVER-45367 Log open cursors for a namespace when submitting tasks to the range deleter
Branch: v4.0
https://github.com/mongodb/mongo/commit/2c7a53f74cd6f588cca2ab968a9263d59c56684c

Comment by Linda Qin [ 14/Apr/20 ]

matthew.saltz yes, I was referring to $currentOp with idleCursors = true (SERVER-3090).

Comment by Matthew Saltz (Inactive) [ 14/Apr/20 ]

We've decided to just implement the old behavior that was present in 3.4

Comment by Matthew Saltz (Inactive) [ 14/Apr/20 ]

linda.qin when you referred to listCursors, were you referring to using $currentOp with idleCursors = true?

Comment by Matthew Saltz (Inactive) [ 26/Feb/20 ]

It looks like in 3.4 the list of cursors printed was actually all open cursors for the namespace at the time of queueing the range deletion task, and we used that list of cursors to decide that the range was ready for deletion. We don't currently use cursor ids directly anymore, instead having ongoing queries access an object that increments a reference count indicating how many queries could be possibly using a given range.

That being said, the logic for getting all active cursor IDs is pretty simple, so if we wanted to add that logging we could, even though it might not directly correspond to the exact cursors being waited on by the range deleter for a range. Would that be worth doing?

Also, given that the cursors listed are all open cursors on the collection, I'm wondering if there is another built-in way to find that information even before listCursors.

By the way, I don't see any documentation for listCursors here or on Google - is that supposed to be a documented command?

Generated at Thu Feb 08 05:08:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.