[SERVER-17751] Make cleanupOrphaned cmd interruptible
Created: 26/Mar/15  Updated: 15/Nov/17  Resolved: 15/Nov/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ljuba Nedeljkovic
Assignee: Kaloian Manassiev
Resolution: Done
Votes: 2
Labels: orphaned
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-27921 Add 'waitForDelete' functionality to ... Closed
Operating System: ALL
Sprint: Sharding 2017-12-04
Participants:

 Description   

Environment:

OS: CentOS 6.5/6.6
MongoDB: 2.6.4, compiled with ssl support

We are running a sharded cluster with 6 replica sets and, as per http://docs.mongodb.org/v2.6/reference/command/cleanupOrphaned/, we are executing the following code to clean orphaned documents from sharded collections:

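// `namespace` is assumed to hold the full "<database>.<collection>" name.
var nextKey = {};
var result;

while (nextKey != null) {
  result = db.runCommand({ cleanupOrphaned: namespace, startingFromKey: nextKey });

  if (result.ok != 1) {
    print("Unable to complete at this time: failure or timeout.");
  }

  printjson(result);

  // stoppedAtKey is absent once the last range has been cleaned, which ends the loop.
  nextKey = result.stoppedAtKey;
}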

For some collections the command works as expected. However, on some collections the command hangs waiting for open cursors. The log lines below appear every minute.

2015-03-26T12:49:00.180+0100 [conn1736834] rangeDeleter waiting for open cursors in: DATABASE.COLLECTION, min: { _id: MinKey }, max: { _id: 3074457345618258600 }, elapsedSecs: 293637, cursors: [ 1788183098665 1788838028619 ]
2015-03-26T12:50:00.287+0100 [conn1736834] rangeDeleter waiting for open cursors in: DATABASE.COLLECTION, min: { _id: MinKey }, max: { _id: 3074457345618258600 }, elapsedSecs: 293697, cursors: [ 1788183098665 1788838028619 ]

As can be seen from the log lines, the same cursors are always listed, and so is the same max key. We are not aware of any way to stop the command other than switching the primary on every replica set. Since there are a lot of sharded collections that need to be cleaned of possible orphaned documents, switching primaries whenever cleanupOrphaned hangs waiting for open cursors is not a viable solution.
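
To check whether the same cursors really are listed on every wait message, the log lines can be parsed and compared. The following is only a sketch: the helper names `parseRangeDeleterWait` and `cursorsInEveryLine` are hypothetical, and the regular expression assumes the 2.6 log format shown in this report.

```javascript
// Hypothetical helper: extracts the namespace, elapsed seconds and cursor ids
// from a "rangeDeleter waiting for open cursors" log line (2.6 format assumed).
function parseRangeDeleterWait(line) {
  var pattern = /rangeDeleter waiting for open cursors in: (\S+),.*elapsedSecs: (\d+), cursors: \[([^\]]*)\]/;
  var m = pattern.exec(line);
  if (m === null) {
    return null;
  }
  return {
    ns: m[1],
    elapsedSecs: parseInt(m[2], 10),
    // Cursor ids are 64-bit values, so they are kept as strings.
    cursors: m[3].trim().split(/\s+/).filter(function (s) { return s.length > 0; })
  };
}

// Returns the cursor ids that appear in every matching log line,
// i.e. the cursors the range deleter keeps waiting on.
function cursorsInEveryLine(lines) {
  var parsed = lines.map(parseRangeDeleterWait).filter(function (p) { return p !== null; });
  if (parsed.length === 0) {
    return [];
  }
  return parsed[0].cursors.filter(function (id) {
    return parsed.every(function (p) { return p.cursors.indexOf(id) !== -1; });
  });
}
```

Applied to the two log lines in this report, `cursorsInEveryLine` would return both cursor ids as strings, since each id appears in both lines.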

Is there another way of stopping execution of the cleanupOrphaned command? Or is there a way of closing open cursors from the server side?



 Comments   
Comment by Kaloian Manassiev [ 15/Nov/17 ]

As of version 3.6 (and SERVER-27921 in particular), the cleanupOrphaned command uses an interruptible wait for the range deleter to complete, which means that it can be killed.
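
As a rough illustration of what killing the command could look like on 3.6 or later, the sketch below picks the running cleanupOrphaned operations out of `db.currentOp()` output so they can be passed to `db.killOp()`. The helper name `findCleanupOrphanedOpIds` is hypothetical, and the assumption that the command document sits under `command` (with `query` as a fallback for older output) should be checked against the server version in use.

```javascript
// Hypothetical helper: returns the opids of running cleanupOrphaned commands
// for the given namespace, taken from the `inprog` array of db.currentOp().
// Assumption: the command document is under `command`; `query` is a fallback.
function findCleanupOrphanedOpIds(inprog, namespace) {
  return inprog.filter(function (op) {
    var cmd = op.command || op.query || {};
    return cmd.cleanupOrphaned === namespace;
  }).map(function (op) {
    return op.opid;
  });
}

// Usage in a mongo shell connected to the shard primary:
//   findCleanupOrphanedOpIds(db.currentOp().inprog, "DATABASE.COLLECTION")
//     .forEach(function (opid) { db.killOp(opid); });
```

Keeping the filtering separate from the `db.killOp()` call makes it possible to review the matched opids before anything is killed.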

Comment by Kaloian Manassiev [ 28/Jul/17 ]

With the new range deleter implementation, this might just work, because we use the interruptible form of waiting on the background range deleter thread to complete. Putting it in 3.5 Desired so we can get back and validate this.

Comment by Randolph Tan [ 27/Mar/15 ]

Currently there is no way to stop the cleanupOrphaned command. You can forcefully close a cursor by issuing a killCursor against the cursor id. The mongo shell currently doesn't have this functionality exposed (see SERVER-5813), but you can use another driver that supports it.
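
For readers on newer servers: MongoDB 3.2 and later expose a `killCursors` database command, which is not available on the 2.6.4 deployment in this report. The sketch below only builds the command document. The helper name `buildKillCursorsCommand` and its `toLong` parameter are hypothetical; in the mongo shell `NumberLong` would be passed so the ids are sent as 64-bit integers.

```javascript
// Hypothetical helper: builds a killCursors command document (server 3.2+).
// `toLong` converts an id string to a 64-bit integer type; when omitted,
// the ids are passed through unchanged.
function buildKillCursorsCommand(collectionName, cursorIds, toLong) {
  var convert = toLong || function (id) { return id; };
  return {
    killCursors: collectionName,
    cursors: cursorIds.map(function (id) { return convert(id); })
  };
}

// Usage in a mongo shell on the shard primary, against the collection's database:
//   db.getSiblingDB("DATABASE").runCommand(
//     buildKillCursorsCommand("COLLECTION",
//                             ["1788183098665", "1788838028619"],
//                             NumberLong));
```

The ids are handled as strings until conversion because cursor ids are 64-bit values and may not fit in a JavaScript number.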

Generated at Thu Feb 08 03:45:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.