Currently when running cleanupOrphaned, the log output looks like this:
2015-08-31T16:50:54.395+1000 I SHARDING [conn127] remotely refreshing metadata for test.test based on current shard version 0|0||55e3f24ec97f895066a73485, current metadata version is 1|2||55e3f24ec97f895066a73485 2015-08-31T16:50:54.396+1000 I SHARDING [conn127] updating metadata for test.test from shard version 0|0||55e3f24ec97f895066a73485 to shard version 2|0||55e3f24ec97f895066a73485 2015-08-31T16:50:54.396+1000 I SHARDING [conn127] collection version was loaded at version 2|1||55e3f24ec97f895066a73485, took 0ms 2015-08-31T16:50:54.396+1000 I SHARDING [conn127] Deleter starting delete for: test.test from { _id: 0.0 } -> { _id: MaxKey }, with opId: 708054 2015-08-31T16:54:53.335+1000 I SHARDING [conn127] Helpers::removeRangeUnlocked time spent waiting for replication: 197408ms 2015-08-31T16:54:53.335+1000 I SHARDING [conn127] rangeDeleter deleted 100000 documents for test.test from { _id: 0.0 } -> { _id: MaxKey } 2015-08-31T16:54:53.335+1000 I COMMAND [conn127] command admin.$cmd command: cleanupOrphaned { cleanupOrphaned: "test.test", startingFromKey: { _id: MinKey } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:121 locks:{ Global: { acquireCount: { r: 200006, w: 200002 } }, MMAPV1Journal: { acquireCount: { r: 2, w: 200002 }, acquireWaitCount: { w: 342 }, timeAcquiringMicros: { w: 669571 } }, Database: { acquireCount: { r: 2, w: 200002 } }, Collection: { acquireCount: { R: 2, W: 100002 } }, oplog: { acquireCount: { w: 100000 }, acquireWaitCount: { w: 10 }, timeAcquiringMicros: { w: 2056 } } } 238940ms
The problem is that none of the initial messages mention "cleanupOrphaned". This means that while the command is in progress (eg. trying to diagnose a slow running cleanupOrphaned command), the only indication in the log is the initial "Deleter starting delete for" message. (Yes, it will be in currentOp, but this ticket is about the logfile). Unfortunately this message is also output by the RangeDeleter in other contexts (ie. chunk migrations), so its presence does not necessarily indicate that the cleanupOrphaned command was used. In addition, inexperienced users/operators/engineers might not be aware that they need to search the logs for "Deleter", which is less obvious than "cleanupOrphaned".
It would be much better if cleanupOrphaned could log a message when it starts to make it clear that it's about to run a RangeDeleter.
This could also be considered a special case of SERVER-15408, and closed as a duplicate of that ticket — as long as that ticket gets updated to make it clear that "progress" also means that cleanupOrphaned should log when it starts, as well while it's working (which happens to be via RangeDeleter).
- related to
-
SERVER-15408 cleanupOrphaned should output progress
- Closed