[SERVER-29879] Provide method for counting orphaned documents Created: 27/Jun/17  Updated: 29/Jul/17  Resolved: 27/Jun/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Kevin Arhelger
Assignee: Kelsey Schubert
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-17013 Add 'dry run' mode for cleanupOrphaned Closed

 Description   

Currently, there is no good way to count orphaned documents in a sharded cluster.

The current approach is to run

db.collection.find({ShardKey:{$gte: MinKey, $lte: MaxKey}},{ShardKey:1,_id:0}).itcount()

and compare the result to the sum of the document counts taken on each shard individually.

  • This query requires a full index scan and streams the entire result set to the shell. This can take significant time to complete on a large sharded cluster.
  • On a busy system the numbers can be off significantly because inserts and deletes interleave with the count.
  • It requires multiple queries against multiple hosts, so the counts are taken at different times and can disagree.
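
The comparison described above boils down to one subtraction. The following is a minimal sketch of that arithmetic only, assuming the per-shard counts and the mongos-side itcount() have already been collected by hand; the function name estimateOrphans and the sample figures are hypothetical, not part of any MongoDB API.

```javascript
// Estimate orphans as (documents physically present on the shards)
// minus (documents visible through mongos, which filters orphans out).
// Both inputs are assumed to have been gathered separately, so the result
// is only as accurate as the cluster was quiet between the measurements.
function estimateOrphans(perShardCounts, countViaMongos) {
    var physicalTotal = perShardCounts.reduce(function (sum, n) {
        return sum + n;
    }, 0);
    return physicalTotal - countViaMongos;
}

// Illustrative figures: two shards holding 12 and 15 documents,
// while the itcount() through mongos sees 20.
console.log(estimateOrphans([12, 15], 20)); // 7
```

Because the inputs come from separate queries, a non-zero result on a busy cluster may reflect concurrent writes rather than orphans, which is the weakness listed above.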

Since the logic already exists to clean up orphans, a similar command (or a parameter to the existing cleanupOrphaned) to count them would be useful for determining the potential impact of orphans on a given sharded cluster.



 Comments   
Comment by Kevin Pulo [ 28/Jun/17 ]

A better workaround for counting orphans is to look at the chunkSkips field in the SHARDING_FILTER stage of explain(true) output. (This field is named nChunkSkips in older explain outputs.) This approach:

  • requires only a scan of the shard key index (across all shards in parallel),
  • returns only a small amount of info to the client, and
  • returns both the total number of index entries and the number skipped due to being orphans from a single scan of the index (which is non-atomic, but still better than doing two separate scans).

Count all orphans (shard key of { shard: 1, key: 1 })

mongos> db.shardedCollection.find( { }, { shard: 1, key: 1, _id: 0 } ).hint( { shard: 1, key: 1 } ).explain(true)

Count orphans in a given chunk or shard key range (shard key of { shard: 1, key: 1 })

// This mostly matters for compound shard keys, where using $gte/$lt isn't quite right.
mongos> db.shardedCollection.find( { }, { shard: 1, key: 1, _id: 0 } ).hint( { shard: 1, key: 1 } ).min( { shard: ..., key: ... } ).max( { shard: ..., key: ... } ).explain(true)
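
Reading chunkSkips out of a deeply nested explain document by eye is error-prone, so a small helper can total it. This is a sketch under stated assumptions: sumChunkSkips is a hypothetical helper name, the sample object is hand-written to resemble the explain shape rather than captured from a server, and the values are assumed to be plain numbers. It is written to run under Node.js; in the legacy mongo shell, console.log would need to be replaced with print.

```javascript
// Recursively sum every chunkSkips (or legacy nChunkSkips) value found
// under the given node of an explain document (a plain object or array).
function sumChunkSkips(node) {
    if (node === null || typeof node !== "object") {
        return 0;
    }
    var total = 0;
    Object.keys(node).forEach(function (key) {
        var value = node[key];
        var isSkipField = (key === "chunkSkips" || key === "nChunkSkips");
        if (isSkipField && typeof value === "number") {
            total += value;
        } else {
            total += sumChunkSkips(value);
        }
    });
    return total;
}

// Hand-written stand-in for explain(true) output, heavily trimmed.
var sample = {
    executionStats: {
        executionStages: {
            stage: "SHARD_MERGE",
            shards: [
                { shardName: "shard01", executionStages: { stage: "PROJECTION",
                    inputStage: { stage: "SHARDING_FILTER", chunkSkips: 2,
                        inputStage: { stage: "IXSCAN", keysExamined: 12 } } } },
                { shardName: "shard02", executionStages: { stage: "PROJECTION",
                    inputStage: { stage: "SHARDING_FILTER", chunkSkips: 5,
                        inputStage: { stage: "IXSCAN", keysExamined: 15 } } } }
            ]
        }
    }
};

// Walk only the winning plan; walking the whole document could also pick up
// any plans listed under allPlansExecution and count them twice.
console.log(sumChunkSkips(sample.executionStats.executionStages)); // 7
```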


Demo setup

kev@basique:~$ mlaunch init --sharded 2 --single --port 12345
launching: mongod on port 12346
launching: mongod on port 12347
launching: config server on port 12348
replica set 'configRepl' initialized.
launching: mongos on port 12345
adding shards.
kev@basique:~$ mlaunch list
 
PROCESS          PORT     STATUS     PID
 
mongos           12345    running    17184
 
config server    12348    running    17065
 
shard01
    single       12346    running    17007
 
shard02
    single       12347    running    17036
 
kev@basique:~$ mongo --port 12345
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:12345/
MongoDB server version: 3.4.4
connection to 127.0.0.1:12345, version 3.4.4
db: test
(127.0.0.1:12345/test)mongos
> sh.enableSharding("test")
{ "ok" : 1 }
(127.0.0.1:12345/test)mongos
> sh.shardCollection("test.test", { a: 1 })
{ "collectionsharded" : "test.test", "ok" : 1 }
(127.0.0.1:12345/test)mongos
> sh.splitAt("test.test", { a: 0 })
{ "ok" : 1 }
(127.0.0.1:12345/test)mongos
> for (i=-10;i<10;i++) db.test.insert({a:i})
WriteResult({ "nInserted" : 1 })
(127.0.0.1:12345/test)mongos
> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5952f4cd9c942d127ec2a3f4")
}
  shards:
        {  "_id" : "shard01",  "host" : "basique:12346",  "state" : 1 }
        {  "_id" : "shard02",  "host" : "basique:12347",  "state" : 1 }
  active mongoses:
        "3.4.4" : 1
 autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
                Balancer lock taken at Wed Jun 28 2017 10:14:05 GMT+1000 (AEST) by ConfigServer:Balancer
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                1 : Success
  databases:
        {  "_id" : "test",  "primary" : "shard01",  "partitioned" : true }
                test.test
                        shard key: { "a" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard01 1
                                shard02 1
                        { "a" : { "$minKey" : 1 } } -->> { "a" : 0 } on : shard02 Timestamp(2, 0)
                        { "a" : 0 } -->> { "a" : { "$maxKey" : 1 } } on : shard01 Timestamp(2, 1)
 
(127.0.0.1:12345/test)mongos
>
bye
kev@basique:~$ mongo --port 12346
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:12346/
MongoDB server version: 3.4.4
connection to 127.0.0.1:12346, version 3.4.4
db: test
(127.0.0.1:12346/test)
> db.test.insert( [ { a: -3 }, { a: -7 } ] )
BulkWriteResult({
        "writeErrors" : [ ],
        "writeConcernErrors" : [ ],
        "nInserted" : 2,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
})
(127.0.0.1:12346/test)
> db.test.find()
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2e"), "a" : 0 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2f"), "a" : 1 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae30"), "a" : 2 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae31"), "a" : 3 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae32"), "a" : 4 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae33"), "a" : 5 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae34"), "a" : 6 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae35"), "a" : 7 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae36"), "a" : 8 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae37"), "a" : 9 }
{ "_id" : ObjectId("5952f550f8e0fb607eeaf0c8"), "a" : -3 }
{ "_id" : ObjectId("5952f550f8e0fb607eeaf0c9"), "a" : -7 }
(127.0.0.1:12346/test)
>
bye
kev@basique:~$ mongo --port 12347
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:12347/
MongoDB server version: 3.4.4
connection to 127.0.0.1:12347, version 3.4.4
db: test
(127.0.0.1:12347/test)
> db.test.insert( [ { a: 3 }, { a: 4 }, { a: 5 }, { a: 6 }, { a: 7 } ] )
BulkWriteResult({
        "writeErrors" : [ ],
        "writeConcernErrors" : [ ],
        "nInserted" : 5,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
})
(127.0.0.1:12347/test)
> db.test.find()
{ "_id" : ObjectId("5952f525adfd75fe99b2ae24"), "a" : -10 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae25"), "a" : -9 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae26"), "a" : -8 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae27"), "a" : -7 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae28"), "a" : -6 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae29"), "a" : -5 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2a"), "a" : -4 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2b"), "a" : -3 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2c"), "a" : -2 }
{ "_id" : ObjectId("5952f525adfd75fe99b2ae2d"), "a" : -1 }
{ "_id" : ObjectId("5952f56cf1d035751f9c5cbe"), "a" : 3 }
{ "_id" : ObjectId("5952f56cf1d035751f9c5cbf"), "a" : 4 }
{ "_id" : ObjectId("5952f56cf1d035751f9c5cc0"), "a" : 5 }
{ "_id" : ObjectId("5952f56cf1d035751f9c5cc1"), "a" : 6 }
{ "_id" : ObjectId("5952f56cf1d035751f9c5cc2"), "a" : 7 }
(127.0.0.1:12347/test)
>
bye

Demo: Count all orphans

kev@basique:~$ mongo --port 12345
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:12345/
MongoDB server version: 3.4.4
connection to 127.0.0.1:12345, version 3.4.4
db: test
(127.0.0.1:12345/test)mongos
> db.test.find( { }, { a: 1, _id: 0 } ).hint( { a: 1 } ).explain(true)
{
        "queryPlanner" : {
                ...
        },
        "executionStats" : {
                "nReturned" : 20,
                "executionTimeMillis" : 0,
                "totalKeysExamined" : 27,
                "totalDocsExamined" : 0,
                "executionStages" : {
                        "stage" : "SHARD_MERGE",
                        "nReturned" : 20,
                        "executionTimeMillis" : 0,
                        "totalKeysExamined" : 27,
                        "totalDocsExamined" : 0,
                        "totalChildMillis" : NumberLong(0),
                        "shards" : [
                                {
                                        "shardName" : "shard01",
                                        "executionSuccess" : true,
                                        "executionStages" : {
                                                "stage" : "PROJECTION",
                                                "nReturned" : 10,
                                                "executionTimeMillisEstimate" : 0,
                                                "works" : 13,
                                                "advanced" : 10,
                                                "needTime" : 2,
                                                "needYield" : 0,
                                                "saveState" : 0,
                                                "restoreState" : 0,
                                                "isEOF" : 1,
                                                "invalidates" : 0,
                                                "transformBy" : {
                                                        "a" : 1,
                                                        "_id" : 0
                                                },
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "nReturned" : 10,
                                                        "executionTimeMillisEstimate" : 0,
                                                        "works" : 13,
                                                        "advanced" : 10,
                                                        "needTime" : 2,
                                                        "needYield" : 0,
                                                        "saveState" : 0,
                                                        "restoreState" : 0,
                                                        "isEOF" : 1,
                                                        "invalidates" : 0,
                                                        "chunkSkips" : 2,
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "nReturned" : 12,
                                                                "executionTimeMillisEstimate" : 0,
                                                                "works" : 13,
                                                                "advanced" : 12,
                                                                "needTime" : 0,
                                                                "needYield" : 0,
                                                                "saveState" : 0,
                                                                "restoreState" : 0,
                                                                "isEOF" : 1,
                                                                "invalidates" : 0,
                                                                "keyPattern" : {
                                                                        "a" : 1
                                                                },
                                                                "indexName" : "a_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "a" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "a" : [
                                                                                "[MinKey, MaxKey]"
                                                                        ]
                                                                },
                                                                "keysExamined" : 12,
                                                                "seeks" : 1,
                                                                "dupsTested" : 0,
                                                                "dupsDropped" : 0,
                                                                "seenInvalidated" : 0
                                                        }
                                                }
                                        }
                                },
                                {
                                        "shardName" : "shard02",
                                        "executionSuccess" : true,
                                        "executionStages" : {
                                                "stage" : "PROJECTION",
                                                "nReturned" : 10,
                                                "executionTimeMillisEstimate" : 0,
                                                "works" : 16,
                                                "advanced" : 10,
                                                "needTime" : 5,
                                                "needYield" : 0,
                                                "saveState" : 0,
                                                "restoreState" : 0,
                                                "isEOF" : 1,
                                                "invalidates" : 0,
                                                "transformBy" : {
                                                        "a" : 1,
                                                        "_id" : 0
                                                },
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "nReturned" : 10,
                                                        "executionTimeMillisEstimate" : 0,
                                                        "works" : 16,
                                                        "advanced" : 10,
                                                        "needTime" : 5,
                                                        "needYield" : 0,
                                                        "saveState" : 0,
                                                        "restoreState" : 0,
                                                        "isEOF" : 1,
                                                        "invalidates" : 0,
                                                        "chunkSkips" : 5,
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "nReturned" : 15,
                                                                "executionTimeMillisEstimate" : 0,
                                                                "works" : 16,
                                                                "advanced" : 15,
                                                                "needTime" : 0,
                                                                "needYield" : 0,
                                                                "saveState" : 0,
                                                                "restoreState" : 0,
                                                                "isEOF" : 1,
                                                                "invalidates" : 0,
                                                                "keyPattern" : {
                                                                        "a" : 1
                                                                },
                                                                "indexName" : "a_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "a" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "a" : [
                                                                                "[MinKey, MaxKey]"
                                                                        ]
                                                                },
                                                                "keysExamined" : 15,
                                                                "seeks" : 1,
                                                                "dupsTested" : 0,
                                                                "dupsDropped" : 0,
                                                                "seenInvalidated" : 0
                                                        }
                                                }
                                        }
                                }
                        ]
                },
                "allPlansExecution" : [
                        {
                                "shardName" : "shard01",
                                "allPlans" : [ ]
                        },
                        {
                                "shardName" : "shard02",
                                "allPlans" : [ ]
                        }
                ]
        },
        "ok" : 1
}
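
In the output above, each shard's IXSCAN keysExamined equals the SHARDING_FILTER nReturned plus chunkSkips (12 = 10 + 2 on shard01, 15 = 10 + 5 on shard02), for 7 orphans in total. The sketch below pulls those three numbers per shard and checks that identity. It assumes the SHARD_MERGE shape shown above with a single inputStage chain per shard; findStage and orphanReport are hypothetical helper names, and the sample object is a hand-trimmed copy of the figures above.

```javascript
// Follow the inputStage chain until a stage with the given name is found.
function findStage(stage, name) {
    while (stage && stage.stage !== name) {
        stage = stage.inputStage;
    }
    return stage || null;
}

// Build one summary row per shard from an explain(true) document.
function orphanReport(explain) {
    var shards = explain.executionStats.executionStages.shards;
    return shards.map(function (shard) {
        var filter = findStage(shard.executionStages, "SHARDING_FILTER");
        var scan = findStage(shard.executionStages, "IXSCAN");
        return {
            shard: shard.shardName,
            keysExamined: scan ? scan.keysExamined : null,
            owned: filter ? filter.nReturned : null,
            orphans: filter ? filter.chunkSkips : null
        };
    });
}

// Trimmed copy of the relevant fields from the output above.
var sample = { executionStats: { executionStages: { stage: "SHARD_MERGE", shards: [
    { shardName: "shard01", executionStages: { stage: "PROJECTION", nReturned: 10,
        inputStage: { stage: "SHARDING_FILTER", nReturned: 10, chunkSkips: 2,
            inputStage: { stage: "IXSCAN", nReturned: 12, keysExamined: 12 } } } },
    { shardName: "shard02", executionStages: { stage: "PROJECTION", nReturned: 10,
        inputStage: { stage: "SHARDING_FILTER", nReturned: 10, chunkSkips: 5,
            inputStage: { stage: "IXSCAN", nReturned: 15, keysExamined: 15 } } } }
] } } };

var report = orphanReport(sample);
report.forEach(function (row) {
    var consistent = row.keysExamined === row.owned + row.orphans;
    console.log(row.shard, "orphans:", row.orphans, "consistent:", consistent);
});
```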

Demo: Count orphans in a given chunk or shard key range

(127.0.0.1:12345/test)mongos
> db.test.find( { }, { a: 1, _id: 0 } ).hint( { a: 1 } ).min( { a: 5 } ).max( { a: 7 } ).explain(true)
{
        "queryPlanner" : {
                ...
        },
        "executionStats" : {
                "nReturned" : 2,
                "executionTimeMillis" : 1,
                "totalKeysExamined" : 4,
                "totalDocsExamined" : 0,
                "executionStages" : {
                        "stage" : "SHARD_MERGE",
                        "nReturned" : 2,
                        "executionTimeMillis" : 1,
                        "totalKeysExamined" : 4,
                        "totalDocsExamined" : 0,
                        "totalChildMillis" : NumberLong(0),
                        "shards" : [
                                {
                                        "shardName" : "shard01",
                                        "executionSuccess" : true,
                                        "executionStages" : {
                                                "stage" : "PROJECTION",
                                                "nReturned" : 2,
                                                "executionTimeMillisEstimate" : 0,
                                                "works" : 3,
                                                "advanced" : 2,
                                                "needTime" : 0,
                                                "needYield" : 0,
                                                "saveState" : 0,
                                                "restoreState" : 0,
                                                "isEOF" : 1,
                                                "invalidates" : 0,
                                                "transformBy" : {
                                                        "a" : 1,
                                                        "_id" : 0
                                                },
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "nReturned" : 2,
                                                        "executionTimeMillisEstimate" : 0,
                                                        "works" : 3,
                                                        "advanced" : 2,
                                                        "needTime" : 0,
                                                        "needYield" : 0,
                                                        "saveState" : 0,
                                                        "restoreState" : 0,
                                                        "isEOF" : 1,
                                                        "invalidates" : 0,
                                                        "chunkSkips" : 0,
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "nReturned" : 2,
                                                                "executionTimeMillisEstimate" : 0,
                                                                "works" : 3,
                                                                "advanced" : 2,
                                                                "needTime" : 0,
                                                                "needYield" : 0,
                                                                "saveState" : 0,
                                                                "restoreState" : 0,
                                                                "isEOF" : 1,
                                                                "invalidates" : 0,
                                                                "keyPattern" : {
                                                                        "a" : 1
                                                                },
                                                                "indexName" : "a_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "a" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
 
                                                                },
                                                                "keysExamined" : 2,
                                                                "seeks" : 1,
                                                                "dupsTested" : 0,
                                                                "dupsDropped" : 0,
                                                                "seenInvalidated" : 0
                                                        }
                                                }
                                        }
                                },
                                {
                                        "shardName" : "shard02",
                                        "executionSuccess" : true,
                                        "executionStages" : {
                                                "stage" : "PROJECTION",
                                                "nReturned" : 0,
                                                "executionTimeMillisEstimate" : 0,
                                                "works" : 3,
                                                "advanced" : 0,
                                                "needTime" : 2,
                                                "needYield" : 0,
                                                "saveState" : 0,
                                                "restoreState" : 0,
                                                "isEOF" : 1,
                                                "invalidates" : 0,
                                                "transformBy" : {
                                                        "a" : 1,
                                                        "_id" : 0
                                                },
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "nReturned" : 0,
                                                        "executionTimeMillisEstimate" : 0,
                                                        "works" : 3,
                                                        "advanced" : 0,
                                                        "needTime" : 2,
                                                        "needYield" : 0,
                                                        "saveState" : 0,
                                                        "restoreState" : 0,
                                                        "isEOF" : 1,
                                                        "invalidates" : 0,
                                                        "chunkSkips" : 2,
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "nReturned" : 2,
                                                                "executionTimeMillisEstimate" : 0,
                                                                "works" : 3,
                                                                "advanced" : 2,
                                                                "needTime" : 0,
                                                                "needYield" : 0,
                                                                "saveState" : 0,
                                                                "restoreState" : 0,
                                                                "isEOF" : 1,
                                                                "invalidates" : 0,
                                                                "keyPattern" : {
                                                                        "a" : 1
                                                                },
                                                                "indexName" : "a_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "a" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
 
                                                                },
                                                                "keysExamined" : 2,
                                                                "seeks" : 1,
                                                                "dupsTested" : 0,
                                                                "dupsDropped" : 0,
                                                                "seenInvalidated" : 0
                                                        }
                                                }
                                        }
                                }
                        ]
                },
                "allPlansExecution" : [
                        {
                                "shardName" : "shard01",
                                "allPlans" : [ ]
                        },
                        {
                                "shardName" : "shard02",
                                "allPlans" : [ ]
                        }
                ]
        },
        "ok" : 1
}
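
The range result above follows from min() being inclusive and max() being exclusive: only a: 5 and a: 6 are scanned on each shard, shard01 owns both, and shard02 owns neither, so it skips both. The toy model below reproduces those numbers in plain JavaScript. It is a sketch and not how the server implements filtering; the shards array is transcribed from the demo setup, Infinity and -Infinity stand in for MaxKey and MinKey, and countInRange is a hypothetical name.

```javascript
// Toy model of the demo cluster: the keys each shard physically holds and
// the half-open shard key range [min, max) that each shard owns.
var shards = [
    { name: "shard01", owns: { min: 0, max: Infinity },
      keys: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -3, -7] },
    { name: "shard02", owns: { min: -Infinity, max: 0 },
      keys: [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 3, 4, 5, 6, 7] }
];

// Scan [min, max) on one shard, then split the scanned keys into
// owned (returned) and not owned (skipped as orphans).
function countInRange(shard, min, max) {
    var scanned = shard.keys.filter(function (k) {
        return k >= min && k < max;
    });
    var owned = scanned.filter(function (k) {
        return k >= shard.owns.min && k < shard.owns.max;
    });
    return {
        shard: shard.name,
        keysExamined: scanned.length,
        nReturned: owned.length,
        chunkSkips: scanned.length - owned.length
    };
}

shards.forEach(function (shard) {
    console.log(JSON.stringify(countInRange(shard, 5, 7)));
});
```

Calling countInRange with -Infinity and Infinity instead gives 2 and 5 skips, matching the full-collection demo.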

Comment by Kevin Arhelger [ 27/Jun/17 ]

Thanks anonymous.user,

This looks exactly like what I'm looking for. Feel free to close this as a dupe.

Comment by Kelsey Schubert [ 27/Jun/17 ]

Hi kevin.arhelger,

Would the work described in SERVER-17013 provide the functionality you're looking for?

Thanks,
Thomas
