Type: Improvement
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels: None
Case:
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Currently, there is no good way to count orphaned documents in a sharded cluster.

The current approach is to run

db.collection.find({ShardKey:{$gte: MinKey, $lte: MaxKey}},{ShardKey:1,_id:0}).itcount()

and compare the result to the sum of the counts taken on each shard individually.

This query requires a full index scan and streams the entire result set to the shell, which can take significant time on a large sharded cluster.
On a busy system, the numbers can be off significantly because inserts and deletes interleave with the count.
The approach also requires multiple queries to multiple hosts, which introduces timing errors.
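
The comparison described above amounts to simple arithmetic once the counts are collected. The sketch below is only an illustration: `estimateOrphans`, its parameter names, and the sample numbers are hypothetical and not part of any MongoDB API. It assumes the first count comes from the shard key range query shown above and that the per-shard counts were gathered separately on each shard.

```javascript
// Hypothetical helper (not a MongoDB API): estimates the number of orphaned
// documents from counts that were already collected by hand.
//   routedCount    - result of the shard key range query shown in this ticket
//   perShardCounts - object mapping a shard name to the count taken on that
//                    shard individually (shard names here are made up)
function estimateOrphans(routedCount, perShardCounts) {
  var shardTotal = 0;
  for (var shard in perShardCounts) {
    if (Object.prototype.hasOwnProperty.call(perShardCounts, shard)) {
      shardTotal += perShardCounts[shard];
    }
  }
  // Documents seen on individual shards but not in the range query result
  // are treated as orphans. On a busy system this is only an approximation.
  return { shardTotal: shardTotal, orphans: shardTotal - routedCount };
}

// Made-up numbers, for illustration only.
var result = estimateOrphans(1000, { shardA: 600, shardB: 450 });
console.log("sum of shard counts: " + result.shardTotal);
console.log("estimated orphans:   " + result.orphans);
```

Because the inputs are taken at different times on different hosts, the estimate inherits all of the timing problems listed above, which is the motivation for a server-side count.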

Since the logic to clean up orphans already exists, a similar command (or a parameter to the existing cleanupOrphaned) that counts them would be useful for determining the potential impact of orphans on a given sharded cluster.

duplicates

SERVER-17013 Add 'dry run' mode for cleanupOrphaned

Closed

Assignee: Kelsey Schubert
Reporter: Kevin Arhelger
Participants: Kelsey Schubert, Kevin Arhelger, Kevin Pulo
Votes: 0
Watchers: 8

Created: Jun 27 2017 08:08:24 PM UTC
Updated: Jul 29 2017 04:23:00 PM UTC
Resolved: Jun 27 2017 09:03:01 PM UTC
