[SERVER-27590] Duplicate documents in multiple shards Created: 05/Jan/17 Updated: 27/Oct/23 Resolved: 06/Jan/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Dharshan Rangegowda | Assignee: | Unassigned |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Hi, We have a sharded collection that uses a hashed index on "_id" as the shard key. We started with 2 shards and added one more. However, we are finding duplicate documents with the same _id on both shard-0 and shard-2. We identified this by connecting directly to the primary of each shard. A few other observations Are these duplicates from a failed migration? If so, why does mongod not clean them up? |
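The check described above (comparing _id values read directly from each shard's primary) can be sketched as follows. This is only an illustration: the function name `find_duplicate_ids` and the sample data are hypothetical, and it assumes the _id values have already been fetched from each shard into plain lists.

```python
from collections import defaultdict


def find_duplicate_ids(ids_by_shard):
    """Return {_id: sorted shard names} for every _id seen on more than one shard.

    ids_by_shard maps a shard name to the _id values read by connecting
    directly to that shard (hypothetical input shape, not a MongoDB API).
    """
    owners = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            owners[doc_id].add(shard)
    # Keep only the ids that more than one shard reports.
    return {
        doc_id: sorted(shards)
        for doc_id, shards in owners.items()
        if len(shards) > 1
    }


# Made-up sample: _id 3 is present on both shard-0 and shard-2.
sample = {
    "shard-0": [1, 2, 3],
    "shard-1": [4, 5],
    "shard-2": [3, 6],
}
print(find_duplicate_ids(sample))  # {3: ['shard-0', 'shard-2']}
```

For a large collection the _id lists would not fit in memory, so this is a sketch of the comparison only, not a recommended tool.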
| Comments |
| Comment by Kelsey Schubert [ 11/Jan/17 ] |
|
Please take a look. For MongoDB-related support discussion, please post on the mongodb-user group or Stack Overflow with the mongodb tag. If you have a recommendation to improve our documentation, please feel free to open a DOCS ticket describing the change here or by clicking the "report a problem" link on the lower right of any manual page. Kind regards, |
| Comment by Dharshan Rangegowda [ 06/Jan/17 ] |
|
Hi Kal, Does the cleanupOrphaned command work for hash-based sharding? The documentation does not say either way, so it would be good to call it out. Also, is there an equivalent method to display the orphaned documents before we run the cleanupOrphaned command? If not, I would like to request one. |
| Comment by Kaloian Manassiev [ 06/Jan/17 ] |
|
As you correctly point out, these orphaned documents must have come from a failed migration (or a failed cleanup). MongoS filters them out because it transmits additional information that lets shards know which document ranges they own; this does not happen if you connect directly to the shard or if you use a secondary read preference.
Unfortunately, shards currently have no way of resuming a failed cleanup, which is a limitation we are aware of. MongoDB supports the cleanupOrphaned command, which can be run manually to delete these orphaned documents. Hope this helps. Best regards, |
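A minimal sketch of running cleanupOrphaned manually, assuming the looping pattern in the MongoDB manual for server versions of this era: the command is sent to the admin database on the shard's primary, and each reply's `stoppedAtKey` is passed back as `startingFromKey` until no key is returned. The helper name `cleanup_all_orphans` and the `run_admin_command` parameter are hypothetical; the latter stands in for any callable that sends a command document and returns the reply as a dict.

```python
def cleanup_all_orphans(run_admin_command, namespace):
    """Call cleanupOrphaned repeatedly until the shard reports no more ranges.

    run_admin_command: hypothetical callable taking a command document (dict)
        and returning the server reply as a dict.
    namespace: "<database>.<collection>" of the sharded collection.
    Returns the number of cleanupOrphaned calls that were made.
    """
    next_key = {}  # an empty document starts from the lowest shard key value
    passes = 0
    while next_key is not None:
        # The command name must be the first key in the document.
        result = run_admin_command(
            {"cleanupOrphaned": namespace, "startingFromKey": next_key}
        )
        if result.get("ok") != 1:
            raise RuntimeError("cleanupOrphaned failed: %r" % (result,))
        passes += 1
        # A missing or null stoppedAtKey means the whole key space was covered.
        next_key = result.get("stoppedAtKey")
    return passes
```

With PyMongo, the callable could be `MongoClient(shard_primary_uri).admin.command`, where `shard_primary_uri` is a placeholder for a direct connection to the shard's primary (not to mongos), for example `cleanup_all_orphans(client.admin.command, "mydb.mycoll")` with a made-up namespace.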
| Comment by Dharshan Rangegowda [ 05/Jan/17 ] |
|
One more observation: if we run an aggregation on the shard's primary, it does not find the duplicates. But if we run an aggregation on the shard with read preference secondary, it finds these duplicate documents, so this might be another issue. |