[SERVER-21179] Clean-up orphan chunk entries left from a failed shardCollection with initial split Created: 28/Oct/15  Updated: 06/Dec/22  Resolved: 03/May/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-33973 Force cleanup of possibly remaining p... Closed
Related
Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

If a shardCollection command with an initial chunk split fails, it leaves orphaned chunk entries behind, and these entries then prevent the shardCollection call from being retried.

Instead, if during a shardCollection call we discover that the collection is not marked as sharded, but there are existing chunks for it, these chunks should be cleaned up so the call can be retried.
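
A minimal sketch of the behaviour proposed above, using a toy in-memory model rather than the server's actual code. The `config` dictionary, the `shard_collection` function, and the document fields (`ns`, `min`, `max`, `sharded`) are all hypothetical names chosen for illustration; they do not reflect the real config server schema or implementation.

```python
# Toy in-memory stand-in for the config metadata (assumed shape, not the real schema):
#   config["collections"] maps a namespace to {"sharded": bool}
#   config["chunks"] is a list of {"ns", "min", "max"} documents


def shard_collection(config, ns, initial_chunks):
    """Hypothetical retry-friendly shardCollection.

    Returns the number of leftover chunk entries that were removed.
    """
    collections = config["collections"]
    chunks = config["chunks"]

    if collections.get(ns, {}).get("sharded"):
        raise RuntimeError(f"{ns} is already sharded")

    # The collection is not marked as sharded, yet chunk entries exist for it:
    # treat them as leftovers from an earlier failed attempt and drop them.
    stale = [c for c in chunks if c["ns"] == ns]
    if stale:
        chunks[:] = [c for c in chunks if c["ns"] != ns]

    # Write the initial chunks, then mark the collection as sharded.
    chunks.extend({"ns": ns, "min": lo, "max": hi} for lo, hi in initial_chunks)
    collections[ns] = {"sharded": True}
    return len(stale)
```

In this model, a retry after a failed attempt succeeds because the leftover entries for the namespace are removed before the new initial chunks are written. Chunk entries for other namespaces are left alone.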



 Comments   
Comment by Kevin Pulo [ 29/Oct/15 ]

I find the idea of clearing any existing chunks at shardCollection time to be scary. I'm concerned about users who may have lost a config.collections entry (a bad situation, but relatively easy to recover from) getting confused and accidentally re-running shardCollection, which then removes all the chunk documents, meaning they've lost all record of where their data is (disastrous and almost impossible to recover from).

SERVER-18787 means that users (currently) have very little visibility into any foreign/orphan/stray chunk documents.

Please can this be kept to only cleaning up stray chunks after failed shardCollection calls (which is the real problem here), and the existing behaviour of shardCollection discovering pre-existing chunks be left as an error?
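
The narrower behaviour requested in this comment could look roughly like the sketch below, again against a toy in-memory model and not the server's code. The names `shard_collection_strict`, `ExistingChunksError`, and the `fail_after` parameter (which only simulates a failure partway through the initial split) are hypothetical.

```python
class ExistingChunksError(Exception):
    """Raised when chunk entries already exist for an unsharded collection."""


def shard_collection_strict(config, ns, initial_chunks, fail_after=None):
    """Hypothetical shardCollection that only cleans up after its own failure."""
    collections = config["collections"]
    chunks = config["chunks"]

    if collections.get(ns, {}).get("sharded"):
        raise RuntimeError(f"{ns} is already sharded")

    # Pre-existing chunks stay an error: they may be the only record of where
    # the data lives, so they are never removed here.
    if any(c["ns"] == ns for c in chunks):
        raise ExistingChunksError(f"chunks already exist for {ns}")

    written = []
    try:
        for i, (lo, hi) in enumerate(initial_chunks):
            if fail_after is not None and i == fail_after:
                raise OSError("simulated failure during initial split")
            doc = {"ns": ns, "min": lo, "max": hi}
            chunks.append(doc)
            written.append(doc)
        collections[ns] = {"sharded": True}
    except Exception:
        # Remove only the entries this call wrote, then re-raise.
        written_ids = {id(doc) for doc in written}
        chunks[:] = [c for c in chunks if id(c) not in written_ids]
        raise
```

Here a failed call leaves no stray entries behind, so it can be retried, while a call that finds chunks it did not write fails without touching them.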
