[SERVER-62691] Remove shard does not wait for migrations to finish on the drained shard Created: 17/Jan/22  Updated: 06/Dec/22  Resolved: 18/Jan/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Marcos José Grillo Ramirez Assignee: [DO NOT USE] Backlog - Sharding EMEA
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-50144 Removing a shard with in-progress mig... Backlog
Assigned Teams:
Sharding EMEA
Operating System: ALL
Participants:

 Description   

The removeShard command only checks locally, on the config server, that the shard being removed no longer owns any chunks. However, this check can pass right after the last migration commits the chunk on the config server but before that migration has finished its cleanup. As a result, important persistence cleanup tasks, such as starting the donor shard's range deletion, removing the recipient shard's range deletion document, and even removing the coordinator document, might never be executed if a user shuts down the shard immediately after receiving a successful result from the removeShard command.

removeShard should check with the draining shard that all migrations have finished successfully.
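
The race described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's implementation: the ConfigServer and DrainingShard classes, their methods, and the "ongoing"/"completed" return strings are hypothetical stand-ins invented for illustration, and no real MongoDB API is called.

```python
from dataclasses import dataclass, field


@dataclass
class DrainingShard:
    """Hypothetical stand-in for the shard being removed (not a MongoDB API)."""

    # Cleanup steps that a committed migration still has to run on this shard.
    pending_cleanup: set = field(default_factory=set)

    def migrations_finished(self) -> bool:
        # True only once every post-commit cleanup step has completed.
        return not self.pending_cleanup


@dataclass
class ConfigServer:
    """Hypothetical stand-in for the config server's view of chunk ownership."""

    chunks_owned: dict  # shard name -> number of chunks it still owns

    def remove_shard_local_only(self, name: str) -> str:
        # Behaviour described in the ticket: only the local chunk count is checked.
        return "completed" if self.chunks_owned.get(name, 0) == 0 else "ongoing"

    def remove_shard_checked(self, name: str, shard: DrainingShard) -> str:
        # Proposed behaviour (assumed shape): also ask the draining shard
        # whether its migrations have fully finished.
        if self.chunks_owned.get(name, 0) != 0:
            return "ongoing"
        return "completed" if shard.migrations_finished() else "ongoing"


if __name__ == "__main__":
    # The last migration has committed (no chunks left), but cleanup is pending.
    config = ConfigServer(chunks_owned={"shard0": 0})
    shard = DrainingShard(pending_cleanup={"donor range deletion"})
    print(config.remove_shard_local_only("shard0"))      # completed
    print(config.remove_shard_checked("shard0", shard))  # ongoing
```

In this model the local-only check reports success while cleanup is still outstanding, which is the window in which shutting down the shard would lose that work. The checked variant keeps reporting "ongoing" until the draining shard confirms that nothing is pending.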



 Comments   
Comment by Marcos José Grillo Ramirez [ 18/Jan/22 ]

esha.maharishi yes, it is the same as SERVER-50144, I'll mark it as a duplicate.

Comment by Esha Maharishi (Inactive) [ 18/Jan/22 ]

marcos.grillo, just a heads up that this may be a dupe of SERVER-50144 and/or SERVER-50146.
