[SERVER-50144] Removing a shard with in-progress migration coordinators can leave permanently pending config.rangeDeletions document on recipient Created: 06/Aug/20  Updated: 26/Oct/23

Status: Backlog
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Matthew Saltz (Inactive) Assignee: Backlog - Catalog and Routing
Resolution: Unresolved Votes: 0
Labels: bkp, oldshardingemea
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-62691 Remove shard does not wait for migrat... Closed
Related
related to SERVER-50143 Removing a shard with in-progress tra... Backlog
related to SERVER-50146 Removing a shard with 'uncommitted' d... Backlog
related to SERVER-60767 Ensure that the execution of removeSh... Backlog
is related to SERVER-61055 Blacklist tests that call removeShard... Closed
Assigned Teams:
Catalog and Routing
Operating System: ALL
Sprint: Sharding 2020-11-02, Sharding 2020-11-16, Sharding 2020-11-30, Sharding 2020-12-14, Sharding 2020-12-28, Sharding 2021-01-11, Sharding EMEA 2021-10-04, Sharding EMEA 2021-10-18
Participants:
Linked BF Score: 143

 Description   

The following scenario can occur:

  1. Migration of a chunk from shard X to shard Y completes with a commit or abort decision, and the migration coordinator on shard X persists that decision
  2. Shard X is removed and shut down before the migration coordinator updates config.rangeDeletions on shard Y
  3. Shard Y is left with a document in config.rangeDeletions corresponding to that migration, with the 'pending: true' flag still set

The presence of this document will permanently prevent any migrations to shard Y for chunks overlapping this chunk.

In the case where the migration aborted after migrating some documents to shard Y, this will also leave documents in this chunk orphaned on shard Y.
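
The blocking condition described above can be illustrated with a small model. This is only a sketch, not server code: it treats config.rangeDeletions documents as plain Python dicts, assumes the fields `donorShardId`, `range.min`, `range.max` and `pending`, uses integer shard keys for simplicity, and the helper names (`ranges_overlap`, `stuck_pending_deletions`, `blocks_incoming_migration`) are hypothetical.

```python
def ranges_overlap(a_min, a_max, b_min, b_max):
    """Chunk ranges are treated as half-open intervals [min, max)."""
    return a_min < b_max and b_min < a_max


def stuck_pending_deletions(range_deletions, live_shard_ids):
    """Return pending documents whose donor shard is no longer in the cluster.

    With the donor gone, nothing remains to clear the 'pending' flag
    (the situation this ticket describes).
    """
    return [
        doc for doc in range_deletions
        if doc.get("pending") and doc["donorShardId"] not in live_shard_ids
    ]


def blocks_incoming_migration(range_deletions, chunk_min, chunk_max):
    """True if any pending document overlaps the incoming chunk range."""
    return any(
        doc.get("pending")
        and ranges_overlap(doc["range"]["min"], doc["range"]["max"],
                           chunk_min, chunk_max)
        for doc in range_deletions
    )


# Illustrative contents of config.rangeDeletions on shard Y after
# shard X was removed (values are made up for this example).
shard_y_range_deletions = [
    {"donorShardId": "shardX", "range": {"min": 0, "max": 100}, "pending": True},
]
live_shards = {"shardY", "shardZ"}

stuck = stuck_pending_deletions(shard_y_range_deletions, live_shards)
blocked = blocks_incoming_migration(shard_y_range_deletions, 50, 150)
not_blocked = blocks_incoming_migration(shard_y_range_deletions, 100, 200)
```

In this model the single leftover document is reported as stuck, a migration of [50, 150) to shard Y is blocked, and a migration of [100, 200) is not, because the ranges only touch at the boundary.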



 Comments   
Comment by Matthew Saltz (Inactive) [ 20/Aug/20 ]

This ticket should also re-enable the check for orphaned documents at the end of shard_removal_triggers_catalog_cache_invalidation.js; that check will be removed as part of SERVER-49713.

Generated at Thu Feb 08 05:21:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.