Removing a shard with 'uncommitted' documents in config.rangeDeletions on migration recipient can lead to incomplete state on donor

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding
    • Catalog and Routing
    • ALL
    • 3

      The following scenario can occur:

      1. Shard X (the donor) migrates a chunk to shard Y (the recipient), and the migration completes.
      2. Before the donor deletes the config.rangeDeletions document on the recipient, shard Y migrates that same chunk to some other shard and is then removed from the cluster.
      3. Shard X receives ShardNotFound for either of the commands it sends to the recipient, so it never updates its local config.rangeDeletions document. The failure repeats even after failover, which leaves permanent orphans on shard X and makes it impossible to migrate an overlapping chunk back to shard X.
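
      The sequence above can be sketched as a toy model. This is not MongoDB code: the `Cluster`, `Shard`, `start_migration`, `finish_migration`, and `reproduce` names are hypothetical, and representing each config.rangeDeletions document as a dict with a `pending` flag is an assumption made only for illustration. The sketch shows how a donor that treats ShardNotFound as a retryable failure never reaches the step that updates its own document.

```python
from dataclasses import dataclass, field


class ShardNotFound(Exception):
    """Stand-in for the ShardNotFound error returned for a removed shard."""


@dataclass
class Shard:
    name: str
    # Toy stand-in for config.rangeDeletions: chunk range -> document.
    # The "pending" flag is an assumption of this model.
    range_deletions: dict = field(default_factory=dict)


class Cluster:
    def __init__(self, *names):
        self.shards = {name: Shard(name) for name in names}

    def get(self, name):
        if name not in self.shards:
            raise ShardNotFound(name)
        return self.shards[name]

    def remove_shard(self, name):
        del self.shards[name]


def start_migration(cluster, donor, recipient, chunk):
    # Both sides record a range deletion document for the chunk.
    cluster.get(donor).range_deletions[chunk] = {"pending": True}
    cluster.get(recipient).range_deletions[chunk] = {"pending": True}


def finish_migration(cluster, donor, recipient, chunk):
    """Donor-side cleanup after the migration completes.

    Returns True once the donor's local document has been updated.
    """
    try:
        # Delete the recipient's document first.
        cluster.get(recipient).range_deletions.pop(chunk, None)
    except ShardNotFound:
        # Hypothetical handling: give up and retry later.
        # The local document is left untouched.
        return False
    cluster.get(donor).range_deletions[chunk]["pending"] = False
    return True


def reproduce(retries=3):
    cluster = Cluster("X", "Y", "Z")
    chunk = (0, 100)

    # Step 1: X migrates the chunk to Y, but X has not cleaned up yet.
    start_migration(cluster, "X", "Y", chunk)

    # Step 2: Y migrates the same chunk onward, then Y is removed.
    start_migration(cluster, "Y", "Z", chunk)
    finish_migration(cluster, "Y", "Z", chunk)
    cluster.remove_shard("Y")

    # Step 3: every retry on X fails the same way, even after "failover".
    attempts = [finish_migration(cluster, "X", "Y", chunk) for _ in range(retries)]
    return attempts, cluster.get("X").range_deletions[chunk]
```

      In this model, every retry on shard X fails in the same place, so its document stays in the pending state no matter how many times the cleanup is attempted. One possible direction, stated here only as an assumption, is for the donor to treat ShardNotFound from the recipient as "nothing left to delete" and proceed to update its local document.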

            Assignee:
            [DO NOT USE] Backlog - Catalog and Routing
            Reporter:
            Matthew Saltz (Inactive)
            Votes:
            0
            Watchers:
            4

              Created:
              Updated: