Concurrent movePrimary and removeShard can move a database to a no-longer-existent shard

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 6.0.4, 6.2.0-rc5, 6.3.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Fully Compatible
    • ALL
    • v6.2, v6.0
    • Sprint: Sharding EMEA 2022-11-14, Sharding EMEA 2022-11-28, Sharding EMEA 2022-12-12, Sharding EMEA 2022-12-26
    • 3

      Consider the following interleaving:

      1. movePrimary starts.
      2. movePrimary completes the cloning phase and is about to execute the commit phase.
      3. Concurrently, the user runs removeShard on the destination shard.
      4. movePrimary now commits the metadata change on the config server, writing the no-longer-existent destination shardId as the primary shard of the moved database.

      As a result, the moved database becomes inaccessible and data may be lost if the removed shard is destroyed.
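
      The interleaving above can be reproduced with a toy model. This is only a minimal sketch: `ToyCatalog`, its methods, and the `check_shard_exists` flag are hypothetical names invented for illustration, not server code, and the guard shown is one possible way to reject the commit, not necessarily the fix that was applied for this ticket.

```python
# Toy model of the config server catalog. All names are hypothetical.
class ToyCatalog:
    def __init__(self, shards, databases):
        self.shards = set(shards)         # shard ids currently in the cluster
        self.databases = dict(databases)  # database name -> primary shard id

    def remove_shard(self, shard_id):
        # Simplified: draining of the shard is not modelled here.
        self.shards.discard(shard_id)

    def commit_move_primary(self, db, to_shard, check_shard_exists=False):
        # With check_shard_exists=False this mirrors the unguarded commit in step 4.
        if check_shard_exists and to_shard not in self.shards:
            raise RuntimeError(f"shard {to_shard!r} no longer exists")
        self.databases[db] = to_shard

    def dangling_databases(self):
        # Databases whose primary points at a shard that is not in the cluster.
        return sorted(db for db, s in self.databases.items() if s not in self.shards)


def run_interleaving(check_shard_exists):
    catalog = ToyCatalog(shards={"shard0", "shard1"}, databases={"test": "shard0"})
    # Steps 1-2: movePrimary of "test" to shard1 has finished cloning (not modelled).
    # Step 3: removeShard runs concurrently on the destination shard.
    catalog.remove_shard("shard1")
    # Step 4: movePrimary commits the new primary.
    try:
        catalog.commit_move_primary("test", "shard1", check_shard_exists)
    except RuntimeError:
        pass  # commit rejected; the old primary stays in place
    return catalog.databases["test"], catalog.dangling_databases()


if __name__ == "__main__":
    print(run_interleaving(check_shard_exists=False))
    print(run_interleaving(check_shard_exists=True))
```

      Without the guard, "test" ends up with a primary that is not in the shard set, which corresponds to the inaccessible database described above. With the guard, the commit fails and the database keeps its original primary.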

              Assignee:
              Antonio Fuschetto
              Reporter:
              Jordi Serra Torrens
              Votes:
              0
              Watchers:
              4
