Concurrent movePrimary and removeShard can move a database to a no-longer-existent shard

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 6.0.4, 6.2.0-rc5, 6.3.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v6.2, v6.0
    • Sprint: Sharding EMEA 2022-11-14, Sharding EMEA 2022-11-28, Sharding EMEA 2022-12-12, Sharding EMEA 2022-12-26
    • 3

      Consider the following interleaving:

      1. movePrimary starts.
      2. movePrimary completes the cloning phase and is about to execute the commit phase.
      3. Concurrently, the user runs removeShard on the destination shard.
      4. movePrimary commits the metadata change on the configsvr, writing the no-longer-existent destination shardId as the primary of the moved database.

      As a result, the moved database becomes inaccessible and data may be lost if the removed shard is destroyed.
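
      The interleaving above can be replayed against a toy model. This is a minimal sketch, not server code: ToyConfigCatalog, its methods, and the shard and database names are all hypothetical, and the "checked" commit is only one conceivable guard (validating the destination shard in the same critical section as the metadata write), not necessarily the fix that shipped.

```python
import threading


class ToyConfigCatalog:
    """Hypothetical in-memory stand-in for the config server's metadata."""

    def __init__(self, shards, db_primaries):
        self.shards = set(shards)
        self.db_primaries = dict(db_primaries)
        self._lock = threading.Lock()

    def remove_shard(self, shard_id):
        with self._lock:
            self.shards.discard(shard_id)

    def commit_move_primary_unchecked(self, db, to_shard):
        # Mirrors the reported behavior: the commit never re-checks
        # that the destination shard still exists.
        with self._lock:
            self.db_primaries[db] = to_shard

    def commit_move_primary_checked(self, db, to_shard):
        # Assumed guard: existence check and write share one critical section.
        with self._lock:
            if to_shard not in self.shards:
                raise RuntimeError(f"destination shard {to_shard!r} no longer exists")
            self.db_primaries[db] = to_shard

    def dangling_primaries(self):
        # Databases whose recorded primary is not a known shard.
        with self._lock:
            return {db: s for db, s in self.db_primaries.items() if s not in self.shards}


def replay_interleaving(checked):
    catalog = ToyConfigCatalog({"shard0", "shard1"}, {"mydb": "shard0"})
    # Steps 1-2: movePrimary has cloned "mydb" to shard1 and is about to commit.
    # Step 3: removeShard runs on the destination shard.
    catalog.remove_shard("shard1")
    # Step 4: movePrimary commits the metadata change.
    try:
        if checked:
            catalog.commit_move_primary_checked("mydb", "shard1")
        else:
            catalog.commit_move_primary_unchecked("mydb", "shard1")
    except RuntimeError:
        pass  # The guarded commit aborts and the old primary is kept.
    return catalog.dangling_primaries()
```

      Replaying with checked=False leaves "mydb" pointing at the removed shard, while checked=True keeps the original primary because the commit is rejected.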

            Assignee:
            Antonio Fuschetto
            Reporter:
            Jordi Serra Torrens
            Votes:
            0
            Watchers:
            4