  1. Core Server
  2. SERVER-53506

Deal with the possibility of writes coming into the coordinator after the resharding operation has finished

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Sprint:
      Sharding 2021-03-08, Sharding 2021-03-22
    • Story Points:
      2

      Description

      It's possible that:

      1. The last shard sends the coordinator a write indicating that it has finished the resharding operation.
      2. The write is delayed in transit due to a network error, so the shard never sees it acknowledged.
      3. The shard sends the write a second time, which the coordinator applies, finishing the resharding operation.
      4. The first write from the shard finally reaches the coordinator and trips the invariant that the resharding instance for that UUID still exists, because the operation has already finished.
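
The four steps above can be reproduced in a small toy model. This is only a sketch: `ToyCoordinator`, `InvariantFailure`, and `apply_finished_write` are hypothetical names invented for illustration, not the server's actual classes, and the model assumes the coordinator drops its per-UUID state as soon as the last shard reports completion.

```python
import uuid


class InvariantFailure(Exception):
    """Stands in for a tripped server invariant (hypothetical)."""


class ToyCoordinator:
    """Toy stand-in for the resharding coordinator; not server code."""

    def __init__(self, op_id, shards):
        # Maps operation UUID -> shards that have not yet reported completion.
        self.instances = {op_id: set(shards)}

    def apply_finished_write(self, op_id, shard):
        # Models the invariant: an instance for this UUID must still exist.
        if op_id not in self.instances:
            raise InvariantFailure("no resharding instance for this UUID")
        pending = self.instances[op_id]
        pending.discard(shard)
        if not pending:
            # The last shard reported: the operation finishes and its state is removed.
            del self.instances[op_id]


op_id = uuid.uuid4()
coordinator = ToyCoordinator(op_id, ["shard0", "shard1"])
coordinator.apply_finished_write(op_id, "shard0")

# Steps 1-2: shard1's first write is held up in the network, not yet delivered.
delayed_write = (op_id, "shard1")

# Step 3: shard1 retries; the retry arrives first and finishes the operation.
coordinator.apply_finished_write(op_id, "shard1")

# Step 4: the delayed first write finally arrives and trips the invariant.
try:
    coordinator.apply_finished_write(*delayed_write)
    invariant_tripped = False
except InvariantFailure:
    invariant_tripped = True
```

In this model `invariant_tripped` ends up true because the second copy of the write removed the only record of the operation before the first copy was applied.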

      Max Hirschhorn proposed two solutions:

      1. Have the donors and recipients use retryable writes so that the shard doesn't attempt the write again if it's already been applied, and
      2. Have the donors and recipients use a precondition that will make the write a no-op if the write has already been applied.
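
Both proposals can be sketched against a toy coordinator. Everything here is an assumption made for illustration: the class and method names are hypothetical, the retryable-write option is modeled as de-duplication on a (session id, transaction number) pair that outlives the operation's own state, and the precondition option is modeled as "apply only if the instance exists and this shard is still pending". It is not the server's implementation.

```python
class InvariantFailure(Exception):
    """Stands in for a tripped server invariant (hypothetical)."""


class ToyCoordinator:
    """Toy stand-in for the resharding coordinator; not server code."""

    def __init__(self, op_id, shards):
        self.instances = {op_id: set(shards)}  # UUID -> shards still pending
        self.applied = set()  # (session_id, txn_number) pairs already executed

    def _apply(self, op_id, shard):
        if op_id not in self.instances:
            raise InvariantFailure("no resharding instance for this UUID")
        self.instances[op_id].discard(shard)
        if not self.instances[op_id]:
            del self.instances[op_id]  # operation finished, state removed

    def apply_retryable(self, session_id, txn_number, op_id, shard):
        """Option 1: a write that was already executed is recognized and skipped."""
        key = (session_id, txn_number)
        if key in self.applied:
            return False  # duplicate delivery, nothing to do
        self._apply(op_id, shard)
        self.applied.add(key)
        return True

    def apply_with_precondition(self, op_id, shard):
        """Option 2: the write only matches while it still has an effect."""
        pending = self.instances.get(op_id)
        if pending is None or shard not in pending:
            return False  # precondition not met, so the write is a no-op
        self._apply(op_id, shard)
        return True


# Option 1: the original write and its retry share one (session_id, txn_number).
c1 = ToyCoordinator("op", ["shard0"])
retry_result = c1.apply_retryable("session-a", 7, "op", "shard0")    # retry lands first
delayed_result = c1.apply_retryable("session-a", 7, "op", "shard0")  # late original

# Option 2: the late original finds no matching instance and does nothing.
c2 = ToyCoordinator("op", ["shard0"])
first_result = c2.apply_with_precondition("op", "shard0")
late_result = c2.apply_with_precondition("op", "shard0")
```

In both variants the late copy of the write returns `False` instead of reaching the invariant. The first relies on bookkeeping kept outside the resharding instance, while the second needs no extra state but requires every such write to carry its own guard.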

              People

              Assignee:
              yuhong.zhang Yuhong Zhang
              Reporter:
              blake.oler Blake Oler
              Votes:
              0
              Watchers:
              3
