  1. Core Server
  2. SERVER-70321

Collmod coordinator must not resume migrations on retriable errors

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 6.3.0-rc0, 6.0.5
    • Affects Version/s: 6.0.2, 6.1.0-rc4, 6.2.0-rc0
    • Component/s: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v6.2, v6.0
    • Sprint: Sharding EMEA 2022-11-14, Sharding EMEA 2022-11-28, Sharding EMEA 2022-12-12, Sharding EMEA 2022-12-26, Sharding EMEA 2023-01-23, Sharding EMEA 2023-02-06
    • 120

      The collMod coordinator may resume migrations after hitting a retriable error.

      This can lead to an incorrect execution sequence such as the following:

      1. The collMod starts, stops migrations, and enters the kUpdateConfig phase.
      2. It hits a retriable error and unblocks migrations.
      3. It attempts to re-execute the kUpdateConfig phase, but this time with migrations unblocked.
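The sequence above can be sketched with a toy model. This is only an illustration, not the server's implementation: `ToyCollModCoordinator`, `RetriableError`, and the `resume_on_retriable_error` flag are hypothetical names introduced here. The model records whether migrations were allowed on each attempt of the kUpdateConfig phase.

```python
class RetriableError(Exception):
    """Stand-in for a transient failure (hypothetical name)."""


class ToyCollModCoordinator:
    """Toy model of the retry loop; not the real coordinator."""

    def __init__(self, failures_before_success, resume_on_retriable_error):
        self.migrations_allowed = True
        self.failures_left = failures_before_success
        self.resume_on_retriable_error = resume_on_retriable_error
        # One entry per kUpdateConfig attempt: were migrations allowed?
        self.update_config_attempts = []

    def _update_config(self):
        self.update_config_attempts.append(self.migrations_allowed)
        if self.failures_left > 0:
            self.failures_left -= 1
            raise RetriableError("transient failure")

    def run(self):
        self.migrations_allowed = False  # step 1: stop migrations
        while True:
            try:
                self._update_config()
                break
            except RetriableError:
                if self.resume_on_retriable_error:
                    # Behaviour described in the ticket: step 2 unblocks migrations.
                    self.migrations_allowed = True
                # In both variants the phase is re-executed (step 3).
        self.migrations_allowed = True  # resume only after the phase succeeded


buggy = ToyCollModCoordinator(1, resume_on_retriable_error=True)
buggy.run()
fixed = ToyCollModCoordinator(1, resume_on_retriable_error=False)
fixed.run()

print(buggy.update_config_attempts)  # [False, True]
print(fixed.update_config_attempts)  # [False, False]
```

In the first run the second kUpdateConfig attempt sees migrations allowed, which is the hazard described above. In the second run migrations stay blocked across the retry.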

       

      Keep in mind that we can't simply resume migrations on a non-retriable error either. Even after hitting a non-retriable error, we can't guarantee that the coordinator won't be recovered and re-executed from a new primary node following a stepdown.
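The stepdown concern can be illustrated in the same spirit. The sketch below assumes, purely for illustration, that the coordinator's phase lives in a persisted document that a new primary reads on recovery. The `cluster` dictionary, `naive_primary`, and `NonRetriableError` are hypothetical names, not server APIs.

```python
class NonRetriableError(Exception):
    """Stand-in for a fatal failure (hypothetical name)."""


def execute_update_config(cluster, should_fail):
    """Run the kUpdateConfig phase; return whether migrations were allowed."""
    observed = cluster["migrations_allowed"]
    if should_fail:
        raise NonRetriableError("fatal failure")
    cluster["coordinator_doc"] = None  # phase completed, state document removed
    return observed


def naive_primary(cluster, should_fail):
    """A primary that resumes migrations as soon as it sees a fatal error."""
    try:
        return execute_update_config(cluster, should_fail)
    except NonRetriableError:
        cluster["migrations_allowed"] = True  # the tempting but unsafe shortcut
        return None


# Shared, persisted state: migrations are blocked and the phase is recorded.
cluster = {"coordinator_doc": {"phase": "kUpdateConfig"}, "migrations_allowed": False}

naive_primary(cluster, should_fail=True)  # old primary fails, then steps down

# The state document survived, so a new primary recovers the coordinator.
recovered = cluster["coordinator_doc"] is not None
observed = naive_primary(cluster, should_fail=False)

print(recovered, observed)  # True True
```

Under these assumptions the recovered coordinator re-executes kUpdateConfig while migrations are allowed, so resuming on a non-retriable error would recreate the same problem.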

        1. collmod_changes.diff
          17 kB
          Allison Easton

            Assignee:
            allison.easton@mongodb.com Allison Easton
            Reporter:
            tommaso.tocci@mongodb.com Tommaso Tocci
            Votes:
            0
            Watchers:
            6
