  Core Server / SERVER-62245

MigrationRecovery must not assume that only one migration needs to be recovered

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 5.3.0, 5.1.2, 5.0.6, 5.2.0-rc4
    • Affects Version/s: 5.0.0, 5.2.0, 5.1.0
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.2, v5.1, v5.0
    • Attachments:

      repro-62245.patch

      ./buildscripts/resmoke.py run --storageEngine=wiredTiger --storageEngineCacheSizeGB=.50 --suite=sharding  jstests/sharding/recover_multiple_migrations_on_stepup.js --log=file
      
    • Sprint: Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10

      Issue and status as of Dec 30, 2021

      ISSUE DESCRIPTION AND IMPACT

      This issue can cause unavailability of a shard in sharded clusters running MongoDB versions 5.0.0 - 5.0.5 and 5.1.0 - 5.1.1. Later versions are not affected.

      The problem can potentially occur if all of the following conditions have been met at least once:

      • More than one sharded collection
      • Multiple migrations
      • Intense write workloads or hardware failures

      Symptom of the bug: the mongod process crashes upon step-up due to an invariant failure with the following message: "Upon step-up a second migration coordinator was found".

      REMEDIATION AND WORKAROUNDS

      • Restart the nodes of the shard as a replica set
      • Double-check that at most one migration coordinator document does not have a definitive decision.
      • For each migration coordinator document with a definitive decision, double-check that the range deletion tasks are consistent with the migration coordinators (same range and collectionUUID, if present; see the shell sketch after this list):
        • Aborted decision:
          — No range deletion document on donor
          — Zero or one ready range deletion document on recipient
        • Committed decision:
          — Zero or one ready range deletion document on donor
          — No range deletion document on recipient
        • No decision:
          — One pending range deletion task on donor
          — One pending range deletion task on recipient
      • Majority-delete all migration coordinators with a definitive decision
      • Restart the nodes as a shard
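
      A minimal mongo shell sketch of these checks and of the cleanup step is shown below. It is an illustration only: run it against the primary of the relevant shard while the nodes are started as a plain replica set, run the range deletion checks on both donor and recipient, and note that the query shapes assume the 5.0-era document formats, in which an undecided migration coordinator has no "decision" field and a pending range deletion task carries "pending: true".

      // Shard-local metadata collections involved in migration recovery.
      const coordinators = db.getSiblingDB("config").migrationCoordinators;
      const rangeDeletions = db.getSiblingDB("config").rangeDeletions;

      // At most one migration coordinator may lack a definitive decision.
      const undecided = coordinators.find({decision: {$exists: false}}).toArray();
      if (undecided.length > 1) {
          print("WARNING: more than one migration coordinator without a decision");
      }

      // For each decided coordinator, print the related range deletion tasks so that
      // the range and collectionUUID can be compared by hand against the rules above.
      coordinators.find({decision: {$exists: true}}).forEach(coordinator => {
          printjson(coordinator);
          printjson(rangeDeletions.find({collectionUuid: coordinator.collectionUuid}).toArray());
      });

      // Once consistency has been confirmed, majority-delete the decided coordinators.
      coordinators.deleteMany({decision: {$exists: true}}, {writeConcern: {w: "majority"}});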

      TECHNICAL DETAILS

      Migration coordinators:

      • Documents persisted locally on shards in the internal collection config.migrationCoordinators 
      • The structure of migration coordinator documents can be found here; an illustrative example follows.
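
      As an illustration only (the authoritative structure is the definition referenced above), a 5.0-era migration coordinator document looks roughly like the following; the exact field names and value formats are recalled from that definition and should be treated as assumptions:

      {
          _id: UUID("..."),                         // migration id
          migrationSessionId: "shard0_shard1_...",  // session string identifying the migration
          lsid: {id: UUID("...")},                  // logical session used to commit/abort on the recipient
          txnNumber: NumberLong(1),
          nss: "test.foo",
          collectionUuid: UUID("..."),
          donorShardId: "shard0",
          recipientShardId: "shard1",
          range: {min: {x: 0}, max: {x: 10}},
          decision: "committed"                     // "committed" or "aborted"; absent while undecided
      }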

      Range deletion tasks:

      • Documents persisted locally on shards in the internal collection config.rangeDeletions
      • The structure of range deletion task documents can be found here; an illustrative example follows.
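
      Similarly, an illustrative (not authoritative) shape of a 5.0-era range deletion task document, with field names again recalled from the referenced definition and therefore to be treated as assumptions:

      {
          _id: UUID("..."),                    // on the donor this matches the migration id
          nss: "test.foo",
          collectionUuid: UUID("..."),
          donorShardId: "shard0",
          range: {min: {x: 0}, max: {x: 10}},
          whenToClean: "delayed",              // or "now"
          pending: true                        // present only while the migration is undecided
      }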

       


      --- Original ticket description ---

      There are several situations that can lead to more than one migration (for different collections) needing recovery on stepup. For example, when a migration fails here, we only clear the collection's filtering metadata so that the next access to the collection will trigger the recovery, and then release the ActiveMigrationRegistry. At that point nothing prevents a migration to a different collection from starting, so if the shard then stepped down it would have two migrations to recover.

      This invariant, along with taking the MigrationBlockingGuard during stepup migration recovery, was added in SERVER-50174. It was meant to prevent migrations to different collections from starting before the unfinished migrations found on stepup are recovered. However, as described above, situations where there are multiple migrations pending recovery can still arise without a stepdown being involved.

      The fact that a different migration (to another collection) starts using the same lsid as the migration pending recovery should not be a problem. The new migration will use a txnNumber that is two greater than the previous migration's, which is effectively the same as advancing the txn number: it will prevent the first migration from using its original (lsid, txnNumber) pair. The fact that a recovering migration gets a TransactionTooOld error when advancing the txnNumber on the recipient is not fully safe to ignore, because TransactionTooOld does not guarantee that a rollback can't occur, after which the original txnNumber could still be valid.

      This ticket will provide a fix so that clusters already in the faulty situation of having several migrations pending recovery no longer hit the invariant on stepup. SERVER-62296 will prevent this faulty situation from happening again.

            Assignee:
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens
            Reporter:
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens
            Votes:
            0
            Watchers:
            16
