Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: 5.3.0, 5.1.2, 5.0.6, 5.2.0-rc4
Affects Version/s: 5.0.0, 5.2.0, 5.1.0
Component/s: Sharding
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v5.2, v5.1, v5.0
Steps To Reproduce:
repro-62245.patch

./buildscripts/resmoke.py run --storageEngine=wiredTiger --storageEngineCacheSizeGB=.50 --suite=sharding jstests/sharding/recover_multiple_migrations_on_stepup.js --log=file
Sprint:
Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10
Case:
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Issue and status as of Dec 30, 2021

ISSUE DESCRIPTION AND IMPACT

This issue can make a shard unavailable in sharded clusters running MongoDB versions 5.0.0 - 5.0.5 and 5.1.0 - 5.1.1. Later versions are not affected.

The problem can potentially occur if all of the following conditions have been met at least once:

More than one sharded collection
Multiple migrations
Intense write workloads or hardware failures

Symptom of the bug: mongod process crashing upon step-up due to an invariant failure with the following message: "Upon step-up a second migration coordinator was found".

REMEDIATION AND WORKAROUNDS

1. Restart the nodes of the shard as a replica set.
2. Double-check that at most one migration coordinator document does not have a definitive decision.
3. For each migration coordinator document with a definitive decision, double-check that the range deletion tasks are consistent with the migration coordinator (same range and collectionUUID, if present):
   - Aborted decision:
     - No range deletion document on donor
     - Zero or one ready range deletion document on recipient
   - Committed decision:
     - Zero or one ready range deletion document on donor
     - No range deletion document on recipient
   - No decision:
     - One pending range deletion task on donor
     - One pending range deletion task on recipient
4. Majority-delete all migration coordinators with a definitive decision.
5. Restart the nodes as a shard.
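
The consistency rules above can be written down as a small checker. This is only a sketch that works on plain dictionaries already fetched from the donor and the recipient; it does not talk to MongoDB. The field names `decision`, `range`, `collectionUuid` and `pending`, and the decision values `"aborted"` and `"committed"`, are assumptions made for illustration and should be checked against the actual document structures before relying on it.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]

# NOTE: the field names "decision", "range", "collectionUuid" and "pending"
# are assumptions for this sketch, not confirmed by the ticket.


def matches(coord: Doc, task: Doc) -> bool:
    """Same range, and same collection UUID when both documents carry one."""
    if coord["range"] != task["range"]:
        return False
    c_uuid = coord.get("collectionUuid")
    t_uuid = task.get("collectionUuid")
    return c_uuid is None or t_uuid is None or c_uuid == t_uuid


def is_ready(task: Doc) -> bool:
    """A task is treated as ready when it is not flagged as pending."""
    return not task.get("pending", False)


def count_undecided(coords: List[Doc]) -> int:
    """Number of coordinators without a definitive decision (should be <= 1)."""
    return sum(1 for c in coords if c.get("decision") is None)


def check_coordinator(coord: Doc, donor_tasks: List[Doc],
                      recipient_tasks: List[Doc]) -> List[str]:
    """Return the list of inconsistencies found (empty when consistent)."""
    donor = [t for t in donor_tasks if matches(coord, t)]
    recipient = [t for t in recipient_tasks if matches(coord, t)]
    decision = coord.get("decision")
    problems: List[str] = []

    def expect_none(tasks: List[Doc], side: str) -> None:
        if tasks:
            problems.append(f"{decision}: expected no task on {side}")

    def expect_zero_or_one_ready(tasks: List[Doc], side: str) -> None:
        if len(tasks) > 1 or not all(is_ready(t) for t in tasks):
            problems.append(f"{decision}: expected zero or one ready task on {side}")

    def expect_one_pending(tasks: List[Doc], side: str) -> None:
        if len(tasks) != 1 or is_ready(tasks[0]):
            problems.append(f"no decision: expected one pending task on {side}")

    if decision == "aborted":
        expect_none(donor, "donor")
        expect_zero_or_one_ready(recipient, "recipient")
    elif decision == "committed":
        expect_zero_or_one_ready(donor, "donor")
        expect_none(recipient, "recipient")
    else:
        expect_one_pending(donor, "donor")
        expect_one_pending(recipient, "recipient")
    return problems
```

An empty result from `check_coordinator` means the coordinator agrees with the rules listed above; any other result should be investigated before deleting the coordinator document.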

TECHNICAL DETAILS

Migration coordinators:

Documents persisted locally on shards in the internal collection config.migrationCoordinators
The structure of migration coordinator documents can be found here.

Range deletion tasks:

Documents persisted locally on shards in the internal collection config.rangeDeletions

The structure of range deletion task documents can be found here

--- Original ticket description ---

There are several situations that can lead to more than one migration (for different collections) needing recovery on step-up. For example, when a migration fails here we only clear the collection's filtering metadata, so that the next access to the collection triggers the recovery, and then release the ActiveMigrationRegistry. At this point, nothing prevents a migration for a different collection from starting, so if the shard stepped down now it would have two migrations to recover.
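
The sequence described above can be illustrated with a toy model. Everything in it is hypothetical: `ToyShard`, its methods and the `nss` field are stand-ins invented for this sketch, not the server's classes, and the step-up check only mimics the invariant message quoted in the symptom.

```python
from typing import Dict, List, Optional


class ToyShard:
    """Toy stand-in for a donor shard; not the server's actual types."""

    def __init__(self) -> None:
        # Stands in for the single slot of the ActiveMigrationRegistry.
        self.active_collection: Optional[str] = None
        # Stands in for documents in config.migrationCoordinators.
        self.coordinator_docs: List[Dict[str, str]] = []

    def start_migration(self, collection: str) -> None:
        if self.active_collection is not None:
            raise RuntimeError("a migration is already active on this shard")
        self.active_collection = collection
        self.coordinator_docs.append({"nss": collection})

    def fail_migration(self) -> None:
        # The registry slot is released, but the coordinator document stays
        # until the next access to the collection triggers recovery.
        self.active_collection = None

    def step_up_expecting_one_migration(self) -> None:
        # Mimics the invariant that assumed at most one pending migration.
        if len(self.coordinator_docs) > 1:
            raise AssertionError(
                "Upon step-up a second migration coordinator was found")


def reproduce() -> ToyShard:
    shard = ToyShard()
    shard.start_migration("db.collA")
    shard.fail_migration()             # collA is not recovered yet
    shard.start_migration("db.collB")  # nothing prevents this
    return shard
```

Calling `reproduce().step_up_expecting_one_migration()` raises the assertion, because two coordinator documents are pending recovery when the toy shard steps up.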

This invariant, along with taking the MigrationBlockingGuard on step-up migration recovery, was added in SERVER-50174. It was meant to prevent migrations for different collections from starting before the unfinished migrations found on step-up are recovered. However, as described above, situations with multiple migrations pending recovery are still possible without a step-up being involved.

The fact that a different migration (for another collection) starts using the same lsid as the migration pending recovery should not be a problem. The new migration will use a txnNumber that is two more than that of the previous migration. This is effectively the same as advancing the txnNumber: it prevents the first migration from using its original (lsid, txnNumber) pair. The fact that a recovering migration gets a TransactionTooOld error when advancing the txnNumber on the recipient is not fully safe to ignore, because TransactionTooOld does not guarantee that a rollback can't occur, after which the original txnNumber could still be valid.
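
The effect of the higher txnNumber can be sketched with a toy session that only remembers the highest txnNumber it has seen for one lsid. `ToySession` and its method are hypothetical names for this illustration. The model is purely in-memory, so it does not capture the rollback caveat mentioned above.

```python
from typing import List


class TransactionTooOld(Exception):
    """Toy counterpart of the TransactionTooOld error."""


class ToySession:
    """Remembers the highest txnNumber seen for a single lsid (toy model)."""

    def __init__(self) -> None:
        self.last_txn_number = -1

    def begin_or_continue(self, txn_number: int) -> None:
        if txn_number < self.last_txn_number:
            raise TransactionTooOld(
                f"txnNumber {txn_number} is older than {self.last_txn_number}")
        self.last_txn_number = txn_number


def demo() -> List[str]:
    session = ToySession()
    outcomes = []
    first = 5           # txnNumber of the migration pending recovery
    second = first + 2  # the next migration uses a txnNumber two higher
    for label, txn in (("first", first), ("second", second), ("first again", first)):
        try:
            session.begin_or_continue(txn)
            outcomes.append(f"{label}: ok")
        except TransactionTooOld:
            outcomes.append(f"{label}: TransactionTooOld")
    return outcomes
```

In this model, once the second migration has used the higher txnNumber, the first migration can no longer reuse its original pair, which is the same outcome as explicitly advancing the txnNumber.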

This ticket will provide a fix so that clusters that are already in the faulty situation of having several migrations pending recovery no longer hit the invariant on step-up. SERVER-62296 will prevent this faulty situation from happening again.

repro-62245.patch
5 kB
Dec 23 2021 03:11:13 PM UTC

is caused by

SERVER-50174 Multiple concurrent migration recoveries after step-up can race for the fixed Lsid/TxnNumber

Closed

is related to

SERVER-60521 Deadlock on stepup due to moveChunk command running uninterrupted on secondary

Closed

SERVER-62213 Investigate presence of multiple migration coordinator documents

Closed

SERVER-62243 Wait for vector clock document majority-commit without timeout

Closed

SERVER-62316 Remove the workaround for SERVER-62245 once 6.0 branches out

Closed

related to

SERVER-62296 MoveChunk should recover any unfinished migration before starting a new one

Closed

Assignee: Jordi Serra Torrens
Reporter: Jordi Serra Torrens
Participants: Githook User, Jordi Serra Torrens, Tommaso Tocci
Votes: 0
Watchers: 16

Created: Dec 23 2021 02:44:40 PM UTC
Updated: Nov 07 2024 02:12:52 PM UTC
Resolved: Dec 30 2021 12:44:51 PM UTC
