Details
- Bug
- Resolution: Done
- Major - P3
- None
- 5.0.5, 5.1.1
- None
- None
- Sharding EMEA 2021-12-27
- (copied to CRM)
Description
Four migration coordinator documents were observed on a single shard of a cluster, which caused this invariant to be hit on step-up.
The documents all related to migrations for different namespaces, and their states were:
- 2 aborted
- 1 committed
- 1 without decision
The range deletions appear to have been handled correctly on both the donor and the recipients:
- No range deletion documents for the aborted migrations (range deletion tasks already executed)
- Ready range deletion task on the donor for the committed migration
- Pending range deletions on donor/receiver for the migration without decision
Given the state of the "decided" migrations, we can conclude that:
- _abortMigrationOnDonorAndRecipient worked well.
- _commitMigrationOnDonorAndRecipient worked well.
It is therefore very likely that something went wrong right afterwards, during the call to forgetMigration, which did not remove the migration coordinator documents.
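To make the observed state easier to reason about, here is a minimal Python sketch that tallies coordinator documents by decision. Everything in it is illustrative: the namespaces are hypothetical, the documents are simplified stand-ins rather than real on-disk documents, the `decision` field name is an assumption, and the "at most one document" check is an assumption about what the invariant expects, suggested by the title of SERVER-62245 rather than confirmed here.

```python
from collections import Counter

# Hypothetical, simplified stand-ins for the four coordinator documents
# described above. Real documents carry more fields; the "decision" field
# name and the namespaces are assumptions made for illustration.
coordinator_docs = [
    {"nss": "db.coll_a", "decision": "aborted"},
    {"nss": "db.coll_b", "decision": "aborted"},
    {"nss": "db.coll_c", "decision": "committed"},
    {"nss": "db.coll_d"},  # no decision recorded
]


def summarize_decisions(docs):
    """Count documents per decision; a missing decision counts as 'undecided'."""
    return Counter(doc.get("decision", "undecided") for doc in docs)


def violates_single_recovery_assumption(docs):
    """Assumption: recovery on step-up expects at most one coordinator document."""
    return len(docs) > 1


summary = summarize_decisions(coordinator_docs)
print(dict(summary))
print(violates_single_recovery_assumption(coordinator_docs))
```

Under these assumptions, any leftover document from an already decided migration is enough to break the check once another migration is in flight, which is consistent with the suspicion that the cleanup step left documents behind.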
Issue Links
- related to
  - SERVER-62245 MigrationRecovery must not assume that only one migration needs to be recovered (Closed)
  - SERVER-62243 Wait for vector clock document majority-commit without timeout (Closed)