Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 5.0.5, 5.1.1
Component/s: None
Labels: None

Sprint: Sharding EMEA 2021-12-27
Case:
Confidence Status: None
Work Order: 3
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

On a cluster, 4 migration coordinator documents were found on a single shard, which caused this invariant to be hit on step-up.

The documents all referred to migrations of different namespaces, and their states were:

2 aborted
1 committed
1 without decision
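
As an illustration of the breakdown above, here is a minimal sketch that tallies coordinator documents by decision. The sample documents are hypothetical, and the `nss` and `decision` field names are assumptions made for this example; a document with no `decision` field stands for a migration without decision.

```python
from collections import Counter

# Hypothetical sample data mirroring the breakdown above.
# Field names ("nss", "decision") are assumptions for illustration only.
coordinator_docs = [
    {"nss": "db.a", "decision": "aborted"},
    {"nss": "db.b", "decision": "aborted"},
    {"nss": "db.c", "decision": "committed"},
    {"nss": "db.d"},  # no decision recorded
]


def tally_decisions(docs):
    """Count coordinator documents per decision state."""
    return Counter(doc.get("decision", "undecided") for doc in docs)


counts = tally_decisions(coordinator_docs)
print(dict(counts))
```

On a real shard the same tally would be run against the documents actually read from the node, not against hard-coded data.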

The range deletions appear to have been handled correctly on both the donor and the recipients:

No range deletion documents for the aborted migrations (range deletion tasks already executed)
Ready range deletion task on the donor for the committed migration
Pending range deletion tasks on both donor and recipient for the migration without decision

Given the state of the "decided" migrations, we can assume that:

_abortMigrationOnDonorAndRecipient worked well.
_commitMigrationOnDonorAndRecipient worked well.

It is therefore very likely that something went wrong right afterwards, during the call to forgetMigration, which failed to remove the migration coordinator documents.
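
To make the failure mode concrete, the sketch below contrasts a recovery step that assumes at most one leftover coordinator document with one that handles each document on its own. It is not the server's code: the function names, field names, sample documents, and action strings are all hypothetical, and the `RuntimeError` only stands in for the invariant mentioned above.

```python
# Hypothetical leftover documents; "nss" and "decision" are assumed field names.
leftover_docs = [
    {"nss": "db.a", "decision": "aborted"},
    {"nss": "db.b", "decision": "aborted"},
    {"nss": "db.c", "decision": "committed"},
    {"nss": "db.d"},  # no decision recorded
]


def recover_assuming_single(docs):
    """Recovery that assumes at most one migration needs to be recovered."""
    if len(docs) > 1:
        # Stand-in for the invariant hit on step-up.
        raise RuntimeError(
            f"expected at most 1 coordinator document, found {len(docs)}"
        )
    return [doc["nss"] for doc in docs]


def recover_all(docs):
    """Recovery that treats every leftover document independently."""
    actions = []
    for doc in docs:
        decision = doc.get("decision")
        if decision in ("committed", "aborted"):
            # Decision already made: only cleanup of the document remains.
            actions.append((doc["nss"], f"{decision}: forget migration"))
        else:
            # No decision yet: it must be recovered before cleanup.
            actions.append((doc["nss"], "recover decision first"))
    return actions


for nss, action in recover_all(leftover_docs):
    print(nss, "->", action)
```

With four leftover documents, `recover_assuming_single` raises while `recover_all` produces one action per document.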

related to

SERVER-62245 MigrationRecovery must not assume that only one migration needs to be recovered (Closed)

SERVER-62243 Wait for vector clock document majority-commit without timeout (Closed)

Assignee: Pierlauro Sciarelli
Reporter: Pierlauro Sciarelli
Participants: Pierlauro Sciarelli
Votes: 0
Watchers: 4

Created: Dec 21 2021 06:49:25 PM UTC
Updated: Oct 31 2024 10:23:30 AM UTC
Resolved: Dec 23 2021 11:15:29 AM UTC
