Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.2.0-rc0, 7.0.6, 5.0.25, 6.0.14
Affects Version/s: None
Component/s: None
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v7.2, v7.1, v7.0, v6.0, v5.0
Linked BF Score: 120
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

As seen in https://jira.mongodb.org/browse/BF-30264, a recipient primary may step down while resharding is in progress, and the step-up process does not wait for that step-down to complete. When resharding completes on the recipient, the current primary deletes the recipient state document, and the deletion is then replicated to the secondaries. A secondary that was previously the primary still holds a stale ActiveInstance, because the step-up did not wait for the step-down to complete. When that secondary applies the deletion of the state document, the deletion triggers the instance's cleanup, and the invariant fails because the task in the GuaranteedExecutor did not run before the deletion. To avoid such scenarios, ReshardingDataReplication must join the ReshardingOplogFetcher thread pool.
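The general pattern behind the fix (join the worker pool before checking that a queued task has run) can be illustrated with a minimal Python sketch. This is not the MongoDB server code, which is C++; `FetcherSketch`, `schedule`, `join`, and `run_scenario` are hypothetical names invented for illustration, and Python's `ThreadPoolExecutor` only stands in for the fetcher's thread pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FetcherSketch:
    """Hypothetical stand-in for a component that owns a thread pool."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self.task_ran = threading.Event()

    def schedule(self, release):
        # The task blocks until `release` is set, mimicking work that is
        # still pending when the owning instance is torn down.
        def task():
            release.wait()
            self.task_ran.set()

        self._pool.submit(task)

    def join(self):
        # shutdown(wait=True) returns only after every submitted task
        # has finished; calling it more than once is harmless.
        self._pool.shutdown(wait=True)


def run_scenario(join_before_check):
    """Return whether the task had run at the moment cleanup checked it."""
    fetcher = FetcherSketch()
    release = threading.Event()
    fetcher.schedule(release)

    if join_before_check:
        # Let the task finish shortly, then wait for the pool to drain.
        threading.Timer(0.05, release.set).start()
        fetcher.join()

    # Analogue of the invariant: the task must have run by this point.
    ran = fetcher.task_ran.is_set()

    # Always release and join afterwards so no thread is left running.
    release.set()
    fetcher.join()
    return ran


if __name__ == "__main__":
    print("joined first:", run_scenario(True))
    print("not joined:  ", run_scenario(False))
```

In this sketch the check succeeds only when the pool is joined first; without the join, the check runs while the task is still pending, which is the analogue of cleanup running against a stale instance.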

Assignee: Nandini Bhartiya
Reporter: Nandini Bhartiya
Participants: Githook User, Nandini Bhartiya
Votes: 0
Watchers: 3

Created: Oct 31 2023 04:45:47 PM UTC
Updated: Jan 25 2024 05:45:21 PM UTC
Resolved: Nov 02 2023 09:34:13 PM UTC
Confidence Status Last Update: 02/Nov/23 9:32 PM
