Type: Bug
Resolution: Fixed
Priority: Major - P3
Affects Version/s: None
Component/s: None
-
Cluster Scalability
-
Fully Compatible
-
ALL
-
ClusterScalability 25May-5June
-
200
When featureFlagReshardingNoRefreshApplyingAndBlockingWrites is enabled (default on FCV 9.0), the resharding coordinator calls _shardsvrDonorCriticalSectionStarted on donor shards and waits for it to return before proceeding toward commit. The command handler waits on awaitCriticalSectionAcquired(), which resolves as soon as the critical section lock is acquired. At that point the donor has not yet written its blocking-writes oplog entries, and _transitionState(kBlockingWrites) has not completed.
In that window, recipients can consume donor oplog entries and reach strict-consistency, which causes the coordinator to send _shardsvrCommitReshardCollection while the donor is still in preparing-to-block-writes. This results in a crash.
Change the donor command handler to wait on awaitInBlockingWritesOrError() instead of awaitCriticalSectionAcquired(), so that the coordinator does not commit until the donor has fully transitioned to kBlockingWrites.
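The gap between the two wait points can be illustrated with a small standalone model. This is a sketch only, not MongoDB server code: DonorModel, isReady, finishTransitionToBlockingWrites, and the kDonating initial state are hypothetical names, and the error path implied by "OrError" is not modelled. It only shows that a future tied to critical section acquisition becomes ready one step earlier than a future tied to the completed state transition.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical, simplified donor states. kDonating is an assumed name for
// "any state before the critical section is taken".
enum class DonorState { kDonating, kPreparingToBlockWrites, kBlockingWrites };

// Hypothetical model of the donor. The accessor names mirror the ticket,
// but the implementation is illustrative only.
class DonorModel {
public:
    // Step 1: the critical section is acquired. Only the first milestone
    // resolves; the donor is still preparing to block writes.
    void acquireCriticalSection() {
        _state = DonorState::kPreparingToBlockWrites;
        _criticalSectionAcquired.set_value();
    }

    // Step 2: the oplog entries are written and the state transition
    // completes. Only now does the second milestone resolve.
    void finishTransitionToBlockingWrites() {
        _state = DonorState::kBlockingWrites;
        _inBlockingWrites.set_value();
    }

    std::shared_future<void> awaitCriticalSectionAcquired() const { return _csFuture; }
    std::shared_future<void> awaitInBlockingWritesOrError() const { return _bwFuture; }
    DonorState state() const { return _state; }

private:
    DonorState _state = DonorState::kDonating;
    std::promise<void> _criticalSectionAcquired;
    std::promise<void> _inBlockingWrites;
    // Declared after the promises so they are initialized from them.
    std::shared_future<void> _csFuture = _criticalSectionAcquired.get_future().share();
    std::shared_future<void> _bwFuture = _inBlockingWrites.get_future().share();
};

// Non-blocking readiness check for a future.
inline bool isReady(const std::shared_future<void>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

In this model, a caller that returns as soon as awaitCriticalSectionAcquired() is ready can observe the donor in kPreparingToBlockWrites, which corresponds to the window described above. A caller that waits on awaitInBlockingWritesOrError() can only observe kBlockingWrites.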