Wait for kBlockingWrites before returning from _shardsvrDonorCriticalSectionStarted


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 9.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Cluster Scalability
    • Fully Compatible
    • ALL
    • ClusterScalability 25May-5June
    • 200

      When featureFlagReshardingNoRefreshApplyingAndBlockingWrites is enabled (the default on FCV 9.0), the resharding coordinator calls _shardsvrDonorCriticalSectionStarted on donor shards and waits for it to return before proceeding toward commit. The command handler waits on awaitCriticalSectionAcquired(), which resolves as soon as the critical section lock is acquired. At that point the donor has not yet written its blocking-writes oplog entries, and _transitionState(kBlockingWrites) has not completed.

      In this window, recipients can consume the donor's oplog entries and reach strict consistency, causing the coordinator to send _shardsvrCommitReshardCollection while the donor is still in preparing-to-block-writes. This results in a crash.

      Change the donor command to wait on awaitInBlockingWritesOrError() instead, so that the coordinator does not commit until the donor has fully transitioned to kBlockingWrites.
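
      The ordering problem can be illustrated with a minimal, single-threaded C++ sketch. It is a toy model under stated assumptions, not the server code: ToyDonor, its methods, and the enum are hypothetical names, and the two futures from the ticket are modeled as plain booleans that report whether the corresponding future would already have resolved.

```cpp
#include <cassert>

// Hypothetical, simplified donor states; only the two named in the ticket
// matter here, plus an assumed starting state.
enum class DonorState { kBeforeCriticalSection, kPreparingToBlockWrites, kBlockingWrites };

// Toy stand-in for the donor state machine (assumption: not the real class).
class ToyDonor {
public:
    // Step 1: the critical section lock is taken. The donor is still in
    // preparing-to-block-writes at this point.
    void acquireCriticalSection() {
        _state = DonorState::kPreparingToBlockWrites;
        _criticalSectionAcquired = true;
    }

    // Step 2: the donor writes its blocking-writes oplog entries and then
    // transitions to kBlockingWrites.
    void finishBlockingWrites() {
        assert(_state == DonorState::kPreparingToBlockWrites);
        _state = DonorState::kBlockingWrites;
        _inBlockingWrites = true;
    }

    // Models awaitCriticalSectionAcquired(): true once that future would resolve.
    bool criticalSectionAcquiredResolved() const { return _criticalSectionAcquired; }

    // Models awaitInBlockingWritesOrError(): true once that future would resolve
    // (the error path is omitted in this sketch).
    bool inBlockingWritesResolved() const { return _inBlockingWrites; }

    // A commit request is only safe once the donor is in kBlockingWrites.
    bool safeToCommit() const { return _state == DonorState::kBlockingWrites; }

private:
    DonorState _state = DonorState::kBeforeCriticalSection;
    bool _criticalSectionAcquired = false;
    bool _inBlockingWrites = false;
};

// Old behavior: the command returns as soon as the lock is acquired.
inline bool oldHandlerWouldReturn(const ToyDonor& d) {
    return d.criticalSectionAcquiredResolved();
}

// Proposed behavior: the command returns only after kBlockingWrites is reached.
inline bool newHandlerWouldReturn(const ToyDonor& d) {
    return d.inBlockingWritesResolved();
}
```

      After acquireCriticalSection() alone, oldHandlerWouldReturn() is already true while safeToCommit() is still false, which is the gap described above. newHandlerWouldReturn() becomes true only after finishBlockingWrites(), when safeToCommit() is also true.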

            Assignee:
            Kruti Shah
            Reporter:
            Kruti Shah
            Votes:
            0
            Watchers:
            4
