featureFlagReshardingSkipCloningAndApplyingIfApplicable makes resharding critical section get acquired on a non-donor db primary shard before critical section is engaged by coordinator

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL

      SERVER-91109 reduced the performance impact of resharding by making the primary shard skip cloning documents and applying oplog entries when it is not a direct recipient for the collection being resharded. The design overlooked that, on a shard that is neither a donor nor a direct recipient, the transition to “strict-consistency” involves acquiring the critical section to prepare for the collection rename (SERVER-53653). As a result, the optimization causes the primary shard to acquire the critical section early, so misrouted writes can be blocked long before the coordinator officially engages the critical section when resharding is about to commit.
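
      The ordering problem described above can be illustrated with a toy model. This is only a sketch, not MongoDB server code: the function names, event strings, and parameters are all hypothetical. It also assumes, for illustration, that when the shard does not skip cloning and applying, its transition to “strict-consistency” happens after the coordinator engages the critical section.

```python
def primary_shard_timeline(is_donor, is_direct_recipient, skip_flag_enabled):
    """Return an ordered list of event names for a toy resharding run.

    Hypothetical model: event names and ordering are illustrative assumptions,
    not taken from the server implementation.
    """
    events = []
    # With the feature flag on, a shard that is not a direct recipient
    # skips both cloning and oplog application.
    skips_work = skip_flag_enabled and not is_direct_recipient
    # A shard that is neither donor nor direct recipient acquires the
    # critical section on entering strict-consistency (rename preparation).
    acquires_for_rename = not is_donor and not is_direct_recipient

    def enter_strict_consistency():
        events.append("shard: enter strict-consistency")
        if acquires_for_rename:
            events.append("shard: acquire critical section")

    if skips_work:
        # Nothing to clone or apply, so the transition happens immediately.
        enter_strict_consistency()
        events.append("coordinator: engage critical section")
    else:
        events.append("shard: clone documents")
        events.append("shard: apply oplog entries")
        # Assumed ordering for the non-skipping path (illustration only).
        events.append("coordinator: engage critical section")
        enter_strict_consistency()
    return events


def acquired_early(events):
    """True if the shard took the critical section before the coordinator did."""
    shard = "shard: acquire critical section"
    coordinator = "coordinator: engage critical section"
    return shard in events and events.index(shard) < events.index(coordinator)
```

      In this model, `acquired_early(primary_shard_timeline(False, False, True))` is true only for the combination the report describes: flag enabled, shard neither a donor nor a direct recipient.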

            Assignee:
            Unassigned
            Reporter:
            Cheahuychou Mao
            Votes:
            0
            Watchers:
            5

              Created:
              Updated: