Resharding donors should abort in-progress unprepared transactions upon transitioning to "preparing-to-block-writes" to lower the chance of critical section timeout


    • Cluster Scalability
    • ALL
    • ClusterScalability Jun23-Jul6
    • None
    • 3
    • TBD

      On a resharding donor, acquiring the critical section involves taking the S lock on the collection, which cannot be done while there are in-progress transactions involving that collection. That is, the lock acquisition blocks until those transactions commit or abort. Depending on how long the wait is, a donor might be unable to acquire the critical section and start blocking writes within the critical section timeout, or might acquire it so late that the recipients cannot finish fetching and applying oplog entries before the critical section times out.

      To work around this, we should make each resharding donor abort in-progress unprepared transactions upon entering the "preparing-to-block-writes" state, similar to what the setFCV command does. This is best-effort only: prepared transactions cannot be aborted because, by design, a participant in a cross-shard transaction cannot unilaterally decide to abort (or commit) a transaction.
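
The proposed behavior can be sketched with a toy model. This is not server code: `Donor`, `Txn`, `TxnState`, and `on_preparing_to_block_writes` are hypothetical names invented for illustration, and the model assumes that any in-progress or prepared transaction on the collection blocks the lock acquisition.

```python
from dataclasses import dataclass, field
from enum import Enum


class TxnState(Enum):
    IN_PROGRESS = "in-progress"  # unprepared, so the donor may abort it locally
    PREPARED = "prepared"        # outcome belongs to the transaction coordinator
    ABORTED = "aborted"


@dataclass
class Txn:
    txn_id: int
    namespace: str
    state: TxnState = TxnState.IN_PROGRESS


@dataclass
class Donor:
    """Hypothetical stand-in for a resharding donor (not the real server class)."""
    namespace: str
    txns: list = field(default_factory=list)

    def abort_unprepared_transactions(self):
        # Best-effort: abort only unprepared transactions on the resharded collection.
        aborted = []
        for txn in self.txns:
            if txn.namespace == self.namespace and txn.state is TxnState.IN_PROGRESS:
                txn.state = TxnState.ABORTED
                aborted.append(txn.txn_id)
        return aborted

    def blockers(self):
        # Transactions that would still hold up the collection lock acquisition.
        return [
            t.txn_id
            for t in self.txns
            if t.namespace == self.namespace
            and t.state in (TxnState.IN_PROGRESS, TxnState.PREPARED)
        ]

    def on_preparing_to_block_writes(self):
        # Called when the donor enters the "preparing-to-block-writes" state.
        aborted = self.abort_unprepared_transactions()
        return aborted, self.blockers()


donor = Donor(
    "db.coll",
    [Txn(1, "db.coll"), Txn(2, "db.coll", TxnState.PREPARED), Txn(3, "db.other")],
)
aborted, remaining = donor.on_preparing_to_block_writes()
print(aborted, remaining)  # [1] [2]
```

In this model the prepared transaction (id 2) still blocks the donor after the abort pass, which is why the change only lowers the chance of a critical section timeout rather than eliminating it.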

            Assignee:
            Cheahuychou Mao
            Reporter:
            Cheahuychou Mao