SERVER-62671 (Core Server)

[Retryability] Handle committing and aborting prepared retryable internal transaction after failover

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.3.0
    • Fully Compatible
    • ALL
    • Sprint: Sharding 2022-01-24, Sharding 2022-02-07

    Description

      There are two known bugs related to resuming prepared retryable internal transactions after failover:

      1. When a new primary steps up, it resumes all prepared transactions without doing a refresh. For a retryable internal transaction, the TransactionParticipant therefore has an empty p().activeTxnCommittedStatements after the transaction commits. That map is populated by onPreparedTransactionCommit(), which doesn’t run on secondaries, and secondaries don’t call addTransactionOperation() while applying the applyOps oplog entries for transactions. As a result, any retry, with or without internal transactions, re-executes the write statements that were executed in that transaction.
      2. When a new primary steps up and there is a prepared retryable internal transaction, the node hangs in the step for refreshing the locks. When it checks out the internal session, it tries to refresh the parent session (new behavior introduced in SERVER-62020) and hangs: with this side opCtx it cannot acquire the global IS lock, because the main opCtx is holding the RSTL lock.
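
The first bug can be illustrated with a small toy model. This is only a sketch under stated assumptions, not MongoDB server code: ToyNode, its fields, and refresh_from_oplog() are hypothetical names invented for the illustration, and the "refresh" shown is just one conceivable way to rebuild the retry-tracking map, not necessarily the fix that was committed. The model assumes that committed data comes from the durable oplog, while the map consulted on retry is built only from operations the node tracked in memory while executing them.

```python
class ToyNode:
    """Toy replica set member. All names are hypothetical, not MongoDB APIs."""

    def __init__(self, oplog):
        self.oplog = oplog          # shared, durable list of (stmt_id, increment)
        self.in_memory_ops = []     # tracked only by the node that ran the statements
        self.committed_stmts = {}   # stmt_id -> increment, consulted on retry
        self.counter = 0            # the data the write statements modify

    def execute_and_prepare(self, stmts):
        # Original primary: remember each statement in memory and log it durably.
        for stmt_id, inc in stmts:
            self.in_memory_ops.append((stmt_id, inc))
            self.oplog.append((stmt_id, inc))

    def commit_prepared(self):
        # The data change is derived from the durable entries...
        for _, inc in self.oplog:
            self.counter += inc
        # ...but the retry map is filled only from in-memory state, which a
        # node that merely applied the oplog as a secondary never built up.
        for stmt_id, inc in self.in_memory_ops:
            self.committed_stmts[stmt_id] = inc

    def refresh_from_oplog(self):
        # Hypothetical remedy: rebuild the retry map from durable history.
        for stmt_id, inc in self.oplog:
            self.committed_stmts[stmt_id] = inc

    def retry(self, stmt_id, inc):
        if stmt_id in self.committed_stmts:
            return "skipped"
        self.counter += inc         # the statement runs a second time
        return "re-executed"


def run_scenario(refresh):
    oplog = []
    old_primary = ToyNode(oplog)
    old_primary.execute_and_prepare([(0, 5)])   # prepared, then failover
    new_primary = ToyNode(oplog)                # stepped-up former secondary
    new_primary.commit_prepared()
    if refresh:
        new_primary.refresh_from_oplog()
    return new_primary.retry(0, 5), new_primary.counter
```

In this model, run_scenario(False) re-executes statement 0 and applies the increment twice, while run_scenario(True) recognizes the statement as already committed and skips it.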

          People

            Assignee: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
            Reporter: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
            Votes: 0
            Watchers: 4
