  1. Core Server
  2. SERVER-37881

Coordinator should time out waiting for prepare responses and decide to abort

    • Fully Compatible
    • Sharding 2018-12-31, Sharding 2019-01-14, Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25, Sharding 2019-04-08

      There are three phases in the lifetime of a TransactionCoordinator object:

      1. Created, but the coordinateCommit command has not yet been received
      2. Prepare was sent, but a decision has not yet been made, because votes have not yet been received from all participants
      3. Decision was made and sent to participants, but confirmation has not yet been received from all of them

      Phases 1 and 2 can be cancelled (timed out), but phase 3 cannot. This ticket introduces an upper bound on how long phases 1 and 2 can take before the coordinator unilaterally decides to abort.
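
      The phase rule above can be sketched as a small state model. This is an illustrative sketch only, not the server's implementation: the enum, its member names, and can_time_out are hypothetical names invented for this example.

```python
from enum import Enum


class CoordinatorPhase(Enum):
    """Hypothetical names for the three phases described in this ticket."""

    WAITING_FOR_COMMIT = 1         # created, coordinateCommit not yet received
    WAITING_FOR_VOTES = 2          # prepare sent, votes still outstanding
    WAITING_FOR_DECISION_ACKS = 3  # decision sent, confirmations outstanding


# Phases in which the coordinator may still unilaterally decide to abort.
CANCELLABLE_PHASES = frozenset(
    {CoordinatorPhase.WAITING_FOR_COMMIT, CoordinatorPhase.WAITING_FOR_VOTES}
)


def can_time_out(phase: CoordinatorPhase) -> bool:
    """Return True if a timeout in this phase may lead to an abort decision."""
    return phase in CANCELLABLE_PHASES
```

      Once a decision has been made (phase 3), the sketch never reports the coordinator as cancellable, which mirrors the rule that phase 3 cannot time out.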

      The upper bound for phases 1 and 2 combined will be the value of the transactionLifetimeLimitSeconds parameter (which defaults to 1 minute), measured from the start of the transaction. This means that if the coordinateCommit command has not been received, or a decision has not been made, within transactionLifetimeLimitSeconds of the transaction starting, the transaction will abort.
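
      The deadline check described above might look like the following sketch. The function and parameter names are hypothetical, times are plain seconds rather than server clock types, and treating the exact deadline instant as expired (>=) is an assumption the ticket does not settle.

```python
# Default taken from the ticket text: transactionLifetimeLimitSeconds is 1 minute.
DEFAULT_LIFETIME_LIMIT_S = 60.0


def abort_deadline(txn_start_s: float,
                   lifetime_limit_s: float = DEFAULT_LIFETIME_LIMIT_S) -> float:
    """Point in time after which an undecided transaction is aborted."""
    return txn_start_s + lifetime_limit_s


def must_abort(now_s: float,
               txn_start_s: float,
               decision_made: bool,
               lifetime_limit_s: float = DEFAULT_LIFETIME_LIMIT_S) -> bool:
    """Return True if the coordinator should unilaterally decide to abort.

    Only phases 1 and 2 (no decision yet) are subject to the deadline.
    Assumption: reaching the deadline exactly counts as expired.
    """
    if decision_made:
        return False
    return now_s >= abort_deadline(txn_start_s, lifetime_limit_s)
```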

      If a coordinateCommit command is received with a maxTimeMS greater than the time remaining out of transactionLifetimeLimitSeconds since the transaction started, the effective maxTimeMS of the command will be that remaining time.
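
      The clamping rule can be written as a short calculation. This is a sketch under stated assumptions: effective_max_time_ms and its parameters are hypothetical names, and using the remaining lifetime when no maxTimeMS is supplied (None) is an assumption not stated in the ticket.

```python
from typing import Optional


def effective_max_time_ms(requested_max_time_ms: Optional[int],
                          elapsed_since_start_ms: int,
                          lifetime_limit_s: int = 60) -> int:
    """Clamp a coordinateCommit maxTimeMS to the remaining transaction lifetime.

    requested_max_time_ms:  maxTimeMS on the command; None means not supplied
                            (assumption: fall back to the remaining lifetime).
    elapsed_since_start_ms: time since the transaction started.
    lifetime_limit_s:       transactionLifetimeLimitSeconds (60 by default).
    """
    # Remaining lifetime never goes below zero, even if already expired.
    remaining_ms = max(0, lifetime_limit_s * 1000 - elapsed_since_start_ms)
    if requested_max_time_ms is None:
        return remaining_ms
    return min(requested_max_time_ms, remaining_ms)
```

      For example, with the default limit, a coordinateCommit carrying maxTimeMS 30000 that arrives 45 seconds into the transaction gets an effective maxTimeMS of 15000, while one carrying maxTimeMS 5000 keeps its own value.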

            kaloian.manassiev@mongodb.com Kaloian Manassiev
            matthew.saltz@mongodb.com Matthew Saltz (Inactive)