Core Server / SERVER-39726

Recovering the state of an uncommitted transaction should not block

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 4.1.8
    • Component/s: None
    • Labels:
    • ALL
    • Sprint: Sharding 2019-04-08, Sharding 2019-04-22, Sharding 2019-05-06

      SERVER-37344's ticket description says:

      a shard that receives 'recoverTransaction' returns NoSuchTransaction if the shard does not know about the transaction. otherwise, if the decision has been made, returns the decision; *if the decision has not been made, decides to abort.*
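
To make the three cases in that description concrete, here is a minimal Python sketch of the shard-side logic. It is not the server's code: `RecoveryShard`, `Decision`, and the in-memory dictionary are hypothetical stand-ins for illustration, and only the `NoSuchTransaction` name and the three outcomes come from the quoted description.

```python
from enum import Enum


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class NoSuchTransaction(Exception):
    """Raised when the shard has no record of the transaction."""


class RecoveryShard:
    """Hypothetical in-memory stand-in for a shard's transaction state."""

    def __init__(self):
        # txn_id -> Decision, or None while no decision has been made.
        self._decisions = {}

    def begin(self, txn_id):
        self._decisions[txn_id] = None

    def decide(self, txn_id, decision):
        self._decisions[txn_id] = decision

    def recover_transaction(self, txn_id):
        # Case 1: the shard does not know about the transaction.
        if txn_id not in self._decisions:
            raise NoSuchTransaction(txn_id)
        # Case 3: no decision has been made, so decide to abort
        # rather than wait for one.
        if self._decisions[txn_id] is None:
            self._decisions[txn_id] = Decision.ABORT
        # Case 2 (and the tail of case 3): return the decision.
        return self._decisions[txn_id]
```

In this sketch, recovering an undecided transaction returns `Decision.ABORT` immediately, which is the behavior the emphasized clause asks for.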

      And the server design also says:

      If the client is unable to reach the original router after having attempted to send commitTransaction to the original router, the client can send commitTransaction to a different router.
      Doing so will never initiate committing the transaction. *Instead, the recovery token in the request will be used to try to abort the transaction if a decision to commit has not already been made*, otherwise to recover the transaction's outcome.

      However the implementation of SERVER-37344 says:

      commit recovery is best effort. If coordinateCommit was never sent to the coordinator, the recovery commit will timeout waiting for it.

      So I think the current implementation is incomplete. The abort optimization is important because it prevents applications from blocking for 60 seconds (or transactionLifetimeLimitSeconds) when the original commit attempt is lost.
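
A rough Python sketch of the difference, under stated assumptions: the `Coordinator` class and its method names are hypothetical and are not the server's API, and the wait is shortened from 60 seconds to a fraction of a second. `recover_by_waiting` models the best-effort behavior from the SERVER-37344 implementation note; `recover_by_aborting` models the behavior the ticket description and design text call for.

```python
import threading


class Coordinator:
    """Hypothetical coordinator holding one transaction's decision."""

    def __init__(self):
        self._decision = None
        self._decided = threading.Event()
        self._lock = threading.Lock()

    def coordinate_commit(self):
        # The original commit attempt. Assumption of this sketch: if an
        # abort was already decided, the commit attempt observes it.
        with self._lock:
            if self._decision is None:
                self._decision = "commit"
                self._decided.set()
            return self._decision

    def recover_by_waiting(self, timeout_s):
        # Best-effort recovery: block until a decision exists or the
        # wait times out.
        if not self._decided.wait(timeout_s):
            raise TimeoutError("no decision before timeout")
        return self._decision

    def recover_by_aborting(self):
        # Recovery with the abort optimization: never block. If no
        # decision exists yet, decide to abort.
        with self._lock:
            if self._decision is None:
                self._decision = "abort"
                self._decided.set()
            return self._decision
```

If `coordinate_commit` is never called (the lost-commit case), `recover_by_waiting` only returns control by raising after the full timeout, while `recover_by_aborting` returns `"abort"` right away.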

      CC: renctan.

            randolph@mongodb.com Randolph Tan
            shane.harvey@mongodb.com Shane Harvey