Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-42699

Server should perform majority write concern no-op write before returning TransientTransactionError label on commitTransaction cmd.

    • Type: Icon: Task Task
    • Resolution: Won't Fix
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Replication
    • Labels:
      None
    • Repl 2019-12-30

      For replica set transaction, we can get into double committing problem for below scenario (see SPEC-1185).

      1) The driver is connected to a replica set where node A is primary.
      2) The driver sends commitTransaction to A with w:1. A commits the transaction but dies before it can reply. This constitutes a retryable error, which means the driver can retry the commitTransaction command.
      3) Node B is briefly elected.
      4) The driver retries commitTransaction on B, and B replies with an error label TransientTransactionError (TTE). This implies that the driver may retry the entire transaction. (To be noted, driver always retry the commitTransaction with {w: majority}).
      5) Node A revives before B has done any w:majority writes and gets reelected as primary. The driver then retries the entire transaction on A where it commits successfully. This leads to double committing problem.

      Currently TTE label is attached to the following error codes.

      1. ErrorCodes::NoSuchTransaction
      2. ErrorCodes::WriteConflict
      3. ErrorCodes::LockTimeout
      4. ErrorCodes::PreparedTransactionInProgress
      5.  Snapshot Error
      6. Retargetting Error - ***Doesn't apply to replica set transaction*** (Correct me if I am wrong).

      But, only for ErrorCodes::NoSuchTransaction, we perform no-op writes w/ {w: majority} and check the write concern error to attach TTE label.

      Its safe now not to do majority write concern no-op writes for other error codes which can generate TTE label. Because, we currently perform below sequence of steps for commitTransaction cmd (for unprepared transactions/ single replica sets).
      Seq #1  If session option doesn't contain "startTransaction", do _continueMultiDocumentTransaction() .
      /** TTE error codes from 2- 5 is possible only from here onwards ****/
      Seq #2 Else, execute  _beginMultiDocumentTransaction() - starts a new transaction with next higher txn number.
      Seq #3 Unstash the transaction lock resources.
      Seq #4 Abort the unprepared transaction if seq#3 fails. Else proceed to seq #5.
      Seq #5 Run the (commit) command. 
      Seq #6  If step 3 fails, then abort the unprepared transaction. Else, stash the transaction lock resources only if they are not committed or aborted.
       

      To be noted, commitTxn cmd can't contain 'startTransaction'. So, it should always go to _continueMultiDocumentTransaction().
      According to me, there can be only 2 cases
      Case #1 Node B has already committed the transaction via secondary oplog application. And, then the driver retries the commitTransaction on node B.
         - This won't even go to unstash step (seq #3) as there is nothing to unstash. So, it's impossible to return TTE error codes. ( I am assuming that the sessions are all single threaded).
      Case #2 Node B has not received/applied the commitTransaction oplog entry. And, then the driver retries the commitTransaction on node B.
         - This can only generate NoSuchTransaction Error code.

      The system won't be safe, if we get any TTE label error codes from 2 thru 5 (ErrorCodes::WriteConflict to Snapshot error) before seq #1 beginOrContinue() (i.e. _continueMultiDocumentTransaction()).  This means we should perform no-op writes {w:majority} and check for majority write concern error before attaching the TTE label for all TTE error codes. Performing no-op writes always has a performance cost. So, instead we should try to strengthen the contract of TTE label itself.

      Note: For Cross-shard transaction, the txn coordinator doesn't care about TTE label attached to the commitTransaction cmd response.

            Assignee:
            tess.avitabile@mongodb.com Tess Avitabile (Inactive)
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Created:
              Updated:
              Resolved: