  1. Core Server
  2. SERVER-40186

The logic in `auto_retry_transaction.js:withTxnAndAutoRetry` does not retry failed commits

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.11
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Sharding 2019-03-25, Sharding 2019-04-08, Sharding 2019-04-22, Sharding 2019-05-06
    • Linked BF Score:
      5

      Description

      The multi_statement_transaction_kill_sessions_atomicity_isolation.js concurrency workload executes ordered updates in transactions using snapshot isolation, periodically kills random sessions, and finally validates that the transactions still committed in the correct order.

      Enabling this workload against a sharded cluster leads to failures which appear as if transactions committed out of order:

               Error: [[ ]] != [[
                {
                        "tid" : 9,
                        "iteration" : 14,
                        "numUpdated" : 2
                },
                {
                        "tid" : 8,
                        "iteration" : 6,
                        "numUpdated" : 3
                },
                {
                        "tid" : 3,
                        "iteration" : 4,
                        "numUpdated" : 5
                },
                {
                        "tid" : 9,
                        "iteration" : 14,
                        "numUpdated" : 2
                }
      

      These failures are not caused by a server bug. Interrupting a session that is running two-phase commit on mongos may still result in the transaction committing. Because the test then retries the entire transaction with exactly the same parameters, the transaction ends up committing twice.
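
The failure mode described above can be reproduced in isolation with a small simulation. This is only a sketch: `FakeCluster` and `runWithWholeTxnRetry` are hypothetical stand-ins invented for illustration, not MongoDB or test-suite APIs. The assumption is that the commit becomes durable even though the client observes an interruption error.

```javascript
// Hypothetical in-memory stand-in for a sharded cluster; not a real MongoDB API.
class FakeCluster {
    constructor() {
        this.committed = [];  // Order in which transactions became durable.
        this.interruptNextCommit = false;
    }

    // Simulates a commit through mongos: the transaction becomes durable, but the
    // client may still see an "Interrupted" error if its session was killed while
    // two-phase commit was in flight (assumed behavior for this sketch).
    commit(txn) {
        this.committed.push(txn);
        if (this.interruptNextCommit) {
            this.interruptNextCommit = false;
            const err = new Error("operation was interrupted");
            err.codeName = "Interrupted";
            throw err;
        }
    }
}

// Naive retry: on interruption, rerun the whole transaction with the same parameters.
function runWithWholeTxnRetry(cluster, txn) {
    for (;;) {
        try {
            cluster.commit(txn);
            return;
        } catch (e) {
            if (e.codeName !== "Interrupted") {
                throw e;
            }
        }
    }
}

const cluster = new FakeCluster();
cluster.interruptNextCommit = true;
runWithWholeTxnRetry(cluster, {tid: 9, iteration: 14, numUpdated: 2});

// The same transaction is recorded twice.
console.log(JSON.stringify(cluster.committed));
```

In this simulation the same transaction appears twice in the committed list, which resembles the repeated `tid` 9, `iteration` 14 entry in the error output above.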

      Proposed fix

      The fix would be to make withTxnAndAutoRetry retry just the commit if it fails, similar to what the drivers spec requires, namely:

      commitTransaction is a retryable write command. Drivers MUST retry once after commitTransaction fails with a retryable error according to the Retryable Writes Specification, regardless of whether retryWrites is set on the MongoClient or not.
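
A minimal sketch of the commit-only retry idea follows. It is not the actual change made to auto_retry_transaction.js: the helper name `withTxnRetryingOnlyCommit`, the caller-supplied `isRetryableCommitError` predicate, and the stub session are all assumptions made for illustration. The session only needs to expose `startTransaction()`, `commitTransaction()` and `abortTransaction()`.

```javascript
// Hypothetical helper, not the real withTxnAndAutoRetry implementation.
// `isRetryableCommitError` is an assumed predicate supplied by the caller.
function withTxnRetryingOnlyCommit(session, func, isRetryableCommitError) {
    session.startTransaction();
    try {
        func();
    } catch (e) {
        // A failure in the body means nothing was committed, so aborting is safe.
        session.abortTransaction();
        throw e;
    }

    try {
        session.commitTransaction();
    } catch (e) {
        if (!isRetryableCommitError(e)) {
            throw e;
        }
        // Retry once, re-sending only the commit. The transaction body is not
        // rerun, so its writes cannot be applied a second time.
        session.commitTransaction();
    }
}

// Stub session (illustrative only): the first commit attempt reports an interruption.
const stubSession = {
    commitAttempts: 0,
    aborted: false,
    startTransaction() {},
    abortTransaction() {
        this.aborted = true;
    },
    commitTransaction() {
        this.commitAttempts++;
        if (this.commitAttempts === 1) {
            const err = new Error("operation was interrupted");
            err.codeName = "Interrupted";
            throw err;
        }
    },
};

let bodyRuns = 0;
withTxnRetryingOnlyCommit(
    stubSession,
    () => { bodyRuns++; },
    (e) => e.codeName === "Interrupted");

console.log(`body runs: ${bodyRuns}, commit attempts: ${stubSession.commitAttempts}`);
```

The point of the design is that the transaction body runs exactly once while the commit may be sent twice, so a commit that succeeded despite the interruption is not followed by a second copy of the same updates.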

              People

              Assignee:
              jack.mulrow Jack Mulrow
              Reporter:
              kaloian.manassiev Kaloian Manassiev