Core Server / SERVER-48307

Transactions that write to exactly one shard and read from one or more other shards may incorrectly indicate failure on retry after successful commit

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.5.1, 4.2.7, 4.4.0-rc6
    • Fix Version/s: 4.2.8, 4.4.0-rc7, 4.7.0
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Backport Requested:
      v4.4, v4.2
    • Sprint:
      Sharding 2020-06-01

      Description

       

      Issue Status as of May 25, 2020

      ISSUE SUMMARY
      Distributed transactions which write to document(s) on exactly one shard and read document(s) from at least one other shard may execute more than once in the presence of a primary failover.

      This issue does not affect multi-document transactions that involve only a single shard or that write to multiple shards.

      When a client attempts to commit a multi-document transaction, the driver receives one of the following responses to the commitTransaction command:

      1. The transaction has definitively committed.
      2. The transaction has definitively aborted.
      3. It is unknown whether the transaction has committed or aborted.

      When the result is unknown (case 3), clients that use either the driver’s callback transactions API or the driver’s core transactions API automatically retry the commitTransaction command to learn the definitive result of the transaction.
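
      The three responses and the commit retry described above can be sketched as follows. This is a minimal illustration, not driver code: `CommitOutcome`, `commit_with_retry`, and the `send_commit` callable are hypothetical names introduced here, and the attempt limit is an arbitrary assumption.

```python
from enum import Enum
from typing import Callable


class CommitOutcome(Enum):
    """The three possible responses to commitTransaction (hypothetical names)."""
    COMMITTED = "committed"  # 1. definitively committed
    ABORTED = "aborted"      # 2. definitively aborted
    UNKNOWN = "unknown"      # 3. unknown whether committed or aborted


def commit_with_retry(
    send_commit: Callable[[], CommitOutcome],
    max_attempts: int = 5,  # assumption: a real client bounds retries some other way
) -> CommitOutcome:
    """Resend commitTransaction until the outcome is definitive or attempts run out."""
    outcome = CommitOutcome.UNKNOWN
    for _ in range(max_attempts):
        outcome = send_commit()
        if outcome is not CommitOutcome.UNKNOWN:
            return outcome  # definitive answer: stop retrying
    return outcome  # still unknown after max_attempts


# Usage: the first reply is lost (unknown), the retry learns the real result.
replies = iter([CommitOutcome.UNKNOWN, CommitOutcome.COMMITTED])
print(commit_with_retry(lambda: next(replies)).name)
```

      The retry only resends the commit; it does not rerun the transaction body. That is why a wrong "aborted" answer on the retry is harmful: it is the only signal that tells the client to start over.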

      Due to a bug in the router, the driver may be wrongly told “the transaction has definitively aborted and its operations should be automatically retried in a new transaction” when the transaction has in fact committed successfully. This bug manifests when two conditions hold: the commitTransaction command must be retried to learn the definitive result of the transaction, and a primary failover has occurred in the intervening time on one of the shards from which document(s) were only read.
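
      The effect of the wrong "aborted" answer can be illustrated with a toy model. Everything in it is an assumption made for illustration: `ToyCluster`, its fields, and `run_once_intended` are hypothetical, the transaction is reduced to a single counter increment on the write shard, and the router's real internal logic is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Toy model: a counter on the write shard plus one read-only shard."""
    router_has_bug: bool = True
    counter: int = 0                      # document on the write shard
    committed: bool = False               # has the current transaction committed?
    read_shard_failed_over: bool = False  # failover on the read-only shard
    lose_first_reply: bool = True         # first commit reply never reaches the client

    def begin(self) -> None:
        self.committed = False  # start a new transaction

    def commit(self) -> str:
        """Return 'committed', 'aborted', or 'unknown'."""
        if not self.committed:
            self.counter += 1  # the write shard applies the increment
            self.committed = True
            if self.lose_first_reply:
                self.lose_first_reply = False
                self.read_shard_failed_over = True  # failover while the client waits
                return "unknown"
            return "committed"
        # Retry of commitTransaction for a transaction that already committed.
        if self.router_has_bug and self.read_shard_failed_over:
            self.read_shard_failed_over = False
            return "aborted"  # the bug: a committed transaction is reported as aborted
        return "committed"


def run_once_intended(cluster: ToyCluster, max_txn_attempts: int = 3) -> int:
    """Callback-style client: rerun the whole transaction after an abort.

    Returns how many times the transaction body ran.
    """
    runs = 0
    for _ in range(max_txn_attempts):
        cluster.begin()
        runs += 1
        outcome = cluster.commit()
        while outcome == "unknown":
            outcome = cluster.commit()  # retry only the commit
        if outcome == "committed":
            break
    return runs


buggy, fixed = ToyCluster(router_has_bug=True), ToyCluster(router_has_bug=False)
print(run_once_intended(buggy), buggy.counter)
print(run_once_intended(fixed), fixed.counter)
```

      In this toy model the increment is applied twice when the bug flag is set and once when it is not, which mirrors the "may execute more than once" impact described in this ticket.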

      USER IMPACT
      Distributed transactions which write to document(s) on exactly one shard and read document(s) from at least one other shard may execute more than once.

      AFFECTED VERSIONS
      This affects 4.2.7 and earlier versions of 4.2.

      FIX VERSION
      The fix will be included in 4.4.0 and 4.2.8.

              People

              Assignee:
              Esha Maharishi (esha.maharishi)
              Reporter:
              Esha Maharishi (esha.maharishi)
              Participants:
              Votes:
              0
              Watchers:
              21
