  Core Server / SERVER-49044

Make AsyncRequestSender not retry remote command requests with startTransaction=true

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.4.1, 4.7.0
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.4
    • Sprint: Sharding 2020-07-13, Sharding 2020-07-27
    • 26

      The AsyncRequestSender can retry a remote command request on retriable errors up to kMaxNumFailedHostRetryAttempts times. Based on BF-17754, it is not safe for the ARS to retry remote commands that are run inside transactions since it can lead to a crash in the following case:

      1. There is an unprepared transaction that gets aborted by the RstlKillOpThread on stepdown and the command fails with InterruptedDueToReplStateChange.
      2. The ARS retries the remote command request for that transaction statement (with startTransaction: true) against the new primary and the transaction gets committed with two-phase commit.
      3. The transaction state in the old primary’s TransactionParticipant is still AbortedWithoutPrepare when the oplog entry for prepare gets replicated to the old primary, so the transaction state assertion fails and the op applier hits a fatal assertion.

      So the ARS should check whether the request has startTransaction=true and, if it does, not retry it. There is already a rough CR patch for testing this case.
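
      The proposed check can be sketched roughly as follows. This is only an illustration in Python, not the server's C++ code: RemoteRequest, should_retry, the retriable error set, and the retry limit of 3 are all assumptions made for the example. Only the startTransaction field, the InterruptedDueToReplStateChange error, and the kMaxNumFailedHostRetryAttempts name come from the description above.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration; the real logic lives in the
# server's C++ AsyncRequestSender and may be structured differently.
K_MAX_NUM_FAILED_HOST_RETRY_ATTEMPTS = 3  # assumed value, not taken from the ticket
RETRIABLE_ERRORS = {"InterruptedDueToReplStateChange", "HostUnreachable"}  # illustrative subset


@dataclass
class RemoteRequest:
    cmd: dict             # command document that would be sent to the shard
    retry_count: int = 0  # number of failed attempts so far


def should_retry(request: RemoteRequest, error_code_name: str) -> bool:
    """Decide whether a failed remote command request may be sent again."""
    # Give up once the retry budget is exhausted.
    if request.retry_count >= K_MAX_NUM_FAILED_HOST_RETRY_ATTEMPTS:
        return False
    # Only retriable errors are candidates for a retry at all.
    if error_code_name not in RETRIABLE_ERRORS:
        return False
    # The guard proposed in this ticket: a statement that starts a
    # transaction must not be resent, because the retry could start (and
    # later commit) the transaction on the new primary while the old
    # primary still holds it as aborted.
    if request.cmd.get("startTransaction") is True:
        return False
    return True
```

      With this guard, a find that fails with InterruptedDueToReplStateChange is still retried when it runs outside a transaction, but the same command carrying startTransaction: true is returned to the caller as a failure instead.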

       

            Assignee:
            luis.osta@mongodb.com Luis Osta (Inactive)
            Reporter:
            cheahuychou.mao@mongodb.com Cheahuychou Mao
            Votes:
            0
            Watchers:
            3
