Shards acting as routers do not append transaction participants to error response when primary is force killed

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      BF-40166 saw the following sequence of events (a simplified code sketch follows the list):

      1. rs0:n0 is acting as a router: it gets a start txn request, retries it, and adds rs1 as a participant every time, with snapshot readConcern set at cluster time {{ {"t":1761295748,"i":107} }}.

      2. The continuousStepdown hook force kills the primary on rs0 (n0), and n1 is elected as the new primary.
      3. When rs0:n0 is force killed, it does not populate the full error response here, which would have appended the transaction participants to the response to mongos, so rs1 is not cleared/aborted and its snapshot read timestamp is still {{ {"t":1761295748,"i":107} }}.

      4. mongos0 now retries the txn, this time on rs0:n1, and it hits this retry function due to a stale config error (we retry on transient config errors). n1 is not aware of rs1 as a pending participant, so it doesn't clear it, and it resets the snapshot read time to {{ {"$timestamp":{"t":1761295748,"i":195}} }}.

      5. Now it tries to add rs1 as a participant again, and attaches the new {{ {"$timestamp":{"t":1761295748,"i":195}} }} value. This then conflicts with the stashed txnResource snapshot read time on rs1, which is still {{ {"t":1761295748,"i":107} }}, and this is a non-retryable error, so we abort the txn and fail the test.
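
      To make this concrete, the following is a minimal, self-contained C++ simulation of the sequence above. Every name in it (Shard, RouterNode, addParticipant, and so on) is a hypothetical stand-in rather than a real server type; it only models why losing the router's in-memory participant list on a force kill ends in a snapshot read time conflict.

{code:cpp}
// Hypothetical model of the BF-40166 sequence; not real server code.
#include <cstdint>
#include <iostream>
#include <map>
#include <set>
#include <stdexcept>

struct Timestamp {
    uint32_t t;
    uint32_t i;
    bool operator==(const Timestamp& o) const { return t == o.t && i == o.i; }
};

// A participant shard: stashed transaction resources pin the snapshot read
// timestamp for the life of the transaction.
struct Shard {
    std::map<int64_t, Timestamp> stashedReadTimestamp;  // txnNumber -> atClusterTime

    void addParticipant(int64_t txnNumber, Timestamp atClusterTime) {
        auto it = stashedReadTimestamp.find(txnNumber);
        if (it != stashedReadTimestamp.end() && !(it->second == atClusterTime)) {
            throw std::runtime_error(
                "snapshot read time conflicts with stashed txnResources (non-retryable)");
        }
        stashedReadTimestamp.emplace(txnNumber, atClusterTime);
    }

    void abortTxn(int64_t txnNumber) { stashedReadTimestamp.erase(txnNumber); }
};

// A shard node acting as a router: its pending participants live only in
// memory, so a force kill loses them unless the error response echoes them.
struct RouterNode {
    std::set<Shard*> pendingParticipants;
};

int main() {
    Shard rs1;
    const int64_t txnNumber = 1;

    // Step 1: rs0:n0 routes the txn, adding rs1 at {t:1761295748, i:107}.
    RouterNode n0;
    rs1.addParticipant(txnNumber, Timestamp{1761295748, 107});
    n0.pendingParticipants.insert(&rs1);

    // Steps 2-3: n0 is force killed. The error response to mongos carries no
    // participant list, so nothing calls rs1.abortTxn(txnNumber); rs1 keeps
    // its stashed read time of i:107.

    // Step 4: mongos retries on the new primary n1, which has no record of
    // rs1 and picks a fresh snapshot read time {t:1761295748, i:195}.
    RouterNode n1;

    // Step 5: n1 re-adds rs1 with the new read time, which conflicts with
    // the stashed i:107 value; the txn aborts and the test fails.
    try {
        n1.pendingParticipants.insert(&rs1);
        rs1.addParticipant(txnNumber, Timestamp{1761295748, 195});
    } catch (const std::exception& e) {
        std::cout << "txn aborted: " << e.what() << "\n";
    }
    return 0;
}
{code}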

      We never ran into this previously because we weren't retrying at the mongos level if the statement was a startTransaction, but since that check was removed in SERVER-88289 we will need to account for this scenario. SERVER-88289 removed the check because session/txn invalidation was improved to the point that a txn request could safely be retried on a new primary, but this case evades that cleanup because the force kill follows a different error handling code path. The sketch below illustrates the gate in question.
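
      This is a hypothetical illustration only; these are not the real server symbols, just the shape of the pre- and post-SERVER-88289 behavior as described above.

{code:cpp}
#include <iostream>

// Hypothetical mongos-level retry gate for transient txn errors.
bool shouldRetryTransientTxnError(bool isStartTransaction) {
    // Pre-SERVER-88289: never retry the statement that starts the txn,
    // which incidentally masked the stale-participant problem:
    // return !isStartTransaction;

    // Post-SERVER-88289: always retry, relying on session/txn invalidation
    // to have cleared old participants -- an assumption a force kill
    // violates because it takes a different error handling path.
    (void)isStartTransaction;
    return true;
}

int main() {
    // A retried startTransaction is now allowed through the gate.
    std::cout << shouldRetryTransientTxnError(true) << "\n";  // prints 1
}
{code}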

      One possible solution could be appending the pending participants to the error response to mongos on stepdown. I'm trying to repro this part of the failure so we can compare the error responses for stepdown against the errors that go through CheckoutSessionAndInvokeCommand::_tapError. A rough sketch of the shape that fix could take follows.
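
      In this sketch, all names are made up for illustration, including makeStepdownErrorResponse and the additionalParticipants field:

{code:cpp}
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical error response that echoes the router's pending participants
// back to mongos so it can abort them before retrying.
struct ErrorResponse {
    int code;
    std::string errmsg;
    std::vector<std::string> additionalParticipants;
};

ErrorResponse makeStepdownErrorResponse(std::vector<std::string> pendingParticipants) {
    ErrorResponse resp;
    resp.code = 11602;  // InterruptedDueToReplStateChange
    resp.errmsg = "operation was interrupted";
    // The proposed change: append the in-memory participant list here so it
    // survives the force kill's error path.
    resp.additionalParticipants = std::move(pendingParticipants);
    return resp;
}

int main() {
    // In the BF-40166 sequence, n0 would report rs1 here, letting mongos
    // abort it (clearing the stashed i:107 read time) before the retry
    // picks a new atClusterTime.
    ErrorResponse resp = makeStepdownErrorResponse({"rs1"});
    for (const auto& shardId : resp.additionalParticipants)
        std::cout << "abort participant: " << shardId << "\n";
}
{code}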

            Assignee:
            Unassigned
            Reporter:
            Ruchitha Rajaghatta
            Votes:
            0
            Watchers:
            2