  1. Drivers
  2. DRIVERS-2169

Transaction test expectation contrary to retryable write requirements

    • Type: Spec Change
    • Resolution: Unresolved
    • Priority: Major - P3
    • Component/s: Transactions
    • Needed

      The "commitTransaction retry fails on new mongos" test in https://github.com/mongodb/specifications/blob/master/source/transactions/tests/mongos-recovery-token.yml#L177 performs two commitTransaction operations: the first fails with a socket error injected via a fail point, and the second fails with an OperationFailure. The test expects the second failure to carry a transient error label. As far as I can tell, this contradicts the retryable writes spec requirements in https://github.com/mongodb/specifications/blob/master/source/retryable-writes/retryable-writes.rst#executing-retryable-write-commands, which require that when the retry fails with an error that is neither a socket error nor a "not master" error, the original failure is re-raised instead of the retry failure. Specifically, the following bit of pseudocode:

       } catch (DriverException ignoredError) {
         throw originalError;
       }
      

      In the commit test in question, the original error has the UnknownTransactionCommitResult label and the retried error has the TransientTransactionError label. The test as currently written expects the error to have the transient label, which implicitly requires that the retry error is propagated. In Ruby, however, the test produces an exception with the UnknownTransactionCommitResult label, indicating that the original error was propagated.
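
      To make the disagreement concrete, here is a minimal Python sketch of the two propagation policies. It is not taken from any driver: the classes `DriverError`, `SocketError` and `OperationFailure`, and the helpers `run_with_single_retry`, `first_commit`, `second_commit` and `labels_seen`, are hypothetical stand-ins. Labels are attached when the errors are constructed, which simplifies how real drivers assign them.

```python
# Hypothetical stand-ins for illustration; these are not real driver classes.
class DriverError(Exception):
    def __init__(self, message, labels=()):
        super().__init__(message)
        self.labels = frozenset(labels)


class SocketError(DriverError):
    """Network failure; the only error this sketch retries on."""


class OperationFailure(DriverError):
    """Server-reported command failure."""


def run_with_single_retry(first, second, raise_original):
    """Run `first`; if it raises SocketError, run `second` exactly once.

    raise_original=True follows the quoted pseudocode: when the retry
    fails, the error from the first attempt is re-raised.
    raise_original=False propagates the error from the retry instead.
    """
    try:
        return first()
    except SocketError as original_error:
        try:
            return second()
        except DriverError as retry_error:
            if raise_original:
                raise original_error
            raise retry_error


def first_commit():
    # Mirrors the first error reported in this ticket.
    raise SocketError("EOFError: end of file reached",
                      ["UnknownTransactionCommitResult"])


def second_commit():
    # Mirrors the second error reported in this ticket.
    raise OperationFailure(
        "Recovering the transaction's outcome found the transaction aborted (251)",
        ["TransientTransactionError"])


def labels_seen(raise_original):
    """Return the labels on whichever error reaches the caller."""
    try:
        run_with_single_retry(first_commit, second_commit, raise_original)
    except DriverError as error:
        return sorted(error.labels)
    return []


print(labels_seen(raise_original=True))
print(labels_seen(raise_original=False))
```

      Under these assumptions, re-raising the original error surfaces the unknown-result label (what Ruby produces), while propagating the retry error surfaces the transient label (what the test expects).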

      Looking at the Python driver, which is the reference implementation for transactions, my impression is that it always raises the last OperationFailure encountered when retrying writes, and hence would raise the last error, contrary to the pseudocode quoted above. The relevant spec change for raising the original error is https://github.com/mongodb/specifications/commit/84f0fb9043e7bf2b04e74c9072f56013a97a5073

      Since error details were requested in Slack, I am providing them below.

      first error:

      [#<Mongo::Error::SocketError: EOFError: end of file reached (for 127.0.0.1:27571 (no TLS))>, ["UnknownTransactionCommitResult"]]

      second error:

      [#<Mongo::Error::OperationFailure: Recovering the transaction's outcome found the transaction aborted (251)>, ["TransientTransactionError"]]

            Assignee: Unassigned
            Reporter: Oleg Pudeyev (Inactive), oleg.pudeyev@mongodb.com