[DRIVERS-2104] The transaction spec says to unpin after MaxTimeMSExpired on commit but there's a test for the opposite behavior Created: 12/Jun/19  Updated: 31/Mar/22

Status: Backlog
Project: Drivers
Component/s: Transactions
Fix Version/s: None

Type: Spec Change Priority: Major - P3
Reporter: Shane Harvey Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by CDRIVER-3302 Unskip "remain pinned after non-trans... Blocked
is depended on by JAVA-3458 Unskip "remain pinned after non-trans... Closed
Related
Driver Changes: Needed

 Description   

The spec says:

Additionally, drivers MUST unpin a ClientSession when any individual commitTransaction command attempt fails with an UnknownTransactionCommitResult error label. In cases where the UnknownTransactionCommitResult causes an automatic retry attempt, drivers MUST unpin the ClientSession before performing server selection for the retry.

And MaxTimeMSExpired is defined as an UnknownTransactionCommitResult error. So a MaxTimeMSExpired error when running commitTransaction should unpin the session. However the "remain pinned after non-transient error on commit" test asserts the opposite behavior; that a session remains pinned after a MaxTimeMSExpired error on commit.



 Comments   
Comment by Oleg Pudeyev (Inactive) [ 27/Dec/19 ]

In https://jira.mongodb.org/browse/SPEC-1378, the test which used to assert that maxtimems error did not unpin was changed to use a different error (and still not unpin), because maxtimems error unpinned. I believe we are all in agreement that the new test is correct.

There is also a test for unpinning on a transient error. This test uses a network error (via fail point) to obtain the transient error.

vincent.kam proposed that a test be added for maxtimems operation failure, which would also unpin. Although in principle the same code path in the driver should handle an operation failure with transient label and a network error, for the purposes of unpinning, I would say the maxtimems test is meaningfully different from the integration standpoint and I support adding it.

For convenience of reference this is the test that Vincent proposed:

- description: unpin after UnknownTransactionCommitResult error on commit
    useMultipleMongoses: true
    operations:
      - name: startTransaction
        object: session0
      - name: insertMany
        object: collection
        arguments:
          documents:
            - _id: 3
            - _id: 4
          session: session0
        result:
          insertedIds: {0: 3, 1: 4}
      # Session is pinned.
      - *assertSessionPinned
      # Enable the fail point only on the Mongos that session0 is pinned to.
      # Fail isMaster to prevent the heartbeat requested directly after the
      # insert error from racing with server selection for the commit.
      # Note: times: 7 is slightly artbitrary but it accounts for one failed
      # insert and some SDAM heartbeats. A test runner will have multiple
      # clients connected to this server so this fail point configuration
      # is also racy.
      - name: targetedFailPoint
        object: testRunner
        arguments:
          session: session0
          failPoint:
            configureFailPoint: failCommand
            mode: { times: 7 }
            data:
              failCommands: ["isMaster"]
              closeConnection: true
      # Fail the commit with a MaxTimeMSExpired error.
      - name: targetedFailPoint
        object: testRunner
        arguments:
          session: session0
          failPoint:
            configureFailPoint: failCommand
            mode: { times: 1 }
            data:
              failCommands: ["commitTransaction"]
              errorCode: 50 # MaxTimeMSExpired 
      - name: commitTransaction
        object: session0
        result:
          errorLabelsOmit: ["TransientTransactionError"]
          errorLabelsContain: ["UnknownTransactionCommitResult"]
          errorCode: 50
      # Session unpins from the first mongos after the commit transaction error and
      # commitTransaction selects the second mongos which is unaware of the
      # transaction and therefore fails with NoSuchTransaction error. If this
      # commit succeeds it indicates a bug, either:
      # - the driver mistakenly remained pinned even after the insert error, or
      # - the test client was initialized with a single mongos seed
      #
      # Note that the commit attempt should not select the original mongos
      # because that server's SDAM state is reset by the connection error,
      # heartbeatFrequencyMS is high, and subsequent isMaster heartbeats
      # should fail.
      - name: commitTransaction
        object: session0
        result:
          errorLabelsContain: ["TransientTransactionError"]
          errorLabelsOmit: ["UnknownTransactionCommitResult"]
          errorCodeName: NoSuchTransaction

I also agree with shane.harvey's second point that maxtimemsexpired should not retry because doing so would double the timeout, though as I understand it this would be a spec change and not just a test change.

Comment by Vincent Kam (Inactive) [ 12/Dec/19 ]

I agree with Shane. The .NET Driver also ran into this ambiguity while implementing CSHARP-2858.

cc: dmitry.lukyanov oleg.pudeyev

Comment by Shane Harvey [ 12/Dec/19 ]

I think this ticket still needs some work. The spec is ambiguous about these two points:

  1. Should a MaxTimeMSExpired error on commit cause the session to be unpinned?
  2. Should a MaxTimeMSExpired error on commit cause an automatic retry attempt?

For 1, the answer is yes. MaxTimeMSExpired errors on commit are labeled UnknownTransactionCommitResult which will cause the session to be unpinned. We should add a test for this.

For 2, the answer should be no. We should not retry after MaxTimeMSExpired error on commit because it would implicitly double the application’s maxCommitTimeMS. We should add a test to cover this case. It will be similar to the "commit is not retried after MaxTimeMSExpired error" withTransaction test.

Comment by Jeffrey Yemin [ 26/Nov/19 ]

shane.harvey is this ticket superseded by DRIVERS-764?

Comment by Githook User [ 12/Aug/19 ]

Author:

{'name': 'Dan Aprahamian', 'email': 'dan.aprahamian@gmail.com', 'username': 'daprahamian'}

Message: test(transactions): disable invalid pinning test

Disables test 'remain pinned after non-transient error on commit'
until SPEC-1320 is resolved
Branch: next
https://github.com/mongodb/node-mongodb-native/commit/18f38a17a9739b6380e6a2c0a77ecfdb6ea9068f

Generated at Thu Feb 08 08:24:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.