[SERVER-37179] Wait for specified write concern whenever commitTransaction returns a NoSuchTransaction error Created: 17/Sep/18  Updated: 29/Oct/23  Resolved: 13/Nov/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.7, 4.1.6

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Siyuan Zhou
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-36311 Add stepdowns, shutdowns, and crashes... Closed
Duplicate
is duplicated by SERVER-37516 Providing readConcern on second trans... Closed
Gantt Dependency
has to be done before SERVER-34620 Make speculative read atClusterTime n... Closed
Problem/Incident
causes SERVER-37876 Unblacklist multi_statement_transacti... Backlog
Related
related to SERVER-37516 Providing readConcern on second trans... Closed
related to SERVER-37681 Make it clear from the stack trace w... Closed
is related to SERVER-37181 commitTransaction command can attach ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Repl 2018-10-22, Repl 2018-11-05, Repl 2018-11-19
Participants:
Linked BF Score: 11

 Description   

If a transaction commits locally on a primary, but that primary goes down before replicating the commit, then a commitTransaction command retried against the new primary will fail with NoSuchTransaction. Generally, a NoSuchTransaction error in this case indicates that the transaction either aborted or was rolled back, so it would be safe to retry the entire transaction. If, however, the writeConcern wait times out in this case, then it's possible that the original primary could be re-elected without rolling back the transaction commit, and the transaction could end up surviving after all.



 Comments   
Comment by Githook User [ 05/Mar/19 ]

Author:

{'name': 'Siyuan Zhou', 'username': 'visualzhou', 'email': 'siyuan.zhou@mongodb.com'}

Message: SERVER-37179 Wait for specified write concern whenever commitTransaction returns a NoSuchTransaction error

(cherry picked from commit 6394bfafd5c42bfeb01b6686498d7fff697d9480)
Branch: v4.0
https://github.com/mongodb/mongo/commit/38896cab6a0aad9bca1f9e9f5de65ea0cdc62d22

Comment by Githook User [ 04/Mar/19 ]

Author:

{'name': 'Siyuan Zhou', 'username': 'visualzhou', 'email': 'siyuan.zhou@mongodb.com'}

Message: SERVER-37179 Pull out starting transaction from session checkout and push it down to before command execution.

This patch redid 248601a647 and 4fb38d9c10 from master on v4.0 branch.

Transaction will begin or continue after waiting for read concern. If
an error is thrown on starting transaction, it'll be able to wait for
write concern if a write concern is specified.
Branch: v4.0
https://github.com/mongodb/mongo/commit/5366d3c6ea014f1bd19eee1a149f46f3b1227a2b

Comment by Githook User [ 13/Nov/18 ]

Author:

{'name': 'Siyuan Zhou', 'email': 'siyuan.zhou@mongodb.com', 'username': 'visualzhou'}

Message: SERVER-37179 Wait for specified write concern whenever commitTransaction returns a NoSuchTransaction error
Branch: master
https://github.com/mongodb/mongo/commit/6394bfafd5c42bfeb01b6686498d7fff697d9480

Comment by Githook User [ 08/Nov/18 ]

Author:

{'name': 'Siyuan Zhou', 'email': 'siyuan.zhou@mongodb.com', 'username': 'visualzhou'}

Message: SERVER-37179 Pull out starting transaction from session checkout and push it down to before command execution.

Transaction will begin or continue after waiting for read concern. If
an error is thrown on starting transaction, it'll be able to wait for
write concern if a write concern is specified.
Branch: master
https://github.com/mongodb/mongo/commit/4fb38d9c10123321dada6fe1be477f9cb99732a7

Comment by Githook User [ 25/Oct/18 ]

Author:

{'name': 'Siyuan Zhou', 'email': 'siyuan.zhou@mongodb.com', 'username': 'visualzhou'}

Message: SERVER-37179 Pass the reference of OperationSessionInfoFromClient around.
Branch: master
https://github.com/mongodb/mongo/commit/248601a6473fc7364e5d790a357acbace2a42f7a

Comment by Siyuan Zhou [ 10/Oct/18 ]

According to the design of "Single Replica Set Transactions":

The ‘commitTransaction’ and ‘abortTransaction’ commands are the only commands of a multi-statement transaction that allow a writeConcern argument. If a writeConcern argument is given on any other command of a transaction, the server will return an error, without affecting the database or the transaction state. The writeConcern argument of the ‘commitTransaction’ and ‘abortTransaction’ commands will have semantics analogous to existing replica set commands.

Also from the documentation:

You can set the write concern for the transaction commit at the transaction start.

  • If unspecified at the transaction start, transactions use the session-level write concern for the commit or, if that is unset, the client-level write concern.
    Write concern w: 0 is not supported for transactions.
  • If you commit using w: 1 write concern, your transaction can be rolled back if there is a failover.
  • If the transaction commits with write concern “majority” and has specified read concern "snapshot", transaction operations are guaranteed to have read from a snapshot of majority-committed data. Otherwise, the "snapshot" read concern provides no guarantees that read operations used a snapshot of majority-committed data.
  • If the transaction commits with write concern “majority” and has specified read concern "majority", transaction operations are guaranteed to have read majority-committed data. Otherwise, the "majority" read concern provides no guarantees that read operations read majority-committed data.

There are two problems in this ticket: 1) We don't wait for write concern if NoSuchTransaction is returned. 2) We shouldn't attach the TransientTransactionError label if the given write concern fails. I removed the mention of "majority" from the title.
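
The two problems can be illustrated with a minimal Python sketch. This is not the server's C++ implementation: `CommitReply`, `finish_commit`, and the `wait_for_write_concern` callback are hypothetical names chosen for illustration, and the rule is limited to commitTransaction as discussed in this ticket.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

TRANSIENT_LABEL = "TransientTransactionError"


@dataclass
class CommitReply:
    """Hypothetical, simplified shape of a commitTransaction reply."""
    ok: bool
    code_name: Optional[str] = None
    write_concern_error: Optional[str] = None
    error_labels: List[str] = field(default_factory=list)


def finish_commit(
    commit_error: Optional[str],
    wait_for_write_concern: Callable[[], Optional[str]],
) -> CommitReply:
    """Build the reply for a commitTransaction attempt.

    commit_error: None on success, otherwise the error code name.
    wait_for_write_concern: returns None when the write concern is
    satisfied, otherwise a description of the failure (assumed interface).
    """
    # Problem 1: wait for the write concern even when the commit itself
    # failed, so a NoSuchTransaction reply reflects replicated state.
    wc_error = wait_for_write_concern()

    reply = CommitReply(
        ok=commit_error is None,
        code_name=commit_error,
        write_concern_error=wc_error,
    )

    # Problem 2: only tell the client it may retry the whole transaction
    # when NoSuchTransaction was returned AND the write concern succeeded.
    if commit_error == "NoSuchTransaction" and wc_error is None:
        reply.error_labels.append(TRANSIENT_LABEL)
    return reply
```

With this rule, a NoSuchTransaction reply whose write concern wait fails carries a write concern error but no retry label, so the client cannot mistake it for a definite abort.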

Comment by Judah Schvimer [ 10/Oct/18 ]

What is the expected behavior if the writeConcern is not majority?

Comment by Janna Golden [ 09/Oct/18 ]

I'm seeing the following scenario in testing against failover:
1. Node 0 is primary and running txnNum: 0 including a command with {insert : {_id: 0}}
2. Node 0 encounters a network error during commitTransaction with writeConcern: majority
3. Node 1 steps up and retries the commitTransaction
4. Node 1 gets NoSuchTransaction and aborts txnNum: 0
5. Node 1 retries the entire transaction with txnNum: 1 including a command with {insert : {_id: 0}}
6. Node 1 encounters a network error during commitTransaction with writeConcern: majority
7. Node 0 steps back up and retries the commitTransaction
8. Node 0 gets NoSuchTransaction and aborts txnNum: 1
9. Node 0 retries the entire transaction with txnNum: 2 including a command with {insert : {_id: 0}}
10. Node 0 gets a duplicate key error
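
The steps above can be replayed with a toy Python model. It is only a sketch under strong assumptions: `ToyNode` and `simulate_failover_scenario` are hypothetical names, each node keeps purely local state, and replication and rollback are deliberately not modeled, which mirrors the case where neither commit reaches a majority before the failover.

```python
class NoSuchTransaction(Exception):
    pass


class DuplicateKey(Exception):
    pass


class ToyNode:
    """A node that only knows about what it committed locally."""

    def __init__(self, name):
        self.name = name
        self.ids = set()        # _id values committed locally
        self.committed = set()  # txnNumbers committed locally

    def run_and_commit_locally(self, txn_number, doc_id):
        # The insert fails if this node already holds the document.
        if doc_id in self.ids:
            raise DuplicateKey(doc_id)
        self.ids.add(doc_id)
        self.committed.add(txn_number)

    def retry_commit(self, txn_number):
        # A node that never saw the transaction cannot commit it.
        if txn_number not in self.committed:
            raise NoSuchTransaction(txn_number)


def simulate_failover_scenario():
    nodes = [ToyNode("node0"), ToyNode("node1")]
    primary = 0
    events = []
    for txn_number in range(3):
        try:
            nodes[primary].run_and_commit_locally(txn_number, doc_id=0)
        except DuplicateKey:
            events.append((nodes[primary].name, txn_number, "DuplicateKey"))
            break
        # Network error: the client never sees the reply, the commit is
        # not replicated, and the other node steps up.
        primary = 1 - primary
        try:
            nodes[primary].retry_commit(txn_number)
        except NoSuchTransaction:
            # The client treats this as an abort and retries the whole
            # transaction with the next txnNumber on the new primary.
            events.append((nodes[primary].name, txn_number, "NoSuchTransaction"))
    return events


if __name__ == "__main__":
    for event in simulate_failover_scenario():
        print(event)
```

In this model the third attempt fails on node0 with a duplicate key, because the insert from txnNum: 0 was never rolled back there even though the client was told NoSuchTransaction.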

Comment by Spencer Brody (Inactive) [ 21/Sep/18 ]

Really the only time commitTransaction should attach a TransientTransactionError is specifically the case where it fails with NoSuchTransaction AND the writeConcern:majority wait is successful.

Generated at Thu Feb 08 04:45:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.