[SERVER-42699] Server should perform majority write concern no-op write before returning TransientTransactionError label on commitTransaction cmd. Created: 08/Aug/19 Updated: 10/Dec/19 Resolved: 10/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Sprint: | Repl 2019-12-30 |
| Participants: |
| Description |
|
For replica set transactions, we can run into a double-commit problem in the scenario below (see SPEC-1185).
Currently, the TTE (TransientTransactionError) label is attached to the following error codes.
However, only for ErrorCodes::NoSuchTransaction do we perform a no-op write w/ {w: majority} and check for a write concern error before attaching the TTE label. It's currently safe not to do the majority write concern no-op write for the other error codes that can generate the TTE label, because of the sequence of steps we perform for the commitTransaction cmd (for unprepared transactions / single replica sets), listed below. Note that a commitTxn cmd can't contain 'startTransaction', so it should always go to _continueMultiDocumentTransaction().
The system won't be safe if we get any of the TTE label error codes 2 thru 5 (ErrorCodes::WriteConflict to Snapshot error) before seq #1, beginOrContinue() (i.e. _continueMultiDocumentTransaction()). That would mean we should perform a no-op write w/ {w: majority} and check for a majority write concern error before attaching the TTE label for all TTE error codes. Performing no-op writes always has a performance cost, so instead we should try to strengthen the contract of the TTE label itself.
Note: For cross-shard transactions, the txn coordinator doesn't care about the TTE label attached to the commitTransaction cmd response. |
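To make the two options concrete, here is a minimal Python sketch of the labeling decision, not the server's actual code. The function name `commit_error_labels`, the `noop_for_all_codes` flag, and the `noop_majority_write` callback are all hypothetical. The code set contains only the two codes named in this ticket and is not the full TTE list. It assumes the label is withheld when the no-op majority write reports a write concern error.

```python
from typing import Callable, List, Optional

TTE_LABEL = "TransientTransactionError"

# Illustrative subset only: just the codes named in this ticket,
# not the server's full list of TTE-eligible error codes.
TTE_CANDIDATE_CODES = {"NoSuchTransaction", "WriteConflict"}


def commit_error_labels(
    code: str,
    noop_majority_write: Callable[[], Optional[str]],
    noop_for_all_codes: bool = False,
) -> List[str]:
    """Return the error labels for a failed commitTransaction (sketch).

    noop_majority_write is a hypothetical callback that performs a no-op
    write with {w: majority} and returns a write concern error name, or
    None if the write was majority-acknowledged.
    """
    if code not in TTE_CANDIDATE_CODES:
        return []

    # Current behavior: only NoSuchTransaction triggers the no-op write.
    # The stricter option discussed above does it for every TTE code.
    needs_noop = noop_for_all_codes or code == "NoSuchTransaction"
    if needs_noop and noop_majority_write() is not None:
        # Assumption: with a write concern error we cannot promise that
        # retrying the whole transaction is safe, so withhold the label.
        return []

    return [TTE_LABEL]
```

With `noop_for_all_codes=False`, a WriteConflict is labeled without any extra write. With `noop_for_all_codes=True`, every labeled response pays for one majority write, which is the performance cost mentioned above.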
| Comments |
| Comment by Suganthi Mani [ 12/Aug/19 ] |
|
My proposal for making the TTE error label safe, so that the entire transaction can be retried from the beginning, is that before sending a TTE label response for the commitTransaction cmd we make sure we perform one of 2 things: 1) For an active txn (i.e. the txn number matches the current active txn number), we should abort the transaction implicitly if it was not aborted previously. Only for case #2 would we need to perform a majority write concern no-op (and only if the write concern supplied with the cmd was kMajority). Otherwise, it's safe to send the TTE label response w/o a no-op write. Note: Currently in our code, the NoSuchTransaction error code is thrown in cases where we don't require a no-op write. CC esha.maharishi shane.harvey |
| Comment by Judah Schvimer [ 08/Aug/19 ] |
|
suganthi.mani, should we also abort a transaction whenever we return a TTE error label? |