[SERVER-38132] Aborting a transaction must always update config.transactions and write an abort oplog entry Created: 14/Nov/18 Updated: 08/Jan/19 Resolved: 04/Jan/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Tess Avitabile (Inactive) | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | prepare_durability | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Sprint: | Repl 2018-12-03, Repl 2018-12-17, Repl 2019-01-14 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
In FCV 4.2, the TransactionParticipant must always update config.transactions and write an abort oplog entry when aborting a transaction. It must do so before updating the in-memory state. We should test that secondaries can apply abort oplog entries for unprepared transactions. Secondaries performing read-only transactions must not attempt to write abort oplog entries. Note that it might be technically difficult and unnecessary to update config.transactions and write an abort oplog entry when aborting a transaction due to hearing of a higher transaction number. Once the client uses a higher transaction number, they forgo the right to recover the state of older transactions. We will only write an abort oplog entry in FCV 4.2, since 4.0 nodes cannot process abort oplog entries. This is fine, since it is only necessary to perform a write when aborting a transaction when using the session through multiple mongoses. |
| Comments |
| Comment by Tess Avitabile (Inactive) [ 04/Jan/19 ] |
|
Closing in favor of |
| Comment by Tess Avitabile (Inactive) [ 04/Jan/19 ] |
|
Yes, secondary transactions are only used for background dbhash today. We would need to be careful, since it's only acceptable to skip writing an abort oplog entry if the transaction did no writes and we received startTransaction for the transaction. We would still need to write an abort oplog entry if we receive a higher transaction number than we have heard of without startTransaction. |
| Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ] |
|
Though, even read-only transactions will have to perform the "linearizable read" equivalent in order for using a recovery router to recover a sharded transaction's decision to be correct. |
| Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ] |
|
Only writing an abort oplog entry if aborting a transaction that performed a write may also make enabling transactions on secondaries in the future easier. |
| Comment by Esha Maharishi (Inactive) [ 04/Jan/19 ] |
|
Is the dbhash check the only complication? Also, have we thought about what aborting a read-only transaction should mean? Perhaps it is acceptable to only write an abort oplog entry if aborting a transaction that performed a write? |
| Comment by Tess Avitabile (Inactive) [ 04/Jan/19 ] |
|
esha.maharishi, judah.schvimer, schwerin, please consider siyuan.zhou's argument above that writing an abort oplog entry is overly complex, since performing a noop write fixes the original problem in all cases except for one-shot transactions, which are not supported in 4.2. I am on the fence about which solution is preferable. One the one hand, I feel more confident that writing an abort oplog entry will cover any counterexamples we have not yet thought of. On the other hand, the implementation for writing an abort oplog entry is complex, since there are certain cases where we cannot write an abort oplog entry (e.g. because we are a secondary) and instead invalidate the session, and we argue that that is correct based on the fact that the next time the transaction is used, we will write an abort oplog entry before returning NoSuchTransaction. The implementation and correctness argument for the noop write is simpler. |
| Comment by Siyuan Zhou [ 04/Jan/19 ] |
|
Also to summarize the discussion with tess.avitabile about the choice between writing "abort" oplog entry and linearizable read style waiting: linearizable read will fix the original problem described in this ticket, because a returned NoSuchTransaction is either from the original primary or from a new primary that knows previous transaction can never survive failover. Thus the transaction can safely retry with a higher transaction number. The multi-statement transaction protocol guarantees once the server sees commitTransaction, the client will never restart the transaction with the same transaction number, making the retry safe. One-shot transactions will effectively change this assumption and cause the problem Tess mentioned earlier. However we won't support one-shot transaction in 4.2. Given the complexity of writing "abort" oplog entry than reusing linearizable read, it's worth considering the minimal change to fix the problem. |
| Comment by Max Hirschhorn [ 03/Jan/19 ] |
|
Just to summarize what tess.avitabile, siyuan.zhou, and I discussed in-person: Given that the run_check_repl_dbhash_background.js hook only performs reads within a transaction, it should always be safe to retry and doesn't face the same risk for exactly once semantics around retrying an entire transaction on TransientTransactionError. Tess is going to look into how much work it'd be to have the abortTransaction command not write an oplog entry if the transaction did no writes. We can otherwise consider having the run_check_repl_dbhash_background.js hook kill the sessions to abort the transaction rather than running the abortTransaction command directly. |
| Comment by Tess Avitabile (Inactive) [ 03/Jan/19 ] |
|
max.hirschhorn, I am interested in removing support for secondary read-only transactions as part of this work. With this work, we guarantee that we write an abort oplog entry whenever we transition a transaction's state to "aborted" in memory. However, we cannot write oplog entries when aborting transactions on secondaries. It would be complex and risky to make an exception for aborting transactions on secondaries when test commands are enabled. Of course, removing support for secondary read-only transactions would mean removing our background dbhash testing in master. Does this sound acceptable to you? |
| Comment by Tess Avitabile (Inactive) [ 02/Jan/19 ] |
|
Justification for this work: It is necessary to perform a write when aborting a transaction so that commands which return NoSuchTransaction and wait for writeConcern are guaranteed that a write was done since the transaction was aborted. Otherwise, we could have a scenario where we have term 4 primary which has committed all the writes in its term and a term 5 primary which has committed transaction 10. A command for transaction 10 targeting the term 4 primary would respond with NoSuchTransaction with no writeConcern error and with a TransientTransactionError label, since the term 4 primary had committed all of the writes in its term. The application could then retry the entire transaction and it would be committed twice. This cannot occur in 4.0, since transactions are only supported on single replica sets and drivers will never target an older primary unless they are restarted, in which case they will use a different session. However, it could occur in 4.2, since mongos does not make this guarantee, and if commands are routed through multiple mongoses, a later command could certainly target an older primary. One solution is to perform a noop write on abort, similar to a linearizable read. This would address the scenario described above. However, it would be insufficient in the case of one-shot transactions (currently the test-only doTxn command) and delayed messages. Consider this scenario: The application sends doTxn for transaction 10 to a 1-node replica set. The message is delayed, and the application times out waiting for a response. The application sends commitTransaction for transaction 10 to see if the transaction was committed. The node begins and aborts transaction 10 and performs a noop write, then returns NoSuchTransaction with no writeConcern error and with a TransientTransactionError label. Then the node restarts, so all record of transaction 10 is gone. Finally, the doTxn command arrives, and the node commits transaction 10. However, since the application believes the transaction was definitively not committed, it may attempt to retry the entire transaction. For this reason, we choose to write an abort oplog entry and update config.transactions, so that there is a durable record of the aborted transaction. CC siyuan.zhou |
| Comment by William Schultz (Inactive) [ 11/Dec/18 ] |
|
If we are now writing abortTransaction oplog entries for single replica set transactions, we may need to consider supporting this in rollbackViaRefetch, since replica set transactions are supported in 4.2 when enableMajorityReadConcern=false, and we use rollbackViaRefetch in this case. I'm not sure if we actually have any existing test coverage of rolling back an aborted transaction, since previously an aborted transaction would have put nothing into the oplog. For easy, broad test coverage of this, we could consider adding transaction aborts to the rollback fuzzer. We already added transactions to it as a part of TIG-871. |