[SERVER-35872] Reconstruct prepared transactions on replication rollback Created: 28/Jun/18  Updated: 29/Oct/23  Resolved: 28/Feb/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.1.9

Type: Task Priority: Major - P3
Reporter: Gregory McKeon (Inactive) Assignee: Pavithra Vetriselvan
Resolution: Fixed Votes: 0
Labels: open_todo_in_code, prepare_durability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-35879 Add support for reconstituting transa... Closed
depends on SERVER-38865 Create rollback test fixture that is ... Closed
is depended on by SERVER-39762 Fix fastcount after rollback recovery... Closed
is depended on by SERVER-37886 Remove config server as coordinator c... Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2018-12-17, Repl 2019-01-14, Repl 2019-01-28, Repl 2019-02-11, Repl 2019-02-25, Repl 2019-03-11
Participants:

 Description   

After the work for SERVER-35879 goes in, we will already have a way to apply prepare oplog entries during replication recovery. This work includes iterating over the transactions table, finding which sessions had a prepared transaction on them, and applying the prepare oplog entry.
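As a rough illustration of the scan described above, here is a minimal sketch that walks a transactions table, picks out sessions in the prepared state, and collects the matching prepare oplog entries. Every type and function name here (TxnTableEntry, OplogEntry, findPrepareEntriesToApply) is a hypothetical stand-in invented for this example, not the server's actual data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the real server types.
enum class TxnState { kNone, kPrepared, kCommitted, kAborted };

struct TxnTableEntry {
    std::string sessionId;
    TxnState state;
    std::uint64_t prepareOpTime;  // key of the prepare oplog entry, 0 if none
};

struct OplogEntry {
    std::uint64_t opTime;
    std::string sessionId;
    std::vector<std::string> ops;
};

// Returns the prepare oplog entries that recovery would need to re-apply:
// one per session whose transactions table entry is in the prepared state.
std::vector<OplogEntry> findPrepareEntriesToApply(
    const std::vector<TxnTableEntry>& txnTable,
    const std::map<std::uint64_t, OplogEntry>& oplog) {
    std::vector<OplogEntry> result;
    for (const auto& entry : txnTable) {
        if (entry.state != TxnState::kPrepared) {
            continue;  // committed, aborted, or idle sessions need no work
        }
        auto it = oplog.find(entry.prepareOpTime);
        if (it != oplog.end()) {
            result.push_back(it->second);
        }
    }
    return result;
}
```

The sketch keeps lookup and application separate so that the refresh step discussed below could be slotted in before each entry is applied.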

Before we apply these oplog entries, however, we will need to correctly refresh the session state as well as the state of the transaction participant, because both will have been invalidated at the beginning of replication rollback. This must happen before we try to modify the transaction participant (i.e. call unstashTransactionResources or prepareTransaction). There are a couple of ways that we can approach this.

First, we could thread a boolean through _recoverFromOplog, _reconstructPreparedTransactions, and applyRecoveredPrepareTransaction. Once we get to applyRecoveredPrepareTransaction, we can check to see if we are recovering from a rollback and refresh the session and transaction participant states.
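A minimal sketch of this first option, assuming a toy Participant type whose in-memory state starts out invalidated (as it would be after rollback begins). The three functions mirror the call chain named above, but their signatures and bodies are assumptions made for illustration; the real functions take different arguments.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a transaction participant. `valid` starts false to
// mimic the invalidation that happens at the beginning of rollback.
struct Participant {
    bool valid = false;
    bool prepared = false;
    int refreshCount = 0;

    void refreshFromStorage() {
        valid = true;
        ++refreshCount;
    }

    // Preparing an invalidated participant fails (modelled as `false`).
    bool prepare() {
        if (!valid) {
            return false;
        }
        prepared = true;
        return true;
    }
};

// Innermost function: the only place that acts on the threaded boolean.
bool applyRecoveredPrepareTransaction(Participant& p, bool recoveringFromRollback) {
    if (recoveringFromRollback) {
        p.refreshFromStorage();  // must happen before touching the participant
    }
    return p.prepare();
}

// Middle layer: only forwards the flag.
int reconstructPreparedTransactions(std::vector<Participant>& ps,
                                    bool recoveringFromRollback) {
    int applied = 0;
    for (auto& p : ps) {
        if (applyRecoveredPrepareTransaction(p, recoveringFromRollback)) {
            ++applied;
        }
    }
    return applied;
}

// Outer layer: the caller that knows whether this is rollback recovery.
int recoverFromOplog(std::vector<Participant>& ps, bool recoveringFromRollback) {
    return reconstructPreparedTransactions(ps, recoveringFromRollback);
}
```

The cost of this approach is visible in the sketch: two of the three functions carry the parameter only to pass it along.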

The second option is to check the OplogApplication mode and, if it is OplogApplication::Mode::kRecovering, refresh the session and transaction participant. Since kRecovering applies to startup recovery AND replication recovery, this would only work if it is also safe to do this during startup recovery. During replication recovery, we would not be making any writes to the transactions table, so refreshing the state from disk would not cause us to read those writes and start a new transaction. If the same thing applies to startup recovery, this could be a more elegant solution than the first.

Finally, in both solutions, we would need to introduce a new helper (something like refreshTxnParticipantFromTable) that reconstructs the state the transaction participant had before we cleared it for rollback. This information should be available from the prepare oplog entry.
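To make the mode-based option and the helper more concrete, here is a small sketch. It assumes a two-value Mode enum (only kRecovering is taken from the text above), a toy PrepareOplogEntry carrying the fields needed to rebuild the participant, and a helper named refreshTxnParticipant that plays the role of the proposed refreshTxnParticipantFromTable. All of these are illustrative assumptions, not the server's API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified mirror of the oplog application mode; illustrative only.
enum class Mode { kSecondary, kRecovering };

// Hypothetical subset of what a prepare oplog entry would carry.
struct PrepareOplogEntry {
    std::string sessionId;
    std::int64_t txnNumber;
    std::uint64_t prepareTimestamp;
};

// Hypothetical participant state; `valid` is false after rollback clears it.
struct TxnParticipant {
    bool valid = false;
    std::int64_t activeTxnNumber = -1;
    std::uint64_t prepareTimestamp = 0;
    bool prepared = false;
};

// Stand-in for the proposed helper: rebuilds the participant state from the
// information in the prepare oplog entry.
void refreshTxnParticipant(TxnParticipant& p, const PrepareOplogEntry& e) {
    p.valid = true;
    p.activeTxnNumber = e.txnNumber;
    p.prepareTimestamp = e.prepareTimestamp;
}

// Applies a prepare entry, refreshing first when in recovery mode. No flag is
// threaded through the callers; the mode alone decides.
bool applyPrepare(TxnParticipant& p, const PrepareOplogEntry& e, Mode mode) {
    if (mode == Mode::kRecovering) {
        refreshTxnParticipant(p, e);
    }
    if (!p.valid || p.activeTxnNumber != e.txnNumber) {
        return false;  // participant was never refreshed or is on another txn
    }
    p.prepared = true;
    return true;
}
```

Because the mode check fires for startup recovery as well, this shape is only appropriate if the refresh is harmless there, which is the open question raised above.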

We would test this via jstests since we would need to induce a rollback and ensure that we have not lost any prepared transactions by the end of the recovery process.



 Comments   
Comment by Githook User [ 28/Feb/19 ]

Author: Pavi Vetriselvan (username: pvselvan, email: pvselvan@umich.edu)

Message: SERVER-35872 fix TODOs
Branch: master
https://github.com/mongodb/mongo/commit/434347f4fab56a9a749d3698fbf46679f2c73f74

Comment by Judah Schvimer [ 22/Feb/19 ]

pavithra.vetriselvan, I think you missed the TODO jack.mulrow left and left a TODO for this ticket that should point to SERVER-39762.

Comment by Githook User [ 22/Feb/19 ]

Author: Pavi Vetriselvan (username: pvselvan, email: pvselvan@umich.edu)

Message: SERVER-35872 reconstruct prepared transactions on rollback, fastcount inaccurate
Branch: master
https://github.com/mongodb/mongo/commit/0ab7df179a7329fea4c28049d1ff532010720280

Comment by Jack Mulrow [ 13/Feb/19 ]

Just a heads up - I'm adding a TODO on this ticket in a test for SERVER-36498 for behavior relying on transactions that aborted after being prepared entering the kAbortedWithPrepare state after refreshing from storage.

Comment by Githook User [ 24/Jan/19 ]

Author: Pavi Vetriselvan (username: pvselvan, email: pvselvan@umich.edu)

Message: SERVER-35872 reconstruct prepared transactions on replication rollback
Branch: master
https://github.com/mongodb/mongo/commit/c215687d366bef79cd821d69899e9d2689e9fd6f

Comment by Samyukta Lanka [ 28/Nov/18 ]

We need to make sure that we are correctly refreshing state when a session has been invalidated.

Generated at Thu Feb 08 04:41:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.