[SERVER-39762] Fix fastcount after rollback recovery of prepared transactions. Created: 22/Feb/19  Updated: 29/Oct/23  Resolved: 27/Mar/19

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: None
Fix Version/s: 4.1.10

Type: Bug
Priority: Major - P3
Reporter: Pavithra Vetriselvan
Assignee: Louis Williams
Resolution: Fixed
Votes: 0
Labels: prepare_durability, txn_storage
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File txn-rollback.png    
Issue Links:
Depends
depends on SERVER-35872 Reconstruct prepared transactions on ... Closed
Related
related to SERVER-40482 incorrect fastcount for a majority co... Closed
related to SERVER-40973 Incorrect fast count after reconstruc... Closed
is related to SERVER-35483 rollback makes config.transactions fa... Closed
is related to SERVER-40269 commitTransaction should assert that ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Storage NYC 2019-03-25, Storage NYC 2019-04-08
Participants:

 Description   

When we recover a prepared transaction, we re-apply the oplog entry and replay the operations in the transaction to get back to the prepare state. This line prevents us from adding a NumRecordsChange on the new recovery unit.

During rollback, say we commit a prepared transaction on the rollback node. This entry will get rolled back, the transaction will get invalidated (which releases our txnResources, including the recovery unit), and we will replay oplog entries from the stable timestamp. This involves putting the corresponding prepared transaction back into prepare.

Since the new recovery unit does not register the operations of the prepared transaction, when we later commit or abort it, there is nothing to add to or subtract from numRecords.
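A minimal sketch of the mechanism described above, written in Python as a toy model rather than the server's C++ code. The names ToyCollection, ToyRecoveryUnit, insert_records and the adjust_count flag are all hypothetical; only the idea of registering a count change on the recovery unit comes from the ticket.

```python
class ToyCollection:
    """Hypothetical stand-in for a record store with an in-memory fast count."""
    def __init__(self, num_records=0):
        self.num_records = num_records


class ToyRecoveryUnit:
    """Hypothetical recovery unit that runs registered undo handlers on abort."""
    def __init__(self):
        self._rollback_handlers = []

    def register_change(self, on_rollback):
        self._rollback_handlers.append(on_rollback)

    def commit(self):
        # Committing keeps the in-memory count as it is.
        self._rollback_handlers.clear()

    def abort(self):
        # Aborting undoes every registered change, newest first.
        for handler in reversed(self._rollback_handlers):
            handler()
        self._rollback_handlers.clear()


def insert_records(coll, ru, n, adjust_count=True):
    """Insert n records; adjust_count=False models the replay path (assumption)."""
    if not adjust_count:
        return  # nothing is registered, so commit/abort cannot change the count
    coll.num_records += n

    def undo():
        coll.num_records -= n

    ru.register_change(undo)


coll = ToyCollection(10)

normal = ToyRecoveryUnit()
insert_records(coll, normal, 3)
normal.abort()                      # count returns to 10

replayed = ToyRecoveryUnit()
insert_records(coll, replayed, 3, adjust_count=False)
replayed.abort()                    # nothing registered, count stays 10
```

In this model the replayed transaction leaves the count untouched whether it commits or aborts, which is the behavior the paragraph above describes.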

Previously, we have accounted for counts during rollback in _findRecordStoreCounts, where we calculate the diffs for each UUID. For prepared transactions, however, we are not sure if we will eventually commit or eventually abort the transaction. So, it might not make sense to preemptively change the counts.

Ideally, we would figure out a way for the new recovery unit to make an exception for prepared transactions that were recovered during rollback and make sure the changes get recorded when reapplying operations.

 



 Comments   
Comment by Githook User [ 27/Mar/19 ]

Author:

{'name': 'Louis Williams', 'username': 'louiswilliams', 'email': 'louis.williams@mongodb.com'}

Message: SERVER-39762 Fix fastcount after rollback recovery of prepared transactions
Branch: master
https://github.com/mongodb/mongo/commit/884f388d3eca3bd5e9fe32f46b3007ccdd89af17

Comment by Max Hirschhorn [ 21/Mar/19 ]

Thanks for the confirmation both of you. I put together the attached txn-rollback.png in order to keep the valid transitions straight in my mind.

Comment by Louis Williams [ 21/Mar/19 ]

max.hirschhorn Yes, that is correct. If a node rolls back a "commit" or "abort" back into a prepared state, the sharded transaction protocol guarantees the same command will be sent again; it will not change its mind. If that were not the case, we would need to do some amount of work to make the fastcount logic work differently for rollback.

Comment by Judah Schvimer [ 21/Mar/19 ]

"Where does the guarantee that a different outcome won't ever occur on the two branches of history come from? Is it because of how prepared transactions are used by the distributed commit protocol where the coordinator must not change its mind as to whether to commit or abort a transaction once it is prepared on all of the shards?"

Exactly.

Comment by Max Hirschhorn [ 21/Mar/19 ]

rollback "abort" into "prepare" (guaranteed to eventually abort)
rollback "commit" into "prepare" (guaranteed to eventually commit)

Where does the guarantee that a different outcome won't ever occur on the two branches of history come from? Is it because of how prepared transactions are used by the distributed commit protocol, where the coordinator must not change its mind as to whether to commit or abort a transaction once it is prepared on all of the shards?

Modeling these semantics is important to me for implementing prepared transaction support in the rollback fuzzer.

Comment by Louis Williams [ 20/Mar/19 ]

After some discussion, it seems there has been agreement about the desired behavior of fastcount during rollback. The desire is to do no additional work, only confirm we have testing to cover all cases.

Keep in mind that we update the fastcount in memory, then roll it back if the storage transaction aborts.

There are 5 cases to consider:

  • rollback "abort" into "prepare" (guaranteed to eventually abort)
    • When a transaction is aborted, the fast count is also rolled back. When the prepared transaction is "reconstructed" at the end of replication recovery by applying the operations and creating a new prepared storage transaction, the fastcount is not adjusted. When the transaction is aborted again, there is nothing to roll back. This leaves the fastcount in the correct state.
    • I have confirmed locally this is true, but I will use this ticket to add assertions.
  • rollback "abort" and "prepare" (guaranteed to never commit, but could prepare again)
    • When a transaction is aborted, the fast count is also rolled back. Rollback of the "prepare: true" applyOps oplog entry calculates the fastcount adjustment to make once recovery is complete, and then also subtracts that from the collection count.
    • I don't believe this is correct, so I will use this ticket to investigate and fix this. We may be able to just ignore "prepare: true" oplog entries for consideration of size adjustment.
  • rollback "commit" into "prepare" (guaranteed to eventually commit)
    • When a transaction is committed, the fast count is unmodified. When the prepared transaction is reconstructed, like the first case, there are no fastcount adjustments made, and when the transaction eventually commits, the counts also remain unmodified.
    • I have confirmed locally this is true, but I will use this ticket to add assertions.
  • rollback "prepare" (could eventually commit or abort)
    • During rollback, prepared transactions are aborted, which rolls back the fastcount. If the same "prepare" operation comes in at a later point, it will just re-apply the same operations as before.
    • See rollback "abort" and "prepare"; I believe this is affected by the same double-counting bug. The abort before rollback resets the fastcount, and then the rollback of the "prepare" entry subtracts the same counts again.
  • rollback "commit" and "prepare"
    • This is impossible because "prepare" must be majority committed before sending "commit"
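The arithmetic behind three of the cases above can be sketched as a toy Python model. It follows the wording of this comment only, not the server code; the function names, the base count, and the ignore_prepare_entries flag (which models the fix floated above, not necessarily the change that was committed) are all hypothetical.

```python
def abort_into_prepare(base, n):
    """Rollback of "abort" back into "prepare"; the transaction aborts again."""
    count = base + n   # original prepare: n inserts bump the in-memory count
    count -= n         # abort: the storage transaction rolls the bump back
    # Reconstructing the prepare makes no adjustment, and the second abort
    # has nothing registered to roll back.
    return count


def commit_into_prepare(base, n):
    """Rollback of "commit" back into "prepare"; the transaction commits again."""
    count = base + n   # original prepare: n inserts bump the in-memory count
    # Commit, reconstruction, and the second commit all leave the count alone.
    return count


def abort_and_prepare(base, n, ignore_prepare_entries):
    """Rollback of both "abort" and "prepare"."""
    count = base + n   # original prepare
    count -= n         # abort rolls the bump back
    if not ignore_prepare_entries:
        # Rollback of the "prepare: true" applyOps entry subtracts n again.
        count -= n
    return count


print(abort_into_prepare(10, 3))          # 10
print(commit_into_prepare(10, 3))         # 13
print(abort_and_prepare(10, 3, False))    # 7, the suspected double count
print(abort_and_prepare(10, 3, True))     # 10
```

With a starting count of 10 and a transaction of 3 inserts, only the third call ends up below the expected value, which matches the double-counting concern raised for the "abort" and "prepare" case.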

 

Comment by Louis Williams [ 18/Mar/19 ]

I don't think this is going to be as simple as just recording op counts while reconstructing prepare oplog entries. I think we'll need to first locate the prepare oplog entries, calculate the operation counts per namespace, and then subtract those from each collection's in-memory count.
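A rough Python sketch of the three steps proposed in this comment, under stated assumptions: oplog entries are plain dicts, a prepare entry is a command ("op": "c") whose "o" document holds an "applyOps" array and "prepare": true, inserts count +1, deletes count -1, and updates count 0. The helper names prepared_count_diffs and subtract_prepared_counts are hypothetical and do not correspond to server functions.

```python
from collections import defaultdict


def prepared_count_diffs(oplog_entries):
    """Sum the net record-count change per namespace across prepare entries."""
    diffs = defaultdict(int)
    for entry in oplog_entries:
        cmd = entry.get("o", {})
        # Only "prepare: true" applyOps command entries are considered.
        if entry.get("op") != "c" or not cmd.get("prepare"):
            continue
        for op in cmd.get("applyOps", []):
            if op["op"] == "i":
                diffs[op["ns"]] += 1
            elif op["op"] == "d":
                diffs[op["ns"]] -= 1
    return dict(diffs)


def subtract_prepared_counts(in_memory_counts, oplog_entries):
    """Return a copy of the counts with the prepared diffs subtracted."""
    adjusted = dict(in_memory_counts)
    for ns, diff in prepared_count_diffs(oplog_entries).items():
        adjusted[ns] = adjusted.get(ns, 0) - diff
    return adjusted


entries = [
    {"op": "c", "ns": "admin.$cmd", "o": {"prepare": True, "applyOps": [
        {"op": "i", "ns": "test.a", "o": {"_id": 1}},
        {"op": "i", "ns": "test.a", "o": {"_id": 2}},
        {"op": "d", "ns": "test.b", "o": {"_id": 7}},
    ]}},
    # A non-prepare applyOps entry is skipped.
    {"op": "c", "ns": "admin.$cmd", "o": {"applyOps": [
        {"op": "i", "ns": "test.a", "o": {"_id": 3}},
    ]}},
]
print(subtract_prepared_counts({"test.a": 5, "test.b": 4}, entries))
```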

Comment by Judah Schvimer [ 25/Feb/19 ]

Due to SERVER-35483, this won't be able to fix fastcount for config.transactions.
