[SERVER-40482] incorrect fastcount for a majority committed prepared transaction that is in prepare after a rollback and then committed Created: 04/Apr/19 Updated: 29/Oct/23 Resolved: 12/Apr/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Pavithra Vetriselvan | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | prepare_durability, txn_storage |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: |
| Sprint: | Storage NYC 2019-04-08, Storage NYC 2019-04-22 |
| Participants: |
| Linked BF Score: | 0 |
| Description |
|
This came out of My understanding is that we add to the count when we put a transaction into prepare. We then subtract from the count when we abort the storage transaction of a prepared transaction during rollback. We do not change the counts when reconstructing the prepared transaction during recovery. This is not a problem if the prepared transaction is eventually aborted. However, if this transaction is eventually committed, the counts are incorrect. |
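The sequence described above can be traced with a toy model. This is only a sketch under stated assumptions: `ToyCollection` and every function name below are hypothetical and do not correspond to server code, and the model assumes that commit makes the documents visible without adjusting the count, since the count was already added at prepare time.

```python
from dataclasses import dataclass


@dataclass
class ToyCollection:
    """Hypothetical stand-in for a collection; not a MongoDB type."""
    fastcount: int  # cached document count
    documents: int  # documents actually visible to readers


def prepare(coll: ToyCollection, inserts: int) -> None:
    # Per the description: the count is added when the transaction is prepared.
    coll.fastcount += inserts


def abort_storage_txn_during_rollback(coll: ToyCollection, inserts: int) -> None:
    # Per the description: rollback aborts the storage transaction and
    # subtracts the count again.
    coll.fastcount -= inserts


def reconstruct_prepared_txn(coll: ToyCollection, inserts: int) -> None:
    # Per the description: reconstruction during recovery leaves counts alone.
    pass


def commit_prepared_txn(coll: ToyCollection, inserts: int) -> None:
    # Assumption of this toy model: commit makes the documents visible but
    # does not touch the count.
    coll.documents += inserts


def simulate(final_decision: str) -> ToyCollection:
    coll = ToyCollection(fastcount=10, documents=10)
    prepare(coll, 2)
    abort_storage_txn_during_rollback(coll, 2)
    reconstruct_prepared_txn(coll, 2)
    if final_decision == "commit":
        commit_prepared_txn(coll, 2)
    return coll


if __name__ == "__main__":
    for decision in ("abort", "commit"):
        result = simulate(decision)
        print(decision, result.fastcount, result.documents)
```

In this model the abort path ends with matching values, while the commit path ends with a fastcount that is two lower than the number of visible documents.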
| Comments |
| Comment by Githook User [ 12/Apr/19 ] |
|
Author: {'name': 'Louis Williams', 'username': 'louiswilliams', 'email': 'louis.williams@mongodb.com'} Message: This fixes two bugs, both related to the correctness of the algorithm for adjusting collection counts. The new high-level order of operations during replication rollback is as follows: |
| Comment by Louis Williams [ 08/Apr/19 ] |
Yes, the idea is that after recovery to the stable timestamp, all prepared transactions, regardless of the commit/abort decision, have rolled-back counts. From that point, the in-memory fastcounts are reconstructed with each prepared transaction and updated accordingly as they commit or abort. As far as |
| Comment by Judah Schvimer [ 08/Apr/19 ] |
If prepare is rolled back, then we're fine since we aborted the prepared transaction. I think the problem here is if the stable timestamp is behind the prepare oplog entry, but the common point is after the prepare oplog entry (prepare does not get rolled back) so the transaction gets re-prepared without changing fastcount.
I'm confused by this solution. By "after rollback", do you mean after "recover to a timestamp" but before we reconstruct prepared transactions? From your previous sentence, it seemed to me that after prepared transactions are reconstructed they would have fastcounts in them. I do think that, if my understanding is correct, (1) is the simpler solution. It returns us to a state where _countDiffs is complete and the only logical change from 4.0 lives in the "reconstructing prepared transactions" logic, which is already new. Before moving ahead, I'm interested in if |
| Comment by Louis Williams [ 05/Apr/19 ] |
|
The issue here describes incorrect counts in the following cases:
In either case, aborting a prepared transaction before a commit or abort decision has been received means that the prepared transaction rolls back its counts even though the commit/abort decision is unknown. During recovery, no new in-memory fastcount is added, so a future commit will not update the counts appropriately either. The solution I have is this: while prepared transactions are reconstructed after oplog replay, make the operations count toward collection size adjustments by setting this flag to "false". It will then be necessary to also do one of the following beforehand:
Both solutions require processing "commitTransaction" operations and traversing back in the oplog to find the associated "prepare" entry. I think the first is a little simpler and doesn't require an additional data structure. judah.schvimer, let me know what you think of this approach or if there is something I've missed. |
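The effect of counting reconstructed operations can be illustrated with a small model. This is a rough sketch, not the server implementation: `ToyNode`, `ToyPreparedTxn`, and the `count_ops` parameter are hypothetical names, and `count_ops` is not the flag referred to above (that flag is not named here and may have the opposite polarity). The model starts from a state in which the counts of the prepared transaction have already been rolled back.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical node state after recovery; not a MongoDB type."""
    fastcount: int  # cached document count
    documents: int  # documents actually visible to readers


@dataclass
class ToyPreparedTxn:
    inserts: int
    counted: int = 0  # how much of this transaction fastcount reflects


def reconstruct(node: ToyNode, inserts: int, count_ops: bool) -> ToyPreparedTxn:
    # count_ops is a hypothetical switch: when True, the reconstructed
    # operations count toward the collection's count adjustments.
    txn = ToyPreparedTxn(inserts)
    if count_ops:
        node.fastcount += inserts
        txn.counted = inserts
    return txn


def commit(node: ToyNode, txn: ToyPreparedTxn) -> None:
    # The documents become visible; fastcount keeps whatever was counted.
    node.documents += txn.inserts


def abort(node: ToyNode, txn: ToyPreparedTxn) -> None:
    # Only the portion that was actually counted is rolled back.
    node.fastcount -= txn.counted


def run(decision: str, count_ops: bool) -> ToyNode:
    node = ToyNode(fastcount=10, documents=10)  # counts already rolled back
    txn = reconstruct(node, inserts=2, count_ops=count_ops)
    if decision == "commit":
        commit(node, txn)
    else:
        abort(node, txn)
    return node


if __name__ == "__main__":
    for count_ops in (False, True):
        for decision in ("commit", "abort"):
            node = run(decision, count_ops)
            print(count_ops, decision, node.fastcount, node.documents)
```

With `count_ops=True`, both the commit and abort paths end with fastcount equal to the visible document count in this model; with `count_ops=False`, only the abort path does.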
| Comment by Judah Schvimer [ 04/Apr/19 ] |
|
|