[SERVER-41779] reconstructPreparedTransactions fails to read a prepare oplog entry during initial sync Created: 14/Jun/19  Updated: 29/Oct/23  Resolved: 20/Jun/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.2.0-rc2, 4.3.1

Type: Bug Priority: Major - P3
Reporter: Lingzhi Deng Assignee: Lingzhi Deng
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Sprint: Repl 2019-07-01
Participants:
Linked BF Score: 12

 Description   

reconstructPreparedTransactions could fail to read an oplog entry during initial sync when:
1. The first attempt of initial sync fails after applying some oplog entries, which leaves the localSnapshot pointing to the lastApplied.
2. The second attempt of initial sync does try to reset all the optimes before starting the new attempt. ReplicationCoordinatorImpl::resetMyLastOpTimes relies on calling ReplicationCoordinatorImpl::_setMyLastAppliedOpTimeAndWallTime to reset the lastApplied and the localSnapshot back to OpTime 0. But ReplicationCoordinatorImpl::_setMyLastAppliedOpTimeAndWallTime skips resetting the localSnapshot if the given OpTime isNull(). Because of this bug, the localSnapshot is still pointing to the last oplog entry applied during the first attempt.
3. If the second attempt doesn't need to apply any ops after data cloning, it inserts the last oplog entry as the oplog seed document using the timestamp of that oplog entry. To trigger the bug, the last oplog entry that is inserted as the seed has to be a prepare oplog entry and its OpTime has to be greater than the OpTimes of the oplog entries applied in (1).
4. After the second attempt successfully finishes, reconstructPreparedTransactions is called to reconstruct outstanding prepared transactions. In this case, it needs to read the oplog seed entry.
5. reconstructPreparedTransactions uses its own ReadSourceScope, but the transactions table read implicitly changes the read source to kLastApplied, which is then used by the oplog read.
6. Because of the bug in (2), the oplog read uses the stale lastApplied timestamp (set in (1) but never reset in (2)), which is earlier than the prepare oplog entry, and thus fails to read the entry.
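
The sequence above can be sketched with a toy model. This is only an illustration of the early-return pattern described in (2), not the actual server code: ToyCoordinator, setMyLastApplied, canReadOplogEntry and seedIsReadableAfterReset are hypothetical names, and OpTime is reduced to a single integer timestamp.

```cpp
#include <cstdint>

// Toy stand-in for an OpTime; a timestamp of 0 plays the role of the null OpTime.
struct OpTime {
    std::uint64_t ts = 0;
    OpTime() = default;
    explicit OpTime(std::uint64_t t) : ts(t) {}
    bool isNull() const { return ts == 0; }
};

// Hypothetical, heavily simplified coordinator; not the real ReplicationCoordinatorImpl.
class ToyCoordinator {
public:
    // Models the behaviour described in step 2: lastApplied is always updated,
    // but a null OpTime returns before the local snapshot is touched.
    void setMyLastApplied(OpTime opTime) {
        _lastApplied = opTime;
        if (opTime.isNull()) {
            return;  // the local snapshot keeps its previous (stale) value
        }
        _localSnapshot = opTime;
    }

    // Models the reset done before a new initial sync attempt.
    void resetMyLastOpTimes() { setMyLastApplied(OpTime()); }

    // A read at the local snapshot only sees entries at or before it.
    bool canReadOplogEntry(OpTime entry) const { return entry.ts <= _localSnapshot.ts; }

    OpTime lastApplied() const { return _lastApplied; }
    OpTime localSnapshot() const { return _localSnapshot; }

private:
    OpTime _lastApplied;
    OpTime _localSnapshot;
};

// Replays steps 1 to 6 and reports whether the seed prepare entry is readable.
bool seedIsReadableAfterReset(std::uint64_t firstAttemptTs, std::uint64_t seedTs) {
    ToyCoordinator coord;
    coord.setMyLastApplied(OpTime(firstAttemptTs));    // step 1: first attempt applies ops
    coord.resetMyLastOpTimes();                        // step 2: reset misses the snapshot
    return coord.canReadOplogEntry(OpTime(seedTs));    // steps 3 to 6: read the seed entry
}
```

In this model, a seed entry with a timestamp later than anything applied in the first attempt is not readable, because the snapshot still points at the first attempt's last applied entry.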

There are two solutions to this:
1. Fix reconstructPreparedTransactions to explicitly set read source as kNoTimestamp so both the transactions table read and the oplog read would be untimestamped.
2. Fix ReplicationCoordinatorImpl::_setMyLastAppliedOpTimeAndWallTime to reset localSnapshot properly even if the given OpTime is 0 (i.e., moving the if statement to after updateLocalSnapshot).

And I think maybe we should do both.
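
Both fixes can be sketched in a toy model. This is a rough illustration under stated assumptions, not the committed change: ToyState, setMyLastAppliedFixed, ScopedReadSource and canReadOplogEntry are hypothetical names, and the model assumes that a null local snapshot means "read without a timestamp".

```cpp
#include <cstdint>

// Toy stand-in for an OpTime; a timestamp of 0 plays the role of the null OpTime.
struct OpTime {
    std::uint64_t ts = 0;
    OpTime() = default;
    explicit OpTime(std::uint64_t t) : ts(t) {}
    bool isNull() const { return ts == 0; }
};

enum class ReadSource { kNoTimestamp, kLastApplied };

// Hypothetical bundle of the state involved; not a real server type.
struct ToyState {
    OpTime lastApplied;
    OpTime localSnapshot;
    ReadSource readSource = ReadSource::kLastApplied;
};

// Sketch of solution 2: update the snapshot first, then do the null check.
void setMyLastAppliedFixed(ToyState& s, OpTime opTime) {
    s.lastApplied = opTime;
    s.localSnapshot = opTime;  // now also runs when opTime is null
    if (opTime.isNull()) {
        return;
    }
    // Work that only applies to non-null OpTimes would go here.
}

// Sketch of solution 1: an RAII guard that pins the read source for its
// lifetime and restores the previous one on destruction.
class ScopedReadSource {
public:
    ScopedReadSource(ToyState& s, ReadSource rs) : _s(s), _saved(s.readSource) {
        s.readSource = rs;
    }
    ~ScopedReadSource() { _s.readSource = _saved; }
    ScopedReadSource(const ScopedReadSource&) = delete;
    ScopedReadSource& operator=(const ScopedReadSource&) = delete;

private:
    ToyState& _s;
    ReadSource _saved;
};

// Untimestamped reads see every entry; timestamped reads are bounded by the
// snapshot. Assumption of this toy: a null snapshot behaves as untimestamped.
bool canReadOplogEntry(const ToyState& s, OpTime entry) {
    if (s.readSource == ReadSource::kNoTimestamp || s.localSnapshot.isNull()) {
        return true;
    }
    return entry.ts <= s.localSnapshot.ts;
}
```

In this sketch either fix alone makes the seed entry readable: the first by bypassing the snapshot for the duration of the read, the second by ensuring the snapshot no longer holds a value from the failed attempt.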



 Comments   
Comment by Githook User [ 21/Jun/19 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-41779: reconstructPreparedTransactions should use readSource kNoTimestamp

(cherry picked from commit f1dcaea4a97903fa7c785f31c55f484d275a5aed)
Branch: v4.2
https://github.com/mongodb/mongo/commit/a24ee3ffddee6c04d80aaf8b9ac7dc20d13a807a

Comment by Githook User [ 20/Jun/19 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-41779: reconstructPreparedTransactions should use readSource kNoTimestamp
Branch: master
https://github.com/mongodb/mongo/commit/f1dcaea4a97903fa7c785f31c55f484d275a5aed

Generated at Thu Feb 08 04:58:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.