[SERVER-46721] Step up may cause reads at PIT with holes after yielding Created: 09/Mar/20 Updated: 29/Oct/23 Resolved: 12/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc7, 4.2.9, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Geert Bosch | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v4.4, v4.2, v4.0
|
||||||||||||||||||||||||||||||||||||
| Sprint: | Execution Team 2020-03-23, Execution Team 2020-04-06, Execution Team 2020-05-04, Execution Team 2020-05-18 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 0 | ||||||||||||||||||||||||||||||||||||
| Description |
|
When we acquire collection locks with AutoGetCollectionForRead on a secondary, we set the TimestampReadSource on the recovery unit to kLastApplied if we were a secondary at the time. However, when we re-acquire locks after a yield in PlanExecutor, we use the same TimestampReadSource we had when we originally acquired those locks, even we've become a primary since. This is bad for the oplog specifically, because on a primary, the last applied time is not oplog-hole-free, meaning we can read past oplog holes. More in general, we should not read at a PIT if there are still write transactions that may commit at that PIT. Suggested fix: Have secondary reads use the kNoOverlap read source instead of kLastApplied, which is std::min(*lastApplied, allDurable). This is the same as kLastApplied while a node is remains secondary, but is equivalent to kAllDurableSnapshot after transition to primary. |
| Comments |
| Comment by Githook User [ 06/Aug/20 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}Message: The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp (cherry picked from commit 25c694f365db0f07a445bd17b6cd5cbf32f5f2f9)
After yielding and reacquiring locks in a query, the preconditions that were used to select our (cherry picked from commit b3a5b5258a487006f0487b9e7e0a1d0d4f1119ff)
This partially reverts work to use the kNoOverlap ReadSource on secondaries since the all_durable (cherry picked from commit ff92d4435fa75c2b947c49cffaf48805b320a5ae) |
| Comment by Louis Williams [ 09/Jun/20 ] |
|
Due to a small performance regression, I have partially reverted this work in |
| Comment by Githook User [ 19/May/20 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}Message: The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp (cherry picked from commit 25c694f365db0f07a445bd17b6cd5cbf32f5f2f9) |
| Comment by Louis Williams [ 12/May/20 ] |
|
I am declining the 4.0 backport. We have not observed any build failures on the 4.0 branch. In addition, it does not have the kNoOverlap ReadSource code on which this patch depends. Without having observed any problems, it seems very risky to attempt to implement new behavior and be confident in its correctness on a relatively stable version. |
| Comment by Githook User [ 12/May/20 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}Message: The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp |