[SERVER-46721] Step up may cause reads at PIT with holes after yielding Created: 09/Mar/20  Updated: 29/Oct/23  Resolved: 12/May/20

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: None
Fix Version/s: 4.4.0-rc7, 4.2.9, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Geert Bosch Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-44529 Re-acquiring locks after a yield and ... Closed
Problem/Incident
Related
related to SERVER-49781 Setting lastApplied before startup re... Closed
is related to SERVER-48518 Rollback via refetch (EMRC = false) c... Closed
is related to SERVER-47084 Support AutoGetCollectionForRead-like... Closed
is related to SERVER-48475 Secondary reads always calculate all_... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2, v4.0
Sprint: Execution Team 2020-03-23, Execution Team 2020-04-06, Execution Team 2020-05-04, Execution Team 2020-05-18
Participants:
Case:
Linked BF Score: 0

 Description   

When we acquire collection locks with AutoGetCollectionForRead on a secondary, we set the TimestampReadSource on the recovery unit to kLastApplied if we were a secondary at the time. However, when we re-acquire locks after a yield in PlanExecutor, we use the same TimestampReadSource we had when we originally acquired those locks, even we've become a primary since. This is bad for the oplog specifically, because on a primary, the last applied time is not oplog-hole-free, meaning we can read past oplog holes. More in general, we should not read at a PIT if there are still write transactions that may commit at that PIT.

Suggested fix: Have secondary reads use the kNoOverlap read source instead of kLastApplied, which is std::min(*lastApplied, allDurable). This is the same as kLastApplied while a node is remains secondary, but is equivalent to kAllDurableSnapshot after transition to primary.



 Comments   
Comment by Githook User [ 06/Aug/20 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-46721 Secondary readers should read at the no-overlap time instead of lastApplied

The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp
and WiredTiger's all_durable time. This time is independent of replication state and ensures
queries do not see oplog holes after state transitions from secondary to primary.

(cherry picked from commit 25c694f365db0f07a445bd17b6cd5cbf32f5f2f9)
(cherry picked from commit 5ddf9db5635e4af2d1295bd2f1007e86d517c6c2)

SERVER-44529 Query yield recovery after a stepdown should switch to read at the no-overlap time

After yielding and reacquiring locks in a query, the preconditions that were used to select our
ReadSource initially need to be checked again. Queries hold an AutoGetCollectionForRead RAII
lock for their lifetime, which may select a ReadSource based on state (e.g. replication
state). After a query yields its locks, this state may have changed, invalidating our current
choice of ReadSource.

(cherry picked from commit b3a5b5258a487006f0487b9e7e0a1d0d4f1119ff)
(cherry picked from commit ce57ddce19e642401ec1b56871b3a9402116ccfc)

SERVER-48475 Reimplement lastApplied for secondary reads

This partially reverts work to use the kNoOverlap ReadSource on secondaries since the all_durable
calculation is unnecessary and expensive.

(cherry picked from commit ff92d4435fa75c2b947c49cffaf48805b320a5ae)
(cherry picked from commit 4e3f3b96826772baa777b7563b38574c1e11c2e4)
Branch: v4.2
https://github.com/mongodb/mongo/commit/caef6619556ddeb20adf8e2bad651b96d2b3ea70

Comment by Louis Williams [ 09/Jun/20 ]

Due to a small performance regression, I have partially reverted this work in SERVER-48475. Secondary reads no longer use kNoOverlap, but use kLastApplied like they did before. Since the no-overlap point on secondaries is always equal to lastApplied, we were performing unnecessary all_durable calculations for every read.

Comment by Githook User [ 19/May/20 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-46721 Secondary readers should read at the no-overlap time instead of lastApplied

The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp
and WiredTiger's all_durable time. This time is independent of replication state and ensures
queries do not see oplog holes after state transitions from secondary to primary.

(cherry picked from commit 25c694f365db0f07a445bd17b6cd5cbf32f5f2f9)
Branch: v4.4
https://github.com/mongodb/mongo/commit/5ddf9db5635e4af2d1295bd2f1007e86d517c6c2

Comment by Louis Williams [ 12/May/20 ]

I am declining the 4.0 backport. We have not observed any build failures on the 4.0 branch. In addition, it does not have the kNoOverlap ReadSource code on which this patch depends. Without having observed any problems, it seems very risky to attempt to implement new behavior and be confident in its correctness on a relatively stable version.

Comment by Githook User [ 12/May/20 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-46721 Secondary readers should read at the no-overlap time instead of lastApplied

The no-overlap time, ReadSource::kNoOverlap, is the minimum of replication's lastApplied timestamp
and WiredTiger's all_durable time. This time is independent of replication state and ensures
queries do not see oplog holes after state transitions from secondary to primary.
Branch: master
https://github.com/mongodb/mongo/commit/25c694f365db0f07a445bd17b6cd5cbf32f5f2f9

Generated at Thu Feb 08 05:12:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.