Type: Bug
Resolution: Fixed
Priority: Minor - P4
Fix Version/s: WT11.0.0, 6.1.0-rc0
Affects Version/s: None
Component/s: None
Labels: None
Sprint: None
Story Points: None

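Even after ~~WT-9367~~ there is still another race that can occur when a checkpoint cursor is opened while a checkpoint is finishing. The steps are as follows, where "<-" marks an action by the running checkpoint and "->" marks a read by the thread opening the cursor: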

1. <- ds checkpoint completes
2. read ds metadata ->
3. read hs metadata ->
4. <- hs checkpoint completes
5. <- write stable
6. <- write oldest
7. <- write snapshot
8. read snapshot ->
9. read stable ->
10. read oldest ->

...which is, I'm afraid, much like the ~~WT-9367~~ issue, except that it involves the history store tree instead of the stable timestamp.

For those following along at home, the reason this is hard is that there are five things we have to read, all atomically, and there isn't anything we can usefully or correctly lock to get at them all at once. On top of that, any or all of them besides the snapshot might be skipped over and not actually updated by the running checkpoint. Without further input, the above scenario can't be distinguished from a correct run in which the history store checkpoint was skipped.
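To make the indistinguishability concrete, here is a small replay of the ten steps in Python. It is only a toy model, not WiredTiger code: the item names, the `replay` helper, and the idea of tagging each item with the number of the checkpoint that last wrote it are all assumptions made for illustration.

```python
# Toy model (hypothetical, not WiredTiger code): every item is tagged with the
# number of the checkpoint that last wrote it.
ITEMS = ("ds", "hs", "stable", "oldest", "snapshot")


def replay(schedule, skipped=()):
    """Run a schedule of ("w", item) / ("r", item) steps.

    All writes come from checkpoint 2; items listed in `skipped` are left
    untouched by it. Returns (what the reader saw, final on-disk state).
    """
    disk = {item: 1 for item in ITEMS}  # state left behind by checkpoint 1
    seen = {}
    for op, item in schedule:
        if op == "w":
            if item not in skipped:
                disk[item] = 2
        else:
            seen[item] = disk[item]
    return seen, disk


def newer_than_snapshot(seen):
    """The existing check: did we read anything newer than the snapshot?"""
    return any(seen[item] > seen["snapshot"] for item in ITEMS)


# Steps 1-10 from the description, in order.
RACY = [
    ("w", "ds"), ("r", "ds"), ("r", "hs"), ("w", "hs"),
    ("w", "stable"), ("w", "oldest"), ("w", "snapshot"),
    ("r", "snapshot"), ("r", "stable"), ("r", "oldest"),
]

# A correct run: checkpoint 2 finishes (skipping hs) before any read happens.
QUIET = [("w", item) for item in ITEMS] + [
    ("r", item) for item in ("ds", "hs", "snapshot", "stable", "oldest")
]

racy_seen, racy_disk = replay(RACY)
quiet_seen, quiet_disk = replay(QUIET, skipped=("hs",))

print("reader views identical:", racy_seen == quiet_seen)
print("existing check fires:", newer_than_snapshot(racy_seen))
print("racy reader has stale hs:", racy_seen["hs"] != racy_disk["hs"])
```

In this model the reader ends up with the same five values in both runs, and the existing check stays quiet, yet only in the racy run is the history store value out of date.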

I think the solution is to read the snapshot twice (first and last, around everything else) and retry if the checkpoint wall time associated with it isn't the same both times, while keeping the current logic that checks whether any of the elements are newer than the snapshot. That way, if a concurrently running checkpoint updates some of the items we read but not the snapshot, we'll see they're newer and retry; and if it also updates the snapshot, the two snapshot times won't match. So if there is such a checkpoint (one that didn't finish and update the snapshot before we started), it can't update any of the items before we read them without triggering a retry.
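The proposed retry rule can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the actual fix: `read_consistent`, `ScriptedStore`, and the representation of every item as a single checkpoint wall time are hypothetical names and simplifications chosen for the example.

```python
ITEMS = ("ds", "hs", "stable", "oldest")


def read_consistent(read, max_tries=10):
    """Read all items with the snapshot read first and last.

    `read(name)` is assumed to return the wall time of the checkpoint that
    last wrote `name`. Retries when either check from the proposal fails.
    """
    for _ in range(max_tries):
        snap_first = read("snapshot")
        seen = {name: read(name) for name in ITEMS}
        snap_last = read("snapshot")
        if snap_first != snap_last:
            continue  # a checkpoint updated the snapshot while we were reading
        if any(t > snap_last for t in seen.values()):
            continue  # an item was updated, but the snapshot not yet
        seen["snapshot"] = snap_last
        return seen
    raise RuntimeError("no consistent view after %d tries" % max_tries)


class ScriptedStore:
    """Test double that applies checkpoint writes just before chosen reads."""

    def __init__(self, writes_before_read):
        self.disk = {name: 1 for name in ITEMS + ("snapshot",)}
        # Write order of the (hypothetical) second checkpoint.
        self.pending = ["ds", "hs", "stable", "oldest", "snapshot"]
        self.writes_before_read = writes_before_read  # {read index: write count}
        self.reads = 0

    def read(self, name):
        for _ in range(self.writes_before_read.get(self.reads, 0)):
            self.disk[self.pending.pop(0)] = 2
        self.reads += 1
        return self.disk[name]


# The race from the description: ds is written before we read it, and the
# rest of the checkpoint lands between our hs read and our stable read.
store = ScriptedStore({1: 1, 3: 4})
view = read_consistent(store.read)
print(view, "after", store.reads, "reads")
```

In this scripted run the first pass sees different snapshot times at the start and end, so it is thrown away, and the second pass returns a view taken entirely from the newer checkpoint. The sketch only illustrates the two retry conditions; it says nothing about whether the argument holds for every interleaving.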

However, I'm not yet completely convinced this is correct; the previous couple of versions also had plausible correctness arguments that turned out to contain holes.

Unfortunately, the only way these problems manifest is with rare mismatches in format-mirror, so testing doesn't produce large amounts of confidence either...

Assignee: Keith Bostic (Inactive)
Reporter: David Holland
Votes: 0
Watchers: 4

Created: Jun 15 2022 12:08:02 AM UTC
Updated: Oct 29 2023 04:39:26 PM UTC
Resolved: Jun 16 2022 02:25:25 AM UTC
