Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: WT11.0.0, 6.0.0-rc0
Affects Version/s: None
Component/s: None
Labels:
- bug-classification-activity-phase-2
- group-d

Sprint:
None
Story Points:
None

Backport Requested:

v5.3

        /* Ignore prepared updates if this is a checkpoint. */
        if (upd->prepare_state == WT_PREPARE_LOCKED ||
          upd->prepare_state == WT_PREPARE_INPROGRESS) {
            WT_ASSERT(session, upd_select->upd == NULL || upd_select->upd->txnid == upd->txnid);
            if (F_ISSET(r, WT_REC_CHECKPOINT)) {
                has_newer_updates = true;
                if (upd->start_ts > max_ts)
                    max_ts = upd->start_ts;

                /*
                 * Track the oldest update not on the page, used to decide whether reads can use the
                 * page image, hence using the start rather than the durable timestamp.
                 */
                if (upd->start_ts < r->min_skipped_ts)
                    r->min_skipped_ts = upd->start_ts;
                continue;
            } else {
                /*
                 * For prepared updates written to the data store in salvage, we write the same
                 * prepared value to the data store. If there is still content for that key left in
                 * the history store, rollback to stable will bring it back to the data store.
                 * Otherwise, it removes the key.
                 */
                WT_ASSERT(session,
                  F_ISSET(r, WT_REC_EVICT) ||
                    (F_ISSET(r, WT_REC_VISIBILITY_ERR) &&
                      F_ISSET(upd, WT_UPDATE_PREPARE_RESTORED_FROM_DS)));
                WT_ASSERT(session, upd->prepare_state == WT_PREPARE_INPROGRESS);
            }

With the current implementation, checkpoint may see partially resolved prepared updates on the same key and write them to disk.

The detailed scenario is as follows:

Suppose we have the update chain U_prepared2@10 -> U_prepared1@10.

A checkpoint starts.

We commit the prepared transaction and resolve U_prepared2 to U_committed@11_durable@12.

A context switch happens, leaving U_committed@11_durable@12 -> U_prepared1@10 on the update chain.

Checkpoint reaches the page, sees U_committed@11_durable@12, and decides to write it to the disk image.

Checkpoint then sees U_prepared1@10 and sets has_newer_updates to true, but never unsets the update already selected for the disk image (U_committed@11_durable@12).

In this case, we write U_committed@11_durable@12 to the data store and U_prepared1@10 to the history store, which is wrong.
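The race can be reproduced in a small standalone model. The sketch below is not WiredTiger code: every `toy_` type and function is a hypothetical stand-in, the chain gets an extra older committed update from another transaction for illustration, and the "forgetting" variant is only one possible way to address the problem (in the spirit of the related WT-11195), not necessarily the fix that was committed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the prepare states; not the real WT_PREPARE_* values. */
enum toy_prepare_state { TOY_PREPARE_INPROGRESS, TOY_PREPARE_LOCKED, TOY_PREPARE_RESOLVED };

struct toy_update {
    uint64_t txnid;
    uint64_t start_ts;
    uint64_t durable_ts;
    enum toy_prepare_state prepare_state;
    struct toy_update *next; /* Next older update on the chain. */
};

struct toy_select {
    const struct toy_update *upd; /* Update chosen for the disk image. */
    bool has_newer_updates;
};

static bool
toy_is_prepared(const struct toy_update *upd)
{
    return (upd->prepare_state == TOY_PREPARE_LOCKED ||
      upd->prepare_state == TOY_PREPARE_INPROGRESS);
}

/* Models the reported behaviour: a selected update is never unset. */
void
toy_select_buggy(const struct toy_update *head, struct toy_select *sel)
{
    const struct toy_update *upd;

    sel->upd = NULL;
    sel->has_newer_updates = false;
    for (upd = head; upd != NULL; upd = upd->next) {
        if (toy_is_prepared(upd)) {
            sel->has_newer_updates = true;
            continue;
        }
        if (sel->upd == NULL)
            sel->upd = upd;
    }
}

/* One possible remedy (an assumption): forget a selection from the same transaction. */
void
toy_select_forgetting(const struct toy_update *head, struct toy_select *sel)
{
    const struct toy_update *upd;

    sel->upd = NULL;
    sel->has_newer_updates = false;
    for (upd = head; upd != NULL; upd = upd->next) {
        if (toy_is_prepared(upd)) {
            sel->has_newer_updates = true;
            /* The selected update belongs to a partially resolved transaction. */
            if (sel->upd != NULL && sel->upd->txnid == upd->txnid)
                sel->upd = NULL;
            continue;
        }
        if (sel->upd == NULL)
            sel->upd = upd;
    }
}

/*
 * Build U_committed@11_durable@12 -> U_prepared1@10 -> U_base@5, where the first two updates
 * share a transaction and U_base is an older committed update added for illustration.
 */
const struct toy_update *
toy_scenario_chain(struct toy_update chain[3])
{
    chain[0] = (struct toy_update){7, 11, 12, TOY_PREPARE_RESOLVED, &chain[1]};
    chain[1] = (struct toy_update){7, 10, 10, TOY_PREPARE_INPROGRESS, &chain[2]};
    chain[2] = (struct toy_update){3, 5, 5, TOY_PREPARE_RESOLVED, NULL};
    return (&chain[0]);
}
```

Calling `toy_select_buggy` on the chain from `toy_scenario_chain` keeps the half-resolved commit selected, while `toy_select_forgetting` drops it and falls back to the older committed update; both set `has_newer_updates`.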

causes

WT-11186 Restore ignore_prepare semantics to read with read_committed isolation instead of read_uncommitted

Closed

related to

WT-11195 Investigate allowing the rec update select loop to forget a seen update

Closed

Assignee: Keith Bostic (Inactive)
Reporter: Chenhao Qu
Votes: 0
Watchers: 12

Created: Oct 07 2020 11:53:52 PM UTC
Updated: Oct 29 2023 04:42:53 PM UTC
Resolved: Mar 24 2022 11:32:38 PM UTC
