WiredTiger / WT-7771

Checkpoint snapshot is not honoured in __wt_rec_upd_select

    • 13
    • 2023-05-16 Chook-n-Nuts Farm

      /*
       * Check whether the update was committed before reconciliation started. The global commit
       * point can move forward during reconciliation so we use a cached copy to avoid races when
       * a concurrent transaction commits or rolls back while we are examining its updates. This
       * check is not required for history store updates as they are implicitly committed. As
       * prepared transaction IDs are globally visible, need to check the update state as well.
       *
       * If an earlier reconciliation chose this update (it is marked as being destined for the
       * data store), we should select it regardless of visibility if we haven't already selected
       * one. This is important as it is never ok to shift the on-disk value backwards in the
       * update chain.
       *
       * Also, if an earlier reconciliation performed an update-restore eviction and this update
       * was restored from disk, we can select this update irrespective of visibility. This
       * scenario can happen if the current reconciliation has a limited visibility of updates
       * compared to one of the previous reconciliations.
       */
      if (!F_ISSET(upd,
            WT_UPDATE_DS | WT_UPDATE_PREPARE_RESTORED_FROM_DS | WT_UPDATE_RESTORED_FROM_DS) &&
        !is_hs_page &&
        (F_ISSET(r, WT_REC_VISIBLE_ALL) ? WT_TXNID_LE(r->last_running, txnid) :
                                          !__txn_visible_id(session, txnid)))
      

      In __wt_rec_upd_select, we skip the checkpoint snapshot check when the update has any of the WT_UPDATE_DS, WT_UPDATE_PREPARE_RESTORED_FROM_DS, or WT_UPDATE_RESTORED_FROM_DS flags set. This violates the checkpoint's snapshot: data that the checkpoint cannot see is written to disk. That is acceptable when the update's timestamp is later than the stable timestamp, because rollback to stable will remove it. However, if the update has no timestamp, or its timestamp is less than or equal to the stable timestamp, the checkpoint is no longer consistent.
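The cases described above can be laid out as a small decision function. This is only a sketch of the reasoning, not WiredTiger code: every `ex_`/`EX_` name, the struct layout, and the flag values are hypothetical stand-ins, and the sketch deliberately ignores the `is_hs_page` and `WT_REC_VISIBLE_ALL` branches of the real condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not WiredTiger's real definitions or values. */
#define EX_UPDATE_DS 0x1u
#define EX_UPDATE_PREPARE_RESTORED_FROM_DS 0x2u
#define EX_UPDATE_RESTORED_FROM_DS 0x4u
#define EX_TS_NONE 0u /* assumed marker for "update has no timestamp" */

struct ex_update {
    uint32_t flags;
    uint64_t start_ts;        /* EX_TS_NONE if the update carries no timestamp */
    bool visible_to_snapshot; /* would the checkpoint's snapshot see this update? */
};

/* True if any of the three flags is set, i.e. the visibility check is bypassed. */
static bool
ex_visibility_check_bypassed(const struct ex_update *upd)
{
    return ((upd->flags & (EX_UPDATE_DS | EX_UPDATE_PREPARE_RESTORED_FROM_DS |
                            EX_UPDATE_RESTORED_FROM_DS)) != 0);
}

/*
 * True if selecting this update would leave the checkpoint inconsistent in the
 * sense described in this ticket.
 */
static bool
ex_breaks_checkpoint(const struct ex_update *upd, uint64_t stable_ts)
{
    /* Without a bypass flag the visibility check still applies. */
    if (!ex_visibility_check_bypassed(upd))
        return (false);

    /* A visible update is fine to write regardless of the bypass. */
    if (upd->visible_to_snapshot)
        return (false);

    /* Later than stable: rollback to stable removes it, so this is tolerated. */
    if (upd->start_ts != EX_TS_NONE && upd->start_ts > stable_ts)
        return (false);

    /* No timestamp, or timestamp <= stable: invisible data survives in the checkpoint. */
    return (true);
}
```

Under these assumptions, an invisible update flagged `EX_UPDATE_DS` with no timestamp is reported as breaking the checkpoint for any stable timestamp, while the same update with a timestamp of 20 and a stable timestamp of 10 is not.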

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            chenhao.qu@mongodb.com Chenhao Qu