WiredTiger / WT-6637

Log recovery is nontimestamped and can overwrite some of the records in the checkpoint

    • Storage - Ra 2020-09-21

      With durable history, a checkpoint aims to preserve a window of data from the oldest timestamp to the stable timestamp, rather than only a snapshot of the data at the stable timestamp.

      However, logging in WiredTiger still aims to preserve only a snapshot as of the last successful commit.
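      To make the difference between a window and a snapshot concrete, here is a minimal sketch in Python. It is not WiredTiger code: the `TimeWindow` class and the visibility rule (visible from the start timestamp inclusive to the stop timestamp exclusive) are assumptions made for illustration, and the timestamps 20, 21 and 22 are the ones from the failing sequence in this ticket.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeWindow:
    """Hypothetical stand-in for the time window stored with a value."""
    start_ts: int            # timestamp of the insert
    stop_ts: Optional[int]   # timestamp of the delete, None while still live

    def visible_at(self, read_ts: int) -> bool:
        # Assumed rule: visible for start_ts <= read_ts < stop_ts.
        if read_ts < self.start_ts:
            return False
        return self.stop_ts is None or read_ts < self.stop_ts


# Key inserted at timestamp 20 and deleted at timestamp 21.
window = TimeWindow(start_ts=20, stop_ts=21)

# A snapshot at the stable timestamp (22) only answers one question.
snapshot_view = window.visible_at(22)

# A window from oldest (10) to stable (22) must answer all of these.
history_view = {ts: window.visible_at(ts) for ts in (10, 20, 21, 22)}
```

      Under these assumptions the key is absent from the snapshot at timestamp 22, yet a reader at timestamp 20 should still see it, so the checkpoint has to keep the (20, 21) window on disk.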

      This creates issues when recovering logged tables with timestamps.

      The following sequence occurs when the logged-table case fails in the wt6616-checkpoint-oldest-ts test written for WT-6616:

      Checkpoint starts with txnid 10

      We insert a key at timestamp 20 with txnid 11

      We delete the key at timestamp 21 with txnid 12

      Checkpoint refreshes its snapshot and gets its stable timestamp 22 and oldest timestamp 10

      Checkpoint writes the key with time window (20, 21) to the disk

      Checkpoint finishes

      We crash and restart the database to run recovery

      Recovery replays transactions from transaction id 11 without timestamps

      Therefore, the key is removed from the database, because transaction 12 deletes it without a timestamp.

      After recovery, we can no longer find the key.

      I think the root problem is that logging doesn't preserve any timestamp information, so all data deleted after the oldest timestamp is lost.
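      The sequence above can be sketched as a small Python simulation. This is a toy model under stated assumptions, not WiredTiger's recovery code: the `Table` and `LogRecord` shapes, the helper names, and the use of timestamp 0 to stand for "no timestamp" are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# key -> (value, start_ts, stop_ts); stop_ts is None while the key is live.
Table = Dict[str, Tuple[str, int, Optional[int]]]
# (txnid, operation, key, value); deliberately no timestamp field.
LogRecord = Tuple[int, str, str, Optional[str]]


def visible_at(table: Table, key: str, read_ts: int) -> bool:
    """Assumed rule: visible for start_ts <= read_ts < stop_ts."""
    if key not in table:
        return False
    _, start_ts, stop_ts = table[key]
    return start_ts <= read_ts and (stop_ts is None or read_ts < stop_ts)


def replay_without_timestamps(
    checkpoint: Table, log: List[LogRecord], first_txnid: int
) -> Table:
    """Replay log records on top of the checkpoint, ignoring timestamps."""
    table = dict(checkpoint)
    for txnid, op, key, value in log:
        if txnid < first_txnid:
            continue  # assumed to be covered by the checkpoint already
        if op == "insert":
            # Timestamp 0 models "no timestamp" in this sketch.
            table[key] = (value or "", 0, None)
        elif op == "remove":
            # A non-timestamped delete drops the key and its history.
            table.pop(key, None)
    return table


# The checkpoint wrote the key with time window (20, 21).
checkpoint: Table = {"key": ("value", 20, 21)}
# The log holds txnid 11 (insert) and txnid 12 (delete), without timestamps.
log: List[LogRecord] = [(11, "insert", "key", "value"), (12, "remove", "key", None)]

recovered = replay_without_timestamps(checkpoint, log, first_txnid=11)
before = visible_at(checkpoint, "key", 20)  # read at timestamp 20 before the crash
after = visible_at(recovered, "key", 20)    # same read after recovery
```

      In this model the read at timestamp 20 succeeds against the checkpoint but fails after replay, which matches the reported symptom: the non-timestamped delete from transaction 12 overwrites the (20, 21) window that the checkpoint had preserved.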

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            chenhao.qu@mongodb.com Chenhao Qu
            Votes:
            0
            Watchers:
            10
