WiredTiger / WT-7999

Fix the assert to handle an update in the middle with max stop timestamp

    • 3
    • Storage - Ra 2021-09-06

      The assert at __rollback_ondisk_fixup_key, 455: hs_stop_durable_ts <= newer_hs_durable_ts || hs_start_ts == hs_stop_durable_ts || hs_start_ts == newer_hs_durable_ts || first_record fires during recovery in test/checkpoint.

      The core dump gives:

      (gdb) p hs_stop_durable_ts
      $1 = 18446744073709551615
      (gdb) p unpack.tw
      $2 = {durable_start_ts = 1074, start_ts = 1074, start_txn = 20762, durable_stop_ts = 1074, stop_ts = 1074, stop_txn = 20762, prepare = 0 '\000'}
      (gdb) p newer_hs_durable_ts
      $3 = 1074

      newer_hs_durable_ts (1074) has the same timestamp as the on-page value, while hs_stop_durable_ts is 18446744073709551615, the maximum 64-bit timestamp value.
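To make the failure concrete, here is a minimal standalone sketch of the asserted condition evaluated with the values from the core dump. The function names are hypothetical and not part of WiredTiger. The dump does not show hs_start_ts or first_record, so the value 1000 for hs_start_ts and first_record being false are assumptions chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: mirrors the boolean expression quoted in the
 * assert message. It is not the WiredTiger implementation.
 */
bool
fixup_assert_holds(uint64_t hs_stop_durable_ts, uint64_t hs_start_ts,
    uint64_t newer_hs_durable_ts, bool first_record)
{
    return (hs_stop_durable_ts <= newer_hs_durable_ts ||
        hs_start_ts == hs_stop_durable_ts ||
        hs_start_ts == newer_hs_durable_ts || first_record);
}

/*
 * Evaluate the condition with the core dump values:
 *   hs_stop_durable_ts  = 18446744073709551615 (UINT64_MAX)
 *   newer_hs_durable_ts = 1074
 * assumed_hs_start_ts is an assumption; the dump does not show it.
 * first_record is assumed false, since the assert fired.
 */
bool
core_dump_case_holds(uint64_t assumed_hs_start_ts)
{
    return (fixup_assert_holds(UINT64_MAX, assumed_hs_start_ts, 1074, false));
}
```

Under these assumptions, core_dump_case_holds(1000) returns false, which is the state in which the assert fires. If the stop timestamp had already been fixed to 1074, the first clause alone would make the condition true.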

      The root cause is a race between committing a prepared update and checkpoint.

      The sequence is:

      A user thread commits a prepared update.
      It marks the update as resolved.
      Another user thread adds an update to the key.
      Checkpoint writes the new update to the data store and the just-committed prepared update to the history store.
      Checkpoint then checkpoints the history store while the update older than the prepared update still has a max stop timestamp.
      The max stop timestamp is fixed only after checkpoint has visited that history store page.
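The ordering problem in the sequence above can be illustrated with a toy model. This is a sketch under stated assumptions, not WiredTiger code: the struct, the function names, and the timestamps 1000 and 1074 are hypothetical, and the model assumes the fix replaces the max stop timestamp with the commit timestamp of the prepared update.

```c
#include <stdint.h>

#define TOY_TS_MAX UINT64_MAX /* stands in for the max stop timestamp */

/* Toy history store record for the update older than the prepared update. */
struct toy_hs_record {
    uint64_t start_ts;
    uint64_t stop_durable_ts;
};

/* Toy checkpoint: persists whatever the record looks like when visited. */
struct toy_hs_record
toy_checkpoint_visit(const struct toy_hs_record *live)
{
    return (*live);
}

/* Toy fixup: replaces a max stop timestamp (assumed: with the commit timestamp). */
void
toy_fix_max_stop(struct toy_hs_record *live, uint64_t commit_ts)
{
    if (live->stop_durable_ts == TOY_TS_MAX)
        live->stop_durable_ts = commit_ts;
}

/* Racy order from the report: checkpoint visits the page before the fixup. */
struct toy_hs_record
toy_checkpoint_before_fixup(uint64_t older_start_ts, uint64_t commit_ts)
{
    struct toy_hs_record live = {older_start_ts, TOY_TS_MAX};
    struct toy_hs_record persisted = toy_checkpoint_visit(&live);

    toy_fix_max_stop(&live, commit_ts); /* too late for the checkpoint */
    return (persisted);
}

/* Order without the race: the fixup lands before checkpoint visits the page. */
struct toy_hs_record
toy_fixup_before_checkpoint(uint64_t older_start_ts, uint64_t commit_ts)
{
    struct toy_hs_record live = {older_start_ts, TOY_TS_MAX};

    toy_fix_max_stop(&live, commit_ts);
    return (toy_checkpoint_visit(&live));
}
```

In this model, toy_checkpoint_before_fixup(1000, 1074) persists a record whose stop timestamp is still the max value, which is the kind of record recovery later sees in the middle of the history store chain. toy_fixup_before_checkpoint(1000, 1074) persists a stop timestamp of 1074.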

            haribabu.kommi@mongodb.com Haribabu Kommi
            chenhao.qu@mongodb.com Chenhao Qu