Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-7020

Assert the previous update timestamp of a key

    XMLWordPrintable

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT10.0.0, 4.9.0, 4.4.4
    • Component/s: None
    • Labels:
      None
    • Story Points:
      3
    • Sprint:
      Storage - Ra 2021-01-25

      Description

      As part of WT-6861, we added an assert that looked like this:

              WT_ASSERT(
                session, upd->durable_ts >= upd->start_ts || upd->prepare_state == WT_PREPARE_INPROGRESS);
      

      This was getting hit frequently in test/format. If I add some testing code, I can see this in the debugger:

      (lldb) f 3
      frame #3: 0x00000000005adb51 t`__wt_txn_update_check(session=0x0000000000931620, cbt=0x00007fca5c144c10, upd=0x00007fca000b4150, prev_tsp=0x00007fca92ffcb98) at txn_inline.h:1301
         1298	         */
         1299	        wt_timestamp_t dts = upd->durable_ts;
         1300	        wt_timestamp_t sts = upd->start_ts;
      -> 1301	        WT_ASSERT(session, dts >= sts || upd->prepare_state == WT_PREPARE_INPROGRESS);
         1302	        /* WT_ASSERT( */
         1303	        /*   session, upd->durable_ts >= upd->start_ts || upd->prepare_state == WT_PREPARE_INPROGRESS); */
         1304	        *prev_tsp = upd->durable_ts;
      (lldb) p dts
      (wt_timestamp_t) $0 = 0
      (lldb) p sts
      (wt_timestamp_t) $1 = 192360
      (lldb) p *upd
      (WT_UPDATE) $2 = {
        txnid = 133636
        durable_ts = 192360
        start_ts = 192360
        prev_durable_ts = 0
        next = 0x00007fca0806cb00
        size = 32
        type = '\x03'
        prepare_state = '\0'
        flags = '\0'
        data = {}
      }
      

      So it seems that when upd->durable_ts got copied, it was 0 but soon after, it gets written over to be identical to the upd->start_ts. Once an update gets published to the chain, it shouldn't be getting written to like this so I'm thinking it may be a missing write barrier somewhere.

      It's not hard to reproduce but here's a sample configuration:

      ############################################
      #  RUN PARAMETERS: V2
      ############################################
      assert.commit_timestamp=0
      assert.read_timestamp=0
      backup=0
      backup.incremental=off
      backup.incr_granularity=4603
      btree.bitcnt=5
      btree.compression=none
      btree.dictionary=0
      btree.huffman_value=0
      btree.internal_key_truncation=1
      btree.internal_page_max=10
      btree.key_gap=1
      btree.key_max=114
      btree.key_min=21
      btree.leaf_page_max=15
      btree.memory_page_max=5
      btree.prefix_compression=1
      btree.prefix_compression_min=5
      btree.repeat_data_pct=67
      btree.reverse=0
      btree.split_pct=85
      btree.value_max=2623
      btree.value_min=16
      cache=320
      cache.evict_max=5
      cache.minimum=20
      checkpoint=on
      checkpoint.log_size=109
      checkpoint.wait=95
      disk.checksum=uncompressed
      disk.data_extend=1
      disk.direct_io=1
      disk.encryption=none
      disk.firstfit=0
      disk.mmap=1
      disk.mmap_all=0
      format.abort=0
      format.independent_thread_rng=0
      format.major_timeout=0
      logging=0
      logging.archive=0
      logging.compression=none
      logging.file_max=183984
      logging.prealloc=0
      lsm.auto_throttle=1
      lsm.bloom=1
      lsm.bloom_bit_count=48
      lsm.bloom_hash_count=19
      lsm.bloom_oldest=0
      lsm.chunk_size=6
      lsm.merge_max=19
      lsm.worker_threads=3
      ops.alter=0
      ops.compaction=0
      ops.hs_cursor=0
      ops.pct.delete=2
      ops.pct.insert=92
      ops.pct.modify=2
      ops.pct.read=0
      ops.pct.write=4
      ops.prepare=0
      ops.random_cursor=0
      ops.salvage=0
      ops.truncate=1
      ops.verify=1
      quiet=1
      runs=1
      runs.in_memory=0
      runs.ops=0
      runs.rows=2600000
      runs.source=table
      runs.threads=32
      runs.timer=6
      runs.type=row-store
      runs.verify_failure_dump=0
      statistics=0
      statistics.server=0
      stress.aggressive_sweep=0
      stress.checkpoint=0
      stress.checkpoint_prepare=0
      stress.hs_checkpoint_delay=0
      stress.hs_search=0
      stress.hs_sweep=0
      stress.split_1=0
      stress.split_2=1
      stress.split_3=0
      stress.split_4=0
      stress.split_5=0
      stress.split_6=0
      stress.split_7=0
      stress.split_8=0
      transaction.frequency=100
      transaction.isolation=snapshot
      transaction.rollback_to_stable=0
      transaction.timestamps=1
      wiredtiger.config=
      wiredtiger.rwlock=1
      wiredtiger.leak_memory=0
      ############################################
      

      One thing to note is that ops.prepare=0. There is some suspect looking code in the prepare code path but that shouldn't be a factor here.

        Attachments

          Activity

            People

            Assignee:
            haribabu.kommi Haribabu Kommi
            Reporter:
            alex.cameron Alex Cameron (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved: