WiredTiger / WT-1307

LSM read checksum panic

    • Type: Task
    • Resolution: Done
    • WT2.5.1
    • Affects Version/s: None
    • Component/s: None
    • Labels:

      While attempting to reproduce WT-1306 on the AWS SSD box, my run hit this error:

      t, file:wt-000049.lsm, lsm-worker: RUNDIR/wt-000049.lsm: No such file or directory
      t, file:WiredTiger.wt, lsm-worker: read checksum error [4096B @ 24576, 3103498970 != 4049241959]
      t, file:WiredTiger.wt, lsm-worker: WiredTiger.wt: encountered an illegal file format or internal value
      t, file:WiredTiger.wt, lsm-worker: aborting WiredTiger library
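
When triaging, it can help to pull the numbers out of the checksum line so the block can be located in the file. The following is a minimal sketch, assuming the message format is exactly the one shown in the log above. The names `first` and `second` are hypothetical labels: the log does not say which value was stored in the block and which was computed on read.

```python
import re

# Pattern for the checksum line as it appears in the log above.
# Group names are illustrative labels, not WiredTiger terminology.
CKSUM_RE = re.compile(
    r"read checksum error \[(?P<size>\d+)B @ (?P<offset>\d+), "
    r"(?P<first>\d+) != (?P<second>\d+)\]"
)

def parse_checksum_error(line):
    """Return size, offset and both checksums as ints, or None if no match."""
    m = CKSUM_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

line = ("t, file:WiredTiger.wt, lsm-worker: read checksum error "
        "[4096B @ 24576, 3103498970 != 4049241959]")
info = parse_checksum_error(line)

# Size, byte offset, and the offset expressed in units of the block size.
print(info["size"], info["offset"], info["offset"] // info["size"])
# The two mismatched checksums in hex, for comparison against a hex dump.
print(hex(info["first"]), hex(info["second"]))
```

Here the offset is an exact multiple of the reported size, so the failing read is the 4096-byte block starting at byte 24576 of WiredTiger.wt.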
      

      Here's the stack. I built with clang so lots of stuff appears optimized out:

      Thread 3 (Thread 0x7ff0b09ef700 (LWP 15449)):
      #0  0x00007ff0b884b993 in select () from /lib64/libc.so.6
      #1  0x0000000000628716 in __wt_sleep (seconds=<optimized out>, micro_seconds=<optimized out>) at ../src/os_posix/os_sleep.c:22
      #2  0x0000000000706124 in __wt_attach (session=<optimized out>) at ../src/support/global.c:113
      #3  0x0000000000b1b7de in __wt_abort (session=<optimized out>) at ../src/os_posix/os_abort.c:21
      #4  0x00000000007032bf in __wt_illegal_value (session=<optimized out>, name=<optimized out>) at ../src/support/err.c:492
      #5  0x0000000000cd63b4 in __wt_block_read_off (session=<optimized out>, block=<optimized out>, buf=<optimized out>, offset=<optimized out>,
          size=<optimized out>, cksum=<optimized out>) at ../src/block/block_read.c:210
      #6  0x0000000000d24225 in __wt_block_extlist_read (session=<optimized out>, block=<optimized out>, el=<optimized out>, ckpt_size=<optimized out>)
          at ../src/block/block_ext.c:1140
      #7  0x0000000000d01288 in __ckpt_extlist_read (session=<optimized out>, block=<optimized out>, ckpt=<optimized out>) at ../src/block/block_ckpt.c:284
      #8  0x0000000000cfbef5 in __ckpt_process (session=<optimized out>, block=<optimized out>, ckptbase=<optimized out>) at ../src/block/block_ckpt.c:444
      #9  0x0000000000cfa732 in __wt_block_checkpoint (session=<optimized out>, block=<optimized out>, buf=<optimized out>, ckptbase=<optimized out>,
          data_cksum=<optimized out>) at ../src/block/block_ckpt.c:250
      #10 0x0000000000cd057a in __bm_checkpoint (bm=<optimized out>, session=<optimized out>, buf=<optimized out>, ckptbase=<optimized out>,
          data_cksum=<optimized out>) at ../src/block/block_mgr.c:65
      #11 0x0000000000c31582 in __wt_bt_write (session=<optimized out>, buf=<optimized out>, addr=<optimized out>, addr_sizep=<optimized out>,
          checkpoint=<optimized out>, compressed=<optimized out>) at ../src/btree/bt_io.c:292
      #12 0x00000000008cacf2 in __rec_write_wrapup (session=<optimized out>, r=<optimized out>, page=<optimized out>) at ../src/btree/rec_write.c:4800
      #13 0x00000000008a92ce in __wt_rec_write (session=<optimized out>, ref=<optimized out>, salvage=<optimized out>, flags=<optimized out>)
          at ../src/btree/rec_write.c:418
      #14 0x0000000000834037 in __sync_file (session=<optimized out>, syncop=<optimized out>) at ../src/btree/bt_sync.c:135
      #15 0x0000000000831ebd in __wt_cache_op (session=<optimized out>, ckptbase=<optimized out>, op=<optimized out>) at ../src/btree/bt_sync.c:344
      #16 0x0000000000746355 in __checkpoint_worker (session=<optimized out>, cfg=<optimized out>, is_checkpoint=<optimized out>) at ../src/txn/txn_ckpt.c:832
      #17 0x0000000000740e54 in __wt_checkpoint (session=<optimized out>, cfg=<optimized out>) at ../src/txn/txn_ckpt.c:897
      #18 0x00000000005f00eb in __wt_meta_track_off (session=<optimized out>, unroll=<optimized out>) at ../src/meta/meta_track.c:222
      #19 0x000000000067eb5b in __wt_schema_drop (session=<optimized out>, uri=<optimized out>, cfg=<optimized out>) at ../src/schema/schema_drop.c:201
      #20 0x0000000000aea68f in __wt_lsm_merge (session=<optimized out>, lsm_tree=<optimized out>, id=<optimized out>) at ../src/lsm/lsm_merge.c:468
      #21 0x00000000005e3b60 in __lsm_worker (arg=<optimized out>) at ../src/lsm/lsm_worker.c:137
      #22 0x000000000046a294 in ThreadStart () at /home/sue/llvm/projects/compiler-rt/lib/asan/asan_thread.cc:167
      #23 0x00007ff0b8f37f18 in start_thread () from /lib64/libpthread.so.0
      #24 0x00007ff0b8852b9d in clone () from /lib64/libc.so.6
      

      No other threads are doing anything interesting. Here's the CONFIG:

      ############################################
      auto_throttle=1
      firstfit=0
      bitcnt=8
      bloom=1
      bloom_bit_count=24
      bloom_hash_count=6
      bloom_oldest=0
      cache=240
      checkpoints=1
      checksum=uncompressed
      chunk_size=8
      compaction=1
      compression=none
      data_extend=0
      data_source=lsm
      delete_pct=41
      dictionary=0
      evict_max=2
      file_type=row-store
      backups=0
      huffman_key=0
      huffman_value=0
      insert_pct=85
      internal_key_truncation=1
      internal_page_max=9
      isolation=random
      key_gap=9
      key_max=256
      key_min=256
      leak_memory=0
      leaf_page_max=9
      logging=0
      lsm_worker_threads=3
      merge_max=19
      mmap=1
      ops=100000
      prefix_compression=1
      prefix_compression_min=8
      repeat_data_pct=2
      reverse=0
      rows=100000
      runs=1000
      split_pct=52
      statistics=1
      threads=9
      value_max=3294
      value_min=256
      wiredtiger_config=
      write_pct=15
      ############################################
      

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            sue.loverso@mongodb.com Susan LoVerso