Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-4715

Workloads will stall if old transaction or timestamp pinned by thread co-opted for eviction

    • 8
    • Storage Engines 2019-09-09, Storage Engines 2019-09-23, Storage Engines 2019-10-07

      We encountered a cache stuck with clean page failure in the test/format stress sanitiser job. Please note timestamp was disabled in the test. We had WT-4608 and WT-3694 created to track LSM cache stuck clean failures. This case is likely the 1st time we see the btree cache stuck with clean pages failure.

      http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress-sanitizer/23878/

      [1554809849:456697][2330:0x7fb095f0b700], t, eviction-server: __evict_server, 451: Cache stuck for too long, giving up: Connection timed out
      t: process 2330
      0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x7fb09f590488:transaction state dump
      0x7fb09f590488:current ID: 4647
      0x7fb09f590488:last running ID: 4647
      0x7fb09f590488:metadata_pinned ID: 4370
      0x7fb09f590488:oldest ID: 4370
      0x7fb09f590488:commit timestamp: (0,0)
      0x7fb09f590488:oldest timestamp: (0,0)
      0x7fb09f590488:pinned timestamp: (0,0)
      0x7fb09f590488:stable timestamp: (0,0)
      0x7fb09f590488:has_commit_timestamp: no
      0x7fb09f590488:has_oldest_timestamp: no
      0x7fb09f590488:has_pinned_timestamp: no
      0x7fb09f590488:has_stable_timestamp: no
      0x7fb09f590488:oldest_is_pinned: no
      0x7fb09f590488:stable_is_pinned: no
      0x7fb09f590488:checkpoint running: no
      0x7fb09f590488:checkpoint generation: 18
      0x7fb09f590488:checkpoint pinned ID: 0
      0x7fb09f590488:checkpoint txn ID: 0
      0x7fb09f590488:oldest named snapshot ID: 0
      0x7fb09f590488:session count: 36
      0x7fb09f590488:Transaction state of active sessions:
      0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev
      0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4495, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT
      0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev
      0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
      0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
      0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4504, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
      0x7fb09f590488:ID: 0, pinned ID: 4370, metadata pinned ID: 0, name: WT_CURSOR.search
      0x7fb09f590488:mod count: 0, snap min: 4370, snap max: 4375, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT
      0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
      0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
      0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
      0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
      0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x7fb09f590488:cache dump
      0x7fb09f590488:cache full: no
      0x7fb09f590488:cache clean check: yes (97.548%)
      0x7fb09f590488:cache dirty check: no (0.091%)
      0x7fb09f590488:file:wt.wt(checkpoint=WiredTigerCheckpoint.1):
      0x7fb09f590488:internal: 3 pages, 0MB, 3/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:file:wt.wt(<live>):
      0x7fb09f590488:internal: 64 pages, 24MB, 64/0 clean/dirty pages, 24/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:leaf: 4 pages, 0MB, 4/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:file:WiredTigerLAS.wt(<live>):
      0x7fb09f590488:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:leaf: 0 pages
      0x7fb09f590488:file:WiredTiger.wt(<live>):
      0x7fb09f590488:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:leaf: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x7fb09f590488:cache dump: total found: 27MB vs tracked inuse 27MB
      0x7fb09f590488:total dirty bytes: 0MB
      [1554809849:462413][2330:0x7fb095f0b700], t, eviction-server: __wt_evict_thread_run, 321: cache eviction thread error: Connection timed out
      [1554809849:462425][2330:0x7fb095f0b700], t, eviction-server: __wt_panic, 520: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1554809849:462430][2330:0x7fb095f0b700], t, eviction-server: __wt_abort, 30: aborting WiredTiger library
      /tmp/jenkins5083747742990501913.sh: line 24:  2330 Aborted                 nice ./t -1 -c ../../../test/format/CONFIG.stress
      + cleanup
      + status=134 

      The configuration:

      ############################################
      #  RUN PARAMETERS
      ############################################
      abort=0
      alter=0
      auto_throttle=1
      backups=1
      bitcnt=8
      bloom=1
      bloom_bit_count=4
      bloom_hash_count=32
      bloom_oldest=0
      cache=28
      cache_minimum=20
      checkpoints=on
      checkpoint_log_size=44
      checkpoint_wait=100
      checksum=uncompressed
      chunk_size=4
      compaction=0
      compression=none
      data_extend=0
      data_source=table
      delete_pct=37
      dictionary=0
      direct_io=0
      encryption=rotn-7
      evict_max=0
      file_type=row-store
      firstfit=0
      huffman_key=0
      huffman_value=0
      independent_thread_rng=1
      in_memory=0
      insert_pct=0
      internal_key_truncation=1
      internal_page_max=16
      isolation=snapshot
      key_gap=16
      key_max=79
      key_min=18
      leaf_page_max=9
      leak_memory=0
      logging=1
      logging_archive=1
      logging_compression=zlib
      logging_file_max=416851
      logging_prealloc=1
      long_running_txn=0
      lsm_worker_threads=4
      memory_page_max=1
      merge_max=10
      mmap=1
      modify_pct=17
      ops=0
      prefix_compression=1
      prefix_compression_min=7
      prepare=0
      quiet=1
      read_pct=45
      rebalance=1
      repeat_data_pct=40
      reverse=0
      rows=1000000
      runs=1
      salvage=1
      split_pct=75
      statistics=0
      statistics_server=1
      threads=14
      timer=4
      timing_stress_aggressive_sweep=0
      timing_stress_checkpoint=0
      timing_stress_lookaside_sweep=0
      timing_stress_split_1=0
      timing_stress_split_2=0
      timing_stress_split_3=0
      timing_stress_split_4=0
      timing_stress_split_5=0
      timing_stress_split_6=0
      timing_stress_split_7=0
      timing_stress_split_8=0
      transaction_timestamps=0
      transaction-frequency=94
      truncate=1
      value_max=1148
      value_min=17
      verify=1
      wiredtiger_config=
      write_pct=1
      ############################################ 

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            luke.chen@mongodb.com Luke Chen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: