Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-3799

Test/format with timestamps enabled pin cache full

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 3.6.3, 3.7.2, WT3.1.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Storage 2017-12-18, Storage 2018-01-01, Storage 2018-01-15

      We are seeing a lot of failed stress test runs at the moment because enabling timestamps in test/format often results in the cache being pinned full.

      We need to investigate whether we can reasonably avoid the cache being pinned full, and if not update test/format to avoid encountering the condition.

      An example of a failure:

      [1512735814:255349][22179:0x7f3b7d224700], t, eviction-server: Cache stuck for too long, giving up: Connection timed out
      0xe341c0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0xe341c0:transaction state dump
      0xe341c0:current ID: 72453
      0xe341c0:last running ID: 72453
      0xe341c0:oldest ID: 72453
      0xe341c0:commit timestamp: aed7c
      0xe341c0:oldest timestamp: ac902
      0xe341c0:pinned timestamp: ac902
      0xe341c0:stable timestamp: 0
      0xe341c0:has_commit_timestamp: yes
      0xe341c0:has_oldest_timestamp: yes
      0xe341c0:has_pinned_timestamp: yes
      0xe341c0:has_stable_timestamp: no
      0xe341c0:oldest_is_pinned: yes
      0xe341c0:stable_is_pinned: no
      0xe341c0:checkpoint running: no
      0xe341c0:checkpoint generation: 228
      0xe341c0:checkpoint pinned ID: 0
      0xe341c0:checkpoint txn ID: 0
      0xe341c0:oldest named snapshot ID: 0
      0xe341c0:session count: 35
      0xe341c0:Transaction state of active sessions:
      0xe341c0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0xe341c0:cache dump
      0xe341c0:cache full: no
      0xe341c0:cache clean check: no (0%)
      0xe341c0:cache dirty check: yes (50%)
      0xe341c0:file:wt(<live>):
      0xe341c0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0xe341c0:leaf: 310 pages, 15MB, 168/142 clean/dirty pages, 9/6 clean/dirty MB, 0MB max page, 0MB max dirty page
      0xe341c0:file:WiredTigerLAS.wt(<live>):
      0xe341c0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0xe341c0:file:WiredTiger.wt(<live>):
      0xe341c0:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0xe341c0:leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0xe341c0:cache dump: total found: 17MB vs tracked inuse 18MB
      0xe341c0:total dirty bytes: 6MB
      [1512735814:255734][22179:0x7f3b7d224700], t, eviction-server: cache eviction thread error: Connection timed out
      [1512735814:255741][22179:0x7f3b7d224700], t, eviction-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1512735814:255746][22179:0x7f3b7d224700], t, eviction-server: aborting WiredTiger library
      

      The configuration:

      ############################################
      #  RUN PARAMETERS
      ############################################
      abort=0
      alter=0
      auto_throttle=1
      backups=0
      bitcnt=7
      bloom=1
      bloom_bit_count=27
      bloom_hash_count=29
      bloom_oldest=0
      cache=37
      cache_minimum=20
      checkpoints=on
      checkpoint_log_size=59
      checkpoint_wait=94
      checksum=uncompressed
      chunk_size=10
      compaction=0
      compression=none
      data_extend=0
      data_source=file
      delete_pct=7
      dictionary=0
      direct_io=0
      encryption=none
      evict_max=5
      file_type=row-store
      firstfit=0
      huffman_key=0
      huffman_value=0
      independent_thread_rng=1
      in_memory=0
      insert_pct=12
      internal_key_truncation=1
      internal_page_max=16
      isolation=random
      key_gap=6
      key_max=115
      key_min=21
      leaf_page_max=17
      leak_memory=0
      logging=0
      logging_archive=0
      logging_compression=none
      logging_file_max=411512
      logging_prealloc=1
      long_running_txn=0
      lsm_worker_threads=3
      merge_max=6
      mmap=1
      modify_pct=63
      ops=100000
      prefix_compression=1
      prefix_compression_min=3
      quiet=1
      read_pct=5
      rebalance=1
      repeat_data_pct=38
      reverse=0
      rows=100000
      runs=100
      salvage=1
      split_pct=82
      statistics=0
      statistics_server=0
      threads=22
      timer=360
      transaction_timestamps=1
      transaction-frequency=82
      value_max=926
      value_min=19
      verify=1
      wiredtiger_config=
      write_pct=13
      ############################################
      

      The original failure:
      http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress/59888

            Assignee:
            michael.cahill@mongodb.com Michael Cahill (Inactive)
            Reporter:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: