Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-3524

Test format likely to pin cache full when enabling timestamps

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 3.6.0-rc0, WT3.0.0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Storage 2017-09-11, Storage 2017-10-02

      The introduction of working timestamps in WiredTiger and support in test/format means that auto-generated format configuration files are occasionally pinning the cache full, meaning some configurations timeout and report an error. For example:

         3: table, row-store (9 seconds)
      [1503103466:145589][40163:0x3fff7a7ff1b0], t, eviction-server: Cache stuck for too long, giving up: Connection timed out
      0x10035609ed0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x10035609ed0:transaction state dump
      0x10035609ed0:current ID: 19
      0x10035609ed0:last running ID: 19
      0x10035609ed0:oldest ID: 19
      0x10035609ed0:commit timestamp: b
      0x10035609ed0:oldest timestamp: 0
      0x10035609ed0:pinned timestamp: 0
      0x10035609ed0:stable timestamp: 0
      0x10035609ed0:has_commit_timestamp: yes
      0x10035609ed0:has_oldest_timestamp: no
      0x10035609ed0:has_pinned_timestamp: no
      0x10035609ed0:has_stable_timestamp: no
      0x10035609ed0:oldest_is_pinned: no
      0x10035609ed0:stable_is_pinned: no
      0x10035609ed0:checkpoint running: no
      0x10035609ed0:checkpoint generation: 1
      0x10035609ed0:checkpoint pinned ID: 0
      0x10035609ed0:checkpoint txn ID: 0
      0x10035609ed0:oldest named snapshot ID: 0
      0x10035609ed0:session count: 32
      0x10035609ed0:Transaction state of active sessions:
      0x10035609ed0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x10035609ed0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x10035609ed0:cache dump
      0x10035609ed0:file:wt(<live>):
      0x10035609ed0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x10035609ed0:leaf: 40 pages, 13MB, 34/6 clean/dirty pages, 11/2 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x10035609ed0:file:WiredTigerLAS.wt(<live>):
      0x10035609ed0:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x10035609ed0:file:WiredTiger.wt(<live>):
      0x10035609ed0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
      0x10035609ed0:cache dump: total found: 14MB vs tracked inuse 18MB
      0x10035609ed0:total dirty bytes: 2MB
      0x10035609ed0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      [1503103466:146028][40163:0x3fff7a7ff1b0], t, eviction-server: cache eviction thread error: Connection timed out
      [1503103466:146033][40163:0x3fff7a7ff1b0], t, eviction-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1503103466:146040][40163:0x3fff7a7ff1b0], t, eviction-server: aborting WiredTiger library
      

      From the configuration file:

      ############################################
      #  RUN PARAMETERS
      ############################################
      abort=0
      alter=0
      auto_throttle=0
      backups=0
      bitcnt=5
      bloom=1
      bloom_bit_count=50
      bloom_hash_count=26
      bloom_oldest=0
      cache=31
      checkpoints=1
      checksum=uncompressed
      chunk_size=7
      compaction=0
      compatibility=none
      compression=zlib
      data_extend=0
      data_source=file
      delete_pct=2
      dictionary=0
      direct_io=0
      encryption=none
      evict_max=0
      file_type=row-store
      firstfit=0
      huffman_key=0
      huffman_value=0
      in_memory=0
      insert_pct=6
      internal_key_truncation=1
      internal_page_max=16
      isolation=snapshot
      key_gap=16
      key_max=106
      key_min=23
      leaf_page_max=16
      leak_memory=0
      logging=1
      logging_archive=1
      logging_compression=none
      logging_prealloc=0
      long_running_txn=0
      lsm_worker_threads=3
      merge_max=14
      mmap=1
      modify_pct=79
      ops=100000
      prefix_compression=1
      prefix_compression_min=2
      quiet=1
      read_pct=1
      rebalance=1
      repeat_data_pct=32
      reverse=0
      rows=100000
      runs=100
      salvage=1
      split_pct=59
      statistics=1
      statistics_server=0
      threads=17
      timer=20
      transaction_timestamps=1
      transaction-frequency=49
      value_max=4063
      value_min=19
      verify=1
      wiredtiger_config=
      write_pct=12
      ############################################
      

      From the failure

      We should either make test/format adjust it's timestamp usage if the cache becomes full to move oldest forward and unpin content (hard I suspect), or ensure that auto-generated configurations have a reasonable setting for cache size.

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            alexander.gorrod@mongodb.com Alexander Gorrod
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: