[test/format] Cache stuck: dirty eviction (21.9%)

    • Type: Build Failure
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      format-stress-test-3 on amazon2023-stress-tests-arm64

      Host: i-01d1dfe89f633c09d
      Project: wiredtiger
      Commit: 06aef1d1
      Please refer to the BF(G) Playbook for instructions on handling BF and BFG tickets, as well as the Auto-Resolution Rules.

      Task Logs:

      format-stress-test-3 task_log

      Logs:

      [392/663] Building C object test/csuite/CMakeFiles/test_wt16990_disagg_checkpoint_panic.dir/wt16990_disagg_checkpoint_panic/main.c.o
      [412/663] Linking CXX executable test/csuite/wt16990_disagg_checkpoint_panic/test_wt16990_disagg_checkpoint_panic
      


          [1778764670:838384][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_server, 278: Cache stuck for too long, giving up: Connection timed out
      
      


          0x7260bfeafe80:transaction state dump
          0x7260bfeafe80:current ID: 2322281
          0x7260bfeafe80:last running ID: 2322281
          0x7260bfeafe80:metadata_pinned ID: 2322281
          0x7260bfeafe80:oldest ID: 2322281
          0x7260bfeafe80:durable timestamp: (0, 1922704)
          0x7260bfeafe80:oldest timestamp: (0, 1916285)
          0x7260bfeafe80:pinned timestamp: (0, 1916285)
          0x7260bfeafe80:stable timestamp: (0, 1919951)
          0x7260bfeafe80:stable disaggregated schema epoch: (0, 0)
          0x7260bfeafe80:has_durable_timestamp: yes
          0x7260bfeafe80:has_oldest_timestamp: yes
          0x7260bfeafe80:has_pinned_timestamp: yes
          0x7260bfeafe80:has_stable_timestamp: yes
          0x7260bfeafe80:has_stable_disaggregated_schema_epoch: no
          0x7260bfeafe80:oldest_is_pinned: yes
          0x7260bfeafe80:stable_is_pinned: no
          0x7260bfeafe80:checkpoint running: no
          0x7260bfeafe80:checkpoint generation: 9
          0x7260bfeafe80:checkpoint pinned ID: 0
          0x7260bfeafe80:checkpoint txn ID: 0
          0x7260bfeafe80:session count: 17
          0x7260bfeafe80:Transaction state of active sessions:
          0x7260bfeafe80:session ID: 16, txn ID: 0, pinned ID: 2322281, metadata pinned ID: 0, name: WT_CURSOR.search
          0x7260bfeafe80:transaction id: 0, mod count: 0, snap min: 2322281, snap max: 2322281, snapshot count: 0, snapshot: [], commit_timestamp: (0, 0), durable_timestamp: (0, 0), first_commit_timestamp: (0, 0), prepare_timestamp: (0, 0), prepared id: 0, pinned_durable_timestamp: (0, 0), read_timestamp: (0, 1919951), checkpoint LSN: [0,0], full checkpoint: false, flags: 0x00000a04, isolation: WT_ISO_SNAPSHOT, last saved error code: 0, last saved sub-level error code: -32000, last saved error message: last API call was successful
          0x7260bfeafe80:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
          0x7260bfeafe80:cache dump
          0x7260bfeafe80:cache full: no
          0x7260bfeafe80:cache clean check: no (21.873%)
          0x7260bfeafe80:cache dirty check: yes (21.857%)
          0x7260bfeafe80:cache updates check: no (5.479%)
          0x7260bfeafe80:file:T00003.wt(<live>):
          0x7260bfeafe80:internal: 39 pages, 133.12 KB, 3/36 clean/dirty pages, 8.55/124.57 clean / dirty KB, 4.16 KB max page, 3.64 KB max dirty page
          0x7260bfeafe80:leaf: 553 pages, 183115.83 KB, 0/553 clean/dirty pages, 0.00 /183115.83 /52262.64 clean/dirty/updates KB, 770.91 KB max page, 770.91 KB max dirty page
          0x7260bfeafe80:file:T00002.wt(<live>):
          0x7260bfeafe80:internal: 1 pages, 58.85 KB, 0/1 clean/dirty pages, 0.00/58.85 clean / dirty KB, 58.85 KB max page, 58.85 KB max dirty page
          0x7260bfeafe80:leaf: 324 pages, 186475.30 KB, 0/324 clean/dirty pages, 0.00 /186475.30 /58467.61 clean/dirty/updates KB, 827.71 KB max page, 827.71 KB max dirty page
          0x7260bfeafe80:file:T00001.wt(<live>):
          0x7260bfeafe80:internal: 29 pages, 270.06 KB, 1/28 clean/dirty pages, 6.44/263.62 clean / dirty KB, 16.50 KB max page, 16.50 KB max dirty page
          0x7260bfeafe80:leaf: 945 pages, 265733.74 KB, 1/944 clean/dirty pages, 0.34 /265733.40 /48755.91 clean/dirty/updates KB, 742.90 KB max page, 742.90 KB max dirty page
          0x7260bfeafe80:file:WiredTigerHS.wt(<live>):
          0x7260bfeafe80:internal: 1 pages, 0.90 KB, 0/1 clean/dirty pages, 0.00/0.90 clean / dirty KB, 0.90 KB max page, 0.90 KB max dirty page
          0x7260bfeafe80:leaf: 0 pages
          0x7260bfeafe80:file:WiredTiger.wt(<live>):
          0x7260bfeafe80:internal: 1 pages, 0.77 KB, 1/0 clean/dirty pages, 0.77/0.00 clean / dirty KB, 0.77 KB max page, 0.00 KB max dirty page
          0x7260bfeafe80:leaf: 1 pages, 22.12 KB, 1/0 clean/dirty pages, 22.12 /0.00 /14.09 clean/dirty/updates KB, 22.12 KB max page, 0.00 KB max dirty page
          0x7260bfeafe80:cache dump: total found: 670.58 MB vs tracked inuse 622.18 MB
          0x7260bfeafe80:total dirty bytes: 620.87 MB vs tracked dirty 622.14 MB
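
The per-file dirty figures in the cache dump above can be cross-checked against the reported "total dirty bytes". The following is a minimal sketch, not part of WiredTiger: the helper name `total_dirty_kb` and the two regular expressions are illustrative and assume the exact line layouts seen in this dump. The lines are copied from the dump with the session-address prefix dropped, and the conversion assumes 1 MB = 1024 KB, which reproduces the reported value.

```python
import re

# Line shapes assumed from this dump (hypothetical patterns, not a WiredTiger API):
#   internal: ... 8.55/124.57 clean / dirty KB
#   leaf:     ... 0.00 /183115.83 /52262.64 clean/dirty/updates KB
INTERNAL_RE = re.compile(r"internal: .*?([\d.]+)\s*/\s*([\d.]+) clean / dirty KB")
LEAF_RE = re.compile(
    r"leaf: .*?([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+) clean/dirty/updates KB"
)

def total_dirty_kb(lines):
    """Sum the dirty KB column over every internal and leaf line."""
    total = 0.0
    for line in lines:
        match = LEAF_RE.search(line) or INTERNAL_RE.search(line)
        if match:
            total += float(match.group(2))  # group 2 is the dirty KB value
    return total

# Cache dump lines from the ticket, session-address prefix removed.
DUMP = """\
file:T00003.wt(<live>):
internal: 39 pages, 133.12 KB, 3/36 clean/dirty pages, 8.55/124.57 clean / dirty KB, 4.16 KB max page, 3.64 KB max dirty page
leaf: 553 pages, 183115.83 KB, 0/553 clean/dirty pages, 0.00 /183115.83 /52262.64 clean/dirty/updates KB, 770.91 KB max page, 770.91 KB max dirty page
file:T00002.wt(<live>):
internal: 1 pages, 58.85 KB, 0/1 clean/dirty pages, 0.00/58.85 clean / dirty KB, 58.85 KB max page, 58.85 KB max dirty page
leaf: 324 pages, 186475.30 KB, 0/324 clean/dirty pages, 0.00 /186475.30 /58467.61 clean/dirty/updates KB, 827.71 KB max page, 827.71 KB max dirty page
file:T00001.wt(<live>):
internal: 29 pages, 270.06 KB, 1/28 clean/dirty pages, 6.44/263.62 clean / dirty KB, 16.50 KB max page, 16.50 KB max dirty page
leaf: 945 pages, 265733.74 KB, 1/944 clean/dirty pages, 0.34 /265733.40 /48755.91 clean/dirty/updates KB, 742.90 KB max page, 742.90 KB max dirty page
file:WiredTigerHS.wt(<live>):
internal: 1 pages, 0.90 KB, 0/1 clean/dirty pages, 0.00/0.90 clean / dirty KB, 0.90 KB max page, 0.90 KB max dirty page
leaf: 0 pages
file:WiredTiger.wt(<live>):
internal: 1 pages, 0.77 KB, 1/0 clean/dirty pages, 0.77/0.00 clean / dirty KB, 0.77 KB max page, 0.00 KB max dirty page
leaf: 1 pages, 22.12 KB, 1/0 clean/dirty pages, 22.12 /0.00 /14.09 clean/dirty/updates KB, 22.12 KB max page, 0.00 KB max dirty page
"""

dirty_kb = total_dirty_kb(DUMP.splitlines())
print(f"{dirty_kb / 1024:.2f} MB")  # prints "620.87 MB"
```

The summed value agrees with the dump's "total dirty bytes: 620.87 MB", and nearly all of it sits in leaf pages of the three T0000N.wt tables.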
      


          [1778764670:839325][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __wt_btcur_next, 795: Error at src/btree/bt_curnext.c:795: "WT_NOTFOUND" failed: WT_NOTFOUND: item not found
          [1778764670:839331][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __curfile_next, 186: Error at src/cursor/cur_file.c:186: "ret" failed: WT_NOTFOUND: item not found
          [1778764670:839336][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __evict_thread_run, 98: Error at src/evict/evict_thread.c:98: "ret" failed: Connection timed out
          [1778764670:839340][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_thread_run, 121: eviction thread error: Connection timed out
          [1778764670:839342][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_thread_run, 121: the process must exit and restart: WT_PANIC: WiredTiger library panic
          [1778764670:839346][151834:0xffff9496dbc0], t, file:WiredTigerHS.wt, eviction-server: [WT_VERB_DEFAULT][ERROR]: __wt_abort, 29: aborting WiredTiger library
      


          /data/mci/dc7f27e5b6f808d2d5253a03a9d2bc7b/wiredtiger/cmake_build/test/format/../../libwiredtiger.so.12.0.0(__wt_panic_func+0x194)[0xffff99af4b74]
      
      


      test/format run configuration highlights
      


      #0  0x0000ffff996c4454 in __pthread_kill_implementation () from /lib64/libc.so.6
      #1  0x0000ffff9967b320 [PAC] in raise () from /lib64/libc.so.6
      #2  0x0000ffff99662224 [PAC] in abort () from /lib64/libc.so.6
      #3  0x0000ffff99a4a91c [PAC] in __wt_abort (session=session@entry=0x7260bfeafe80) at /data/mci/dc7f27e5b6f808d2d5253a03a9d2bc7b/wiredtiger/src/os_common/os_abort.c:32
      #4  0x0000ffff99af4b74 in __wt_panic_func (session=session@entry=0x7260bfeafe80, error=error@entry=110, func=func@entry=0xffff99bf5120 <__PRETTY_FUNCTION__.6> "__evict_thread_run", line=line@entry=121, category=category@entry=WT_VERB_DEFAULT, fmt=fmt@entry=0xffff99b91a40 "eviction thread error") at /data/mci/dc7f27e5b6f808d2d5253a03a9d2bc7b/wiredtiger/src/support/err.c:633
      #5  0x0000ffff99a02fd4 in __evict_thread_run (session=0x7260bfeafe80, thread=0x7260bfc00f50) at /data/mci/dc7f27e5b6f808d2d5253a03a9d2bc7b/wiredtiger/src/evict/evict_thread.c:121
      #6  0x0000ffff99b10638 in __thread_run (arg=0x7260bfc00f50) at /data/mci/dc7f27e5b6f808d2d5253a03a9d2bc7b/wiredtiger/src/support/thread_group.c:32
      #7  0x0000ffff996c2834 in start_thread () from /lib64/libc.so.6
      #8  0x0000ffff99666e5c [PAC] in thread_start () from /lib64/libc.so.6
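
Frame #4 shows `error=110` being passed to `__wt_panic_func`, and the log messages report "Connection timed out". On Linux, errno 110 is ETIMEDOUT, so the two agree. The sketch below shows one way to decode such a value during triage; the helper name `describe_errno` is hypothetical, and the lookup is done by symbolic name because numeric errno values differ between platforms.

```python
import errno
import os

def describe_errno(code):
    """Return (symbolic name, message) for an OS error number on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

# Look up ETIMEDOUT by name so the example stays portable;
# its numeric value is 110 on Linux but differs on other systems.
name, message = describe_errno(errno.ETIMEDOUT)
print(errno.ETIMEDOUT, name, message)
```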
      


      Repro Artifacts:

            Assignee: [DO NOT USE] Backlog - Storage Engines Team
            Reporter: xgen-buildbaron-user
