(disagg.mode=leader) test/checkpoint cache stuck


    • Type: Build Failure
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Test Format

      checkpoint-test on amazon2023-disagg-msan-stress

      Host: i-0d543c07edcf9a306
      Project: wiredtiger
      Commit: c29133a7
      Please refer to BF(G) Playbook for instructions on handling BF and BFG tickets as well as Auto-Resolution Rules

      Task Logs:

      checkpoint-test task_log

      Logs:

      3/3 Test #27: test_checkpoint_disagg_leader_row_sweep_timestamps ..........Subprocess aborted***Exception: 3585.53 sec
      


      [1759981271:160634][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_server(WT_SESSION_IMPL *, _Bool *), 541: Cache stuck for too long, giving up: Connection timed out
      
      


      transaction state dump
      current ID: 34173
      last running ID: 34173
      metadata_pinned ID: 29902
      oldest ID: 34173
      durable timestamp: (0, 75803)
      oldest timestamp: (0, 25225)
      pinned timestamp: (0, 25225)
      stable timestamp: (0, 75802)
      has_durable_timestamp: yes
      has_oldest_timestamp: yes
      has_pinned_timestamp: yes
      has_stable_timestamp: yes
      oldest_is_pinned: yes
      stable_is_pinned: no
      checkpoint running: yes
      checkpoint generation: 5
      checkpoint pinned ID: 29902
      checkpoint txn ID: 29902
      session count: 24
      Transaction state of active sessions:
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      cache dump
      cache full: no
      cache clean check: no (4.368%)
      cache dirty check: no (3.575%)
      cache updates check: yes (2.500%)
      file:__wt0002.wt_stable(<live>):
      internal: 113 pages, 706.08 KB, 36/77 clean/dirty pages, 201.65/504.42 clean / dirty KB, 7.36 KB max page, 7.36 KB max dirty page
      leaf: 2839 pages, 22216.10 KB, 84/2755 clean/dirty pages, 94.23 /22121.87 /18951.41 clean/dirty/updates KB, 21.57 KB max page, 21.57 KB max dirty page
      file:__wt0001.wt_stable(<live>):
      internal: 154 pages, 1112.11 KB, 154/0 clean/dirty pages, 1112.11/0.00 clean / dirty KB, 11.23 KB max page, 0.00 KB max dirty page
      leaf: 5328 pages, 8520.19 KB, 1785/3543 clean/dirty pages, 1798.54 /6721.65 /2883.45 clean/dirty/updates KB, 4.16 KB max page, 4.16 KB max dirty page
      file:__wt0000.wt_stable(<live>):
      internal: 150 pages, 1105.92 KB, 9/141 clean/dirty pages, 54.38/1051.55 clean / dirty KB, 10.85 KB max page, 10.85 KB max dirty page
      leaf: 5083 pages, 7828.34 KB, 1957/3126 clean/dirty pages, 1995.88 /5832.46 /2433.83 clean/dirty/updates KB, 3.73 KB max page, 3.73 KB max dirty page
      file:__wt0002.wt_ingest(<live>) eviction disabled at open:
      internal: 1 pages, 0.50 KB, 1/0 clean/dirty pages, 0.50/0.00 clean / dirty KB, 0.50 KB max page, 0.00 KB max dirty page
      leaf: 1 pages, 0.32 KB, 1/0 clean/dirty pages, 0.32 /0.00 /0.00 clean/dirty/updates KB, 0.32 KB max page, 0.00 KB max dirty page
      file:__wt0001.wt_ingest(<live>) eviction disabled at open:
      internal: 1 pages, 0.50 KB, 1/0 clean/dirty pages, 0.50/0.00 clean / dirty KB, 0.50 KB max page, 0.00 KB max dirty page
      leaf: 1 pages, 0.32 KB, 1/0 clean/dirty pages, 0.32 /0.00 /0.00 clean/dirty/updates KB, 0.32 KB max page, 0.00 KB max dirty page
      file:__wt0000.wt_ingest(<live>) eviction disabled at open:
      internal: 1 pages, 0.50 KB, 1/0 clean/dirty pages, 0.50/0.00 clean / dirty KB, 0.50 KB max page, 0.00 KB max dirty page
      leaf: 1 pages, 0.32 KB, 1/0 clean/dirty pages, 0.32 /0.00 /0.00 clean/dirty/updates KB, 0.32 KB max page, 0.00 KB max dirty page
      file:WiredTigerSharedHS.wt_stable(<live>):
      internal: 1 pages, 10.37 KB, 0/1 clean/dirty pages, 0.00/10.37 clean / dirty KB, 10.37 KB max page, 10.37 KB max dirty page
      leaf: 54 pages, 876.59 KB, 54/0 clean/dirty pages, 876.59 /0.00 /0.00 clean/dirty/updates KB, 25.32 KB max page, 0.00 KB max dirty page
      file:WiredTigerShared.wt_stable(<live>):
      internal: 1 pages, 0.89 KB, 0/1 clean/dirty pages, 0.00/0.89 clean / dirty KB, 0.89 KB max page, 0.89 KB max dirty page
      leaf: 1 pages, 13.67 KB, 0/1 clean/dirty pages, 0.00 /13.67 /5.24 clean/dirty/updates KB, 13.67 KB max page, 13.67 KB max dirty page
      file:WiredTigerHS.wt(<live>) eviction disabled at open:
      internal: 1 pages, 0.40 KB, 1/0 clean/dirty pages, 0.40/0.00 clean / dirty KB, 0.40 KB max page, 0.00 KB max dirty page
      leaf: 0 pages
      file:WiredTiger.wt(<live>):
      internal: 1 pages, 0.77 KB, 0/1 clean/dirty pages, 0.00/0.77 clean / dirty KB, 0.77 KB max page, 0.77 KB max dirty page
      leaf: 1 pages, 18.39 KB, 0/1 clean/dirty pages, 0.00 /18.39 /3.44 clean/dirty/updates KB, 18.39 KB max page, 18.39 KB max dirty page
      cache dump: total found: 44.73 MB vs tracked inuse 41.42 MB
      total dirty bytes: 35.43 MB vs tracked dirty 35.43 MB
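
As a cross-check on the dump above, the per-file dirty KB figures can be summed and compared with the reported "total dirty bytes" value. The following is a minimal sketch, not part of the original report: the `DIRTY_KB` pattern and the `total_dirty_kb` helper are hypothetical names, the input lines are copied from the dump with the trailing max-page fields trimmed, and the conversion assumes 1 MB = 1024 KB.

```python
import re

# Lines copied from the cache dump above, trimmed after the dirty-KB field.
# Entries that report 0.00 dirty KB are omitted because they add nothing.
DUMP_LINES = [
    "internal: 113 pages, 706.08 KB, 36/77 clean/dirty pages, 201.65/504.42 clean / dirty KB",
    "leaf: 2839 pages, 22216.10 KB, 84/2755 clean/dirty pages, 94.23 /22121.87 /18951.41 clean/dirty/updates KB",
    "leaf: 5328 pages, 8520.19 KB, 1785/3543 clean/dirty pages, 1798.54 /6721.65 /2883.45 clean/dirty/updates KB",
    "internal: 150 pages, 1105.92 KB, 9/141 clean/dirty pages, 54.38/1051.55 clean / dirty KB",
    "leaf: 5083 pages, 7828.34 KB, 1957/3126 clean/dirty pages, 1995.88 /5832.46 /2433.83 clean/dirty/updates KB",
    "internal: 1 pages, 10.37 KB, 0/1 clean/dirty pages, 0.00/10.37 clean / dirty KB",
    "internal: 1 pages, 0.89 KB, 0/1 clean/dirty pages, 0.00/0.89 clean / dirty KB",
    "leaf: 1 pages, 13.67 KB, 0/1 clean/dirty pages, 0.00 /13.67 /5.24 clean/dirty/updates KB",
    "internal: 1 pages, 0.77 KB, 0/1 clean/dirty pages, 0.00/0.77 clean / dirty KB",
    "leaf: 1 pages, 18.39 KB, 0/1 clean/dirty pages, 0.00 /18.39 /3.44 clean/dirty/updates KB",
]

# Hypothetical pattern: captures the dirty figure from either
# "<clean>/<dirty> clean / dirty KB" (internal pages) or
# "<clean> /<dirty> /<updates> clean/dirty/updates KB" (leaf pages).
# The "clean/dirty pages" count field is skipped because it does not end in KB.
DIRTY_KB = re.compile(
    r"[\d.]+\s*/\s*([\d.]+)(?:\s*/\s*[\d.]+)?\s+clean\s*/\s*dirty(?:/updates)?\s+KB"
)


def total_dirty_kb(lines):
    """Sum the dirty KB field over all cache dump lines that carry one."""
    total = 0.0
    for line in lines:
        match = DIRTY_KB.search(line)
        if match:
            total += float(match.group(1))
    return total


if __name__ == "__main__":
    kb = total_dirty_kb(DUMP_LINES)
    # Assumes 1 MB = 1024 KB.
    print(f"{kb:.2f} KB = {kb / 1024:.2f} MB")
```

Under those assumptions the per-file figures add up to the 35.43 MB reported as total dirty bytes, so the dump is internally consistent. Most of that total sits in the leaf pages of `__wt0002.wt_stable`.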
      


      [1759981271:208372][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __wt_btcur_next(WT_CURSOR_BTREE *, _Bool), 955: Error at src/btree/bt_curnext.c:955: "WT_NOTFOUND" failed: WT_NOTFOUND: item not found
      [1759981271:208395][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __curfile_next(WT_CURSOR *), 186: Error at src/cursor/cur_file.c:186: "ret" failed: WT_NOTFOUND: item not found
      [1759981271:208406][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 335: Error at src/evict/evict_lru.c:335: "ret" failed: Connection timed out
      [1759981271:208418][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 358: eviction thread error: Connection timed out
      [1759981271:208427][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 358: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1759981271:208438][17677:0xfffe757dd140], test_checkpoint, eviction-server: [WT_VERB_DEFAULT][ERROR]: void __wt_abort(WT_SESSION_IMPL *), 29: aborting WiredTiger library
      


      The following tests FAILED:
      	 27 - test_checkpoint_disagg_leader_row_sweep_timestamps (Subprocess aborted)
      


      #0  0x0000ffff8006e7b4 in __pthread_kill_implementation () from /lib64/libc.so.6
      #1  0x0000ffff800253a0 [PAC] in raise () from /lib64/libc.so.6
      #2  0x0000ffff80011264 [PAC] in abort () from /lib64/libc.so.6
      #3  0x0000ffff81788268 [PAC] in __wt_abort (session=0xffff7f8aed08) at /data/mci/96de5eb187bccab722c2396dfcb314b2/wiredtiger/src/os_common/os_abort.c:32
      #4  0x0000ffff81e9ae60 in __wt_panic_func (session=0xffff7f8aed08, error=110, func=0xffff82276898 "int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *)", line=358, category=WT_VERB_DEFAULT, fmt=0xffff82276901 "eviction thread error") at /data/mci/96de5eb187bccab722c2396dfcb314b2/wiredtiger/src/support/err.c:611
      #5  0x0000ffff81415a9c in __evict_thread_run (session=0xffff7f8aed08, thread=0xe05000000410) at /data/mci/96de5eb187bccab722c2396dfcb314b2/wiredtiger/src/evict/evict_lru.c:358
      #6  0x0000ffff820bc400 in __thread_run (arg=0xe05000000410) at /data/mci/96de5eb187bccab722c2396dfcb314b2/wiredtiger/src/support/thread_group.c:32
      #7  0x0000ffff8006cb78 in start_thread () from /lib64/libc.so.6
      #8  0x0000ffff800d9cdc [PAC] in thread_start () from /lib64/libc.so.6
      


            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            xgen-buildbaron-user
