test/format (disagg.mode=leader, PALM) cache stuck invisible update restoration


      format-stress-data-validation-test-disagg-leader-2 on amazon2023-disagg-stress

      Host: i-06f0b8f2e7cdf0889
      Project: wiredtiger
      Commit: 50db84cd
      Please refer to BF(G) Playbook for instructions on handling BF and BFG tickets as well as Auto-Resolution Rules

      Task Logs:

      format-stress-data-validation-test-disagg-leader-2 task_log

      Logs:

      0x11aebfc91640:transaction state dump
      0x11aebfc91640:current ID: 432190
      0x11aebfc91640:last running ID: 432190
      0x11aebfc91640:metadata_pinned ID: 339660
      0x11aebfc91640:oldest ID: 432190
      0x11aebfc91640:durable timestamp: (0, 649263)
      0x11aebfc91640:oldest timestamp: (0, 635386)
      0x11aebfc91640:pinned timestamp: (0, 635386)
      0x11aebfc91640:stable timestamp: (0, 635386)
      0x11aebfc91640:has_durable_timestamp: yes
      0x11aebfc91640:has_oldest_timestamp: yes
      0x11aebfc91640:has_pinned_timestamp: yes
      0x11aebfc91640:has_stable_timestamp: yes
      0x11aebfc91640:oldest_is_pinned: yes
      0x11aebfc91640:stable_is_pinned: yes
      0x11aebfc91640:checkpoint running: yes
      0x11aebfc91640:checkpoint generation: 5
      0x11aebfc91640:checkpoint pinned ID: 339643
      0x11aebfc91640:checkpoint txn ID: 339660
      0x11aebfc91640:session count: 33
      0x11aebfc91640:Transaction state of active sessions:
      0x11aebfc91640:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      0x11aebfc91640:cache dump
      0x11aebfc91640:cache full: no
      0x11aebfc91640:cache clean check: no (13.686%)
      0x11aebfc91640:cache dirty check: no (13.004%)
      0x11aebfc91640:cache updates check: yes (10.000%)
      0x11aebfc91640:file:T00002.wt_ingest(<live>) eviction disabled at open:
      0x11aebfc91640:internal: 1 pages, 0.50 KB, 1/0 clean/dirty pages, 0.50/0.00 clean / dirty KB, 0.50 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 1 pages, 0.32 KB, 1/0 clean/dirty pages, 0.32 /0.00 /0.00 clean/dirty/updates KB, 0.32 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:file:T00003.wt_ingest(<live>) eviction disabled at open:
      0x11aebfc91640:internal: 1 pages, 0.50 KB, 1/0 clean/dirty pages, 0.50/0.00 clean / dirty KB, 0.50 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 1 pages, 0.32 KB, 1/0 clean/dirty pages, 0.32 /0.00 /0.00 clean/dirty/updates KB, 0.32 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:file:T00001.wt(<live>):
      t: WARNING: table.1 skipped verify because of EBUSY
      [1762653113:791244][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_server, 543: Cache stuck for too long, giving up: Connection timed out
      0x11aebfc91640:internal: 966 pages, 3822.18 KB, 196/770 clean/dirty pages, 693.97/3128.20 clean / dirty KB, 5.65 KB max page, 5.65 KB max dirty page
      0x11aebfc91640:leaf: 16942 pages, 94655.52 KB, 212/16730 clean/dirty pages, 165.93 /94489.59 /80287.25 clean/dirty/updates KB, 1731.63 KB max page, 1731.63 KB max dirty page
      0x11aebfc91640:file:WiredTigerShared.wt_stable(<live>):
      0x11aebfc91640:internal: 1 pages, 0.88 KB, 1/0 clean/dirty pages, 0.88/0.00 clean / dirty KB, 0.88 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 0 pages
      0x11aebfc91640:file:T00003.wt_stable(<live>):
      0x11aebfc91640:internal: 255 pages, 4839.77 KB, 12/243 clean/dirty pages, 227.14/4612.63 clean / dirty KB, 100.84 KB max page, 100.84 KB max dirty page
      0x11aebfc91640:leaf: 25860 pages, 87967.47 KB, 502/25358 clean/dirty pages, 656.50 /87310.97 /61718.97 clean/dirty/updates KB, 1024.15 KB max page, 1024.15 KB max dirty page
      0x11aebfc91640:file:T00002.wt_stable(<live>):
      0x11aebfc91640:internal: 101 pages, 2103.93 KB, 0/101 clean/dirty pages, 0.00/2103.93 clean / dirty KB, 25.61 KB max page, 25.61 KB max dirty page
      0x11aebfc91640:leaf: 11489 pages, 72286.81 KB, 654/10835 clean/dirty pages, 1639.78 /70647.03 /52139.91 clean/dirty/updates KB, 4441.06 KB max page, 4441.06 KB max dirty page
      0x11aebfc91640:file:WiredTigerSharedHS.wt_stable(<live>):
      0x11aebfc91640:internal: 1 pages, 0.83 KB, 1/0 clean/dirty pages, 0.83/0.00 clean / dirty KB, 0.83 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 0 pages
      0x11aebfc91640:file:WiredTigerHS.wt(<live>) eviction disabled at open:
      0x11aebfc91640:internal: 1 pages, 0.40 KB, 1/0 clean/dirty pages, 0.40/0.00 clean / dirty KB, 0.40 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 1 pages, 0.22 KB, 1/0 clean/dirty pages, 0.22 /0.00 /0.00 clean/dirty/updates KB, 0.22 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:file:WiredTiger.wt(<live>):
      0x11aebfc91640:internal: 1 pages, 0.77 KB, 1/0 clean/dirty pages, 0.77/0.00 clean / dirty KB, 0.77 KB max page, 0.00 KB max dirty page
      0x11aebfc91640:leaf: 1 pages, 28.62 KB, 0/1 clean/dirty pages, 0.00 /28.62 /15.42 clean/dirty/updates KB, 28.62 KB max page, 28.62 KB max dirty page
      0x11aebfc91640:cache dump: total found: 280.24 MB vs tracked inuse 259.52 MB
      0x11aebfc91640:total dirty bytes: 256.17 MB vs tracked dirty 256.21 MB
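As a sanity check on the dump above, the per-file dirty figures can be summed and compared with the reported "total dirty bytes". The following is a minimal sketch, not part of WiredTiger or its tooling: the helper names (`dirty_kb`, `total_dirty_kb`) are hypothetical, the regular expressions only assume the line layout seen in this particular dump, and it assumes 1 MB = 1024 KB. Only the lines with non-zero dirty KB are copied in; the remaining files report 0.00 dirty KB and do not change the sum.

```python
import re

# Lines copied from the cache dump in this ticket (non-zero dirty KB only).
DUMP = """\
0x11aebfc91640:internal: 966 pages, 3822.18 KB, 196/770 clean/dirty pages, 693.97/3128.20 clean / dirty KB, 5.65 KB max page, 5.65 KB max dirty page
0x11aebfc91640:leaf: 16942 pages, 94655.52 KB, 212/16730 clean/dirty pages, 165.93 /94489.59 /80287.25 clean/dirty/updates KB, 1731.63 KB max page, 1731.63 KB max dirty page
0x11aebfc91640:internal: 255 pages, 4839.77 KB, 12/243 clean/dirty pages, 227.14/4612.63 clean / dirty KB, 100.84 KB max page, 100.84 KB max dirty page
0x11aebfc91640:leaf: 25860 pages, 87967.47 KB, 502/25358 clean/dirty pages, 656.50 /87310.97 /61718.97 clean/dirty/updates KB, 1024.15 KB max page, 1024.15 KB max dirty page
0x11aebfc91640:internal: 101 pages, 2103.93 KB, 0/101 clean/dirty pages, 0.00/2103.93 clean / dirty KB, 25.61 KB max page, 25.61 KB max dirty page
0x11aebfc91640:leaf: 11489 pages, 72286.81 KB, 654/10835 clean/dirty pages, 1639.78 /70647.03 /52139.91 clean/dirty/updates KB, 4441.06 KB max page, 4441.06 KB max dirty page
0x11aebfc91640:leaf: 1 pages, 28.62 KB, 0/1 clean/dirty pages, 0.00 /28.62 /15.42 clean/dirty/updates KB, 28.62 KB max page, 28.62 KB max dirty page
"""

# Assumed layouts, taken from the lines above:
#   internal: "<clean>/<dirty> clean / dirty KB"
#   leaf:     "<clean> /<dirty> /<updates> clean/dirty/updates KB"
INTERNAL_RE = re.compile(r"([\d.]+)\s*/\s*([\d.]+)\s+clean / dirty KB")
LEAF_RE = re.compile(r"([\d.]+)\s*/\s*([\d.]+)\s*/\s*([\d.]+)\s+clean/dirty/updates KB")


def dirty_kb(line: str) -> float:
    """Return the dirty KB reported on one internal/leaf line, or 0.0."""
    match = LEAF_RE.search(line) or INTERNAL_RE.search(line)
    return float(match.group(2)) if match else 0.0


def total_dirty_kb(dump: str) -> float:
    """Sum the dirty KB over every line of a cache dump excerpt."""
    return sum(dirty_kb(line) for line in dump.splitlines())


if __name__ == "__main__":
    total = total_dirty_kb(DUMP)
    print(f"summed dirty: {total:.2f} KB = {total / 1024:.2f} MB")
```

Under these assumptions the per-page figures sum to 256.17 MB, which agrees with the "total dirty bytes: 256.17 MB" line of the dump.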
      


      format-stress-data-validation-test-disagg-leader-2 task_log

      Logs:

      [1762653113:809385][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __wt_btcur_next, 957: Error at src/btree/bt_curnext.c:957: "WT_NOTFOUND" failed: WT_NOTFOUND: item not found
      [1762653113:809392][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __curfile_next, 186: Error at src/cursor/cur_file.c:186: "ret" failed: WT_NOTFOUND: item not found
      [1762653113:809395][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: __evict_thread_run, 336: Error at src/evict/evict_lru.c:336: "ret" failed: Connection timed out
      [1762653113:809398][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_thread_run, 359: eviction thread error: Connection timed out
      [1762653113:809400][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: __evict_thread_run, 359: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1762653113:809403][6310:0xffff381adbc0], t, file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: __wt_abort, 29: aborting WiredTiger library
      


      format-stress-data-validation-test-disagg-leader-2 task_log

      Logs:

      #0  0x0000ffffbb8bf7b4 in __pthread_kill_implementation () from /lib64/libc.so.6
      #1  0x0000ffffbb8763a0 [PAC] in raise () from /lib64/libc.so.6
      #2  0x0000ffffbb862264 [PAC] in abort () from /lib64/libc.so.6
      #3  0x0000ffffbbc2ad94 [PAC] in __wt_abort (session=session@entry=0x11aebfc91640) at /data/mci/673b82bb2798db54d207545b936ce3df/wiredtiger/src/os_common/os_abort.c:32
      #4  0x0000ffffbbccfbf0 in __wt_panic_func (session=session@entry=0x11aebfc91640, error=error@entry=110, func=func@entry=0xffffbbdbf1f0 <__PRETTY_FUNCTION__.30> "__evict_thread_run", line=line@entry=359, category=category@entry=WT_VERB_DEFAULT, fmt=fmt@entry=0xffffbbd5f940 "eviction thread error") at /data/mci/673b82bb2798db54d207545b936ce3df/wiredtiger/src/support/err.c:611
      #5  0x0000ffffbbbe12b4 in __evict_thread_run (session=0x11aebfc91640, thread=0x11aebfe067d0) at /data/mci/673b82bb2798db54d207545b936ce3df/wiredtiger/src/evict/evict_lru.c:359
      #6  0x0000ffffbbcea4c0 in __thread_run (arg=0x11aebfe067d0) at /data/mci/673b82bb2798db54d207545b936ce3df/wiredtiger/src/support/thread_group.c:32
      #7  0x0000ffffbb8bdb78 in start_thread () from /lib64/libc.so.6
      #8  0x0000ffffbb92acdc [PAC] in thread_start () from /lib64/libc.so.6
      


      Repro Artifacts:

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            xgen-buildbaron-user
