-
Type:
Build Failure
-
Resolution: Fixed
-
Priority:
Major - P3
-
Affects Version/s: None
-
Component/s: Disagg CI-blocker, Test Format
-
Storage Engines, Storage Engines - Transactions
-
SE Transactions - 2025-10-24
-
8
-
- format-stress-data-validation-test-disagg-leader-1
- format-stress-data-validation-test-disagg-leader-palite-1
- format-stress-data-validation-test-disagg-leader-palite-2
- format-stress-data-validation-test-disagg-leader-palite-3
- format-stress-data-validation-test-disagg-leader-palite-4
- format-stress-data-validation-test-disagg-leader-palite-5
- format-stress-test-disagg-leader-1
- format-stress-test-disagg-leader-2
- format-stress-test-disagg-leader-4
- format-stress-test-disagg-leader-5
- format-stress-test-disagg-leader-palite-2
- format-stress-test-disagg-leader-palite-4
- format-stress-test-disagg-leader-palite-5
-
35
There is an key OOO that only happens with internal page delta on and leaf page delta turned off. The issue reproduces consistently with the CONFIG. Here is the error:
[2025/09/30 14:17:15.687] 0x731a7fc8a8b0:[1759205835:687490][31934:0xffff2b3fdbc0], t, file:T00002.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (68886900us). Building disk image took 68858332us. History store wrapup took 8us. [2025/09/30 14:18:13.714] 0x731a7fc8af78:[1759205893:714042][31934:0xffff2a9edbc0], t, file:T00002.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (127332352us). Building disk image took 127302781us. History store wrapup took 9us. [2025/09/30 14:24:57.878] table.1 skipped verify because verify does not support disagg delta pages. table.2 skipped verify because verify does not support disagg delta pages. table.3 skipped verify because verify does not support disagg delta pages. 0x731a7fc8a8b0:[1759206297:878925][31934:0xffff2b3fdbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (79015322us). Building disk image took 55972988us. History store wrapup took 114us. [2025/09/30 14:25:01.085] 0x731a7fc8b640:[1759206301:085414][31934:0xffff29fddbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (81088877us). Building disk image took 77910102us. History store wrapup took 0us. [2025/09/30 14:26:13.583] 0x731a7fc8a8b0:[1759206373:583315][31934:0xffff2b3fdbc0], t, file:T00002.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (75582398us). Building disk image took 65504122us. History store wrapup took 81us. [2025/09/30 14:43:07.500] 0x731a7fc8af78:[1759207387:500618][31934:0xffff2a9edbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (108747207us). Building disk image took 100716922us. History store wrapup took 54us. [2025/09/30 14:43:11.200] 0x731a7fc8b640:[1759207391:200718][31934:0xffff29fddbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (112348257us). Building disk image took 102176763us. History store wrapup took 57us. [2025/09/30 14:44:58.878] 0x731a7fc8b640:[1759207498:878182][31934:0xffff29fddbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (107677407us). Building disk image took 101738648us. History store wrapup took 41us. [2025/09/30 14:45:04.483] 0x731a7fc8af78:[1759207504:483152][31934:0xffff2a9edbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (116982479us). Building disk image took 100454538us. History store wrapup took 50us. [2025/09/30 14:46:54.353] 0x731a7fc8af78:[1759207614:353338][31934:0xffff2a9edbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (109870240us). Building disk image took 104169772us. History store wrapup took 66us. [2025/09/30 14:46:59.540] 0x731a7fc8b640:[1759207619:540465][31934:0xffff29fddbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (120662339us). Building disk image took 103367389us. History store wrapup took 79us. [2025/09/30 14:47:46.310] 0x731a7fc8a8b0:[1759207666:310393][31934:0xffff2b3fdbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (396134307us). Building disk image took 379494610us. History store wrapup took 65us. [2025/09/30 15:03:13.845] 0x731a7fc8af78:[1759208593:845574][31934:0xffff2a9edbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (128052498us). Building disk image took 121115547us. History store wrapup took 234us. [2025/09/30 15:03:14.567] 0x731a7fc8a8b0:[1759208594:567123][31934:0xffff2b3fdbc0], t, file:T00003.wt_stable, eviction-server: [WT_VERB_EVICTION][WARNING]: Eviction took more than 1 minute (128733974us). Building disk image took 121740388us. History store wrapup took 206us. [2025/09/30 15:29:11.250] table.1 skipped verify because verify does not support disagg delta pages. table.2 skipped verify because verify does not support disagg delta pages. table.3 skipped verify because verify does not support disagg delta pages. t: successful run completed (5081 seconds) [2025/09/30 15:29:11.308] t: process 34885 running [2025/09/30 15:29:13.750] ./t -R $format_args || fail [2025/09/30 15:29:13.750] done [2025/09/30 15:29:13.750] [1759210153:750803][34885:0xffff213fdbc0], t, file:T00003.wt_stable, WT_CURSOR.prev: [WT_VERB_OUT_OF_ORDER][ERROR]: WT_CURSOR.prev out-of-order returns: returned key 0000498447.00/opqrstuvw then key 0000498479.00/opqrstuvwxyz
CONFIG:
############################################ # RUN PARAMETERS: V3 ############################################ assert.read_timestamp=0 background_compact=0 background_compact.free_space_target=80 backup=0 backup.incremental=off backup.incr_granularity=5811 backup.live_restore=0 backup.live_restore_read_size=64 backup.live_restore_threads=0 block_cache=1 block_cache.cache_on_checkpoint=0 block_cache.cache_on_writes=1 block_cache.size=1296 cache=6480 cache.evict_max=2 cache.eviction_dirty_target=0 cache.eviction_dirty_trigger=0 cache.eviction_updates_target=0 cache.eviction_updates_trigger=0 cache.minimum=0 cache.maximum=0 checkpoint=on checkpoint.log_size=44 checkpoint.wait=28 chunk_cache=0 chunk_cache.capacity=4236 chunk_cache.chunk_size=3 chunk_cache.storage_path=off chunk_cache.type=off compact.free_space_target=28 debug.background_compact=0 debug.checkpoint_retention=3 debug.cursor_reposition=0 debug.eviction=0 debug.log_retention=1 debug.realloc_exact=0 debug.realloc_malloc=0 debug.slow_checkpoint=0 debug.table_logging=0 debug.update_restore_evict=0 disagg.page_log=palm disagg.mode=leader disagg.enabled=1 disagg.layered=1 disk.data_extend=0 disk.encryption=none disk.mmap=1 disk.mmap_all=0 eviction.evict_use_softptr=0 file_manager.close_handle_minimum=6 file_manager.close_idle_time=37 file_manager.close_scan_interval=0 format.abort=0 format.independent_thread_rng=1 format.major_timeout=0 import=0 logging=0 logging.compression=none logging.file_max=251364 logging.prealloc=0 logging.remove=0 obsolete_cleanup.method=off obsolete_cleanup.wait=2789 ops.alter=0 ops.compaction=0 ops.hs_cursor=0 ops.pct.modify=0 ops.bound_cursor=1 ops.prepare=0 ops.random_cursor=1 ops.salvage=0 ops.throttle=0 ops.throttle.sleep_us=163780 ops.truncate=0 ops.verify=1 prefetch=0 precise_checkpoint=1 preserve_prepared=0 quiet=1 random.data_seed=2331135 random.extra_seed=2154478 runs.in_memory=0 runs.mirror=1 runs.ops=300000 runs.predictable_replay=0 runs.rows=100000 runs.source=layered runs.tables=2 runs.threads=26 runs.timer=3 runs.type=row-store runs.verify_failure_dump=0 statistics.mode=fast statistics_log.sources=off stress.aggressive_stash_free=0 stress.aggressive_sweep=0 stress.checkpoint=0 stress.checkpoint_evict_page=0 stress.checkpoint_prepare=0 stress.compact_slow=0 stress.evict_reposition=0 stress.failpoint_eviction_split=0 stress.failpoint_hs_delete_key_from_ts=0 stress.failpoint_rec_before_wrapup=0 stress.hs_checkpoint_delay=0 stress.hs_search=0 stress.hs_sweep=0 stress.prefetch_delay=0 stress.prepare_resolution_1=0 stress.sleep_before_read_overflow_onpage=0 stress.split_1=0 stress.split_2=0 stress.split_3=0 stress.split_4=0 stress.split_5=0 stress.split_6=1 stress.split_7=0 stress.split_8=0 tiered_storage.flush_frequency=0 tiered_storage.storage_source=off transaction.implicit=0 transaction.operation_timeout_ms=2000 transaction.timestamps=1 wiredtiger.config=off wiredtiger.rwlock=0 wiredtiger.leak_memory=0 ############################################ # TABLE PARAMETERS: table 1 ############################################ table1.btree.compression=none table1.btree.dictionary=0 table1.btree.internal_key_truncation=1 table1.btree.internal_page_max=13 table1.btree.key_max=41 table1.btree.key_min=31 table1.btree.leaf_page_max=11 table1.btree.memory_page_max=6 table1.btree.prefix_len=0 table1.btree.prefix_compression=1 table1.btree.prefix_compression_min=7 table1.btree.reverse=0 table1.btree.split_pct=79 table1.btree.value_max=2974 table1.btree.value_min=13 table1.disagg.enabled=0 table1.disk.checksum=uncompressed table1.disk.firstfit=1 table1.ops.pareto=0 table1.ops.pareto.skew=19 table1.ops.pct.delete=46 table1.ops.pct.insert=38 table1.ops.pct.read=4 table1.ops.pct.write=12 table1.runs.rows=352397 table1.runs.source=table ############################################ # TABLE PARAMETERS: table 2 ############################################ table2.btree.compression=none table2.btree.dictionary=1 table2.btree.internal_key_truncation=1 table2.btree.internal_page_max=14 table2.btree.key_max=42 table2.btree.key_min=24 table2.btree.leaf_page_max=16 table2.btree.memory_page_max=9 table2.btree.prefix_len=0 table2.btree.prefix_compression=1 table2.btree.prefix_compression_min=4 table2.btree.reverse=0 table2.btree.split_pct=89 table2.btree.value_max=588 table2.btree.value_min=12 table2.disk.checksum=on table2.disk.firstfit=0 table2.ops.pareto=0 table2.ops.pareto.skew=96 table2.ops.pct.delete=71 table2.ops.pct.insert=9 table2.ops.pct.read=19 table2.ops.pct.write=1 table2.runs.rows=352397 ############################################ # TABLE PARAMETERS: table 3 ############################################ table3.btree.compression=none table3.btree.dictionary=1 table3.btree.internal_key_truncation=1 table3.btree.internal_page_max=12 table3.btree.key_max=102 table3.btree.key_min=15 table3.btree.leaf_page_max=14 table3.btree.memory_page_max=1 table3.btree.prefix_len=0 table3.btree.prefix_compression=1 table3.btree.prefix_compression_min=4 table3.btree.reverse=0 table3.btree.split_pct=68 table3.btree.value_max=2094 table3.btree.value_min=18 table3.disk.checksum=uncompressed table3.disk.firstfit=1 table3.ops.pareto=0 table3.ops.pareto.skew=19 table3.ops.pct.delete=8 table3.ops.pct.insert=4 table3.ops.pct.read=47 table3.ops.pct.write=41 table3.runs.mirror=0 table3.runs.rows=352397
- action item
-
WT-15737 Add debugging information for page delta on disk
-
- Closed
-
- is duplicated by
-
WT-15724 (disagg.mode=leader) test/format key OOO
-
- Closed
-
-
WT-15725 (disagg.mode=leader) test/format key mismatch error
-
- Closed
-
- is related to
-
WT-15432 Coverity analysis defect 176691: Invalid type in argument to printf format specifier
-
- Closed
-
-
WT-15678 Memory leak in __wti_block_disagg_write_internal
-
- Closed
-
-
WT-15697 Change all python layered tests to use a configurable PALI
-
- Closed
-
-
WT-15589 Verify each read we do is behind the page materialisation frontier
-
- In Code Review
-
-
WT-15718 Favour internal SQLite3 library during build
-
- Closed
-
-
WT-15720 Handle stepdowns in test/format by restarting WT before stepping down
-
- Closed
-
-
WT-15734 Consider removing the layered manager thread
-
- Closed
-
-
WT-15736 Split "maximum page size seen at eviction" to 3 seperate statistics
-
- Closed
-
-
WT-15758 Retry forever on read failures in disagg block manager
-
- Closed
-
-
WT-15682 Increase the read timestamp lag to avoid conflict with prepare timestamp
-
- Closed
-
- related to
-
WT-15215 Investigate all related test/format (disagg) detects out-of-order keys
-
- Closed
-
-
WT-15560 Assertion fail - test/format with precise checkpoint
-
- Closed
-
-
WT-15246 Re-enable prepared testing that depends on new delta data format
-
- Closed
-
-
WT-15522 Configure the error log from cmake & determine the default value
-
- Closed
-
-
WT-15621 Fix AddressSanitizer: heap-buffer-overflow when packing the internal page delta
-
- Closed
-
-
WT-15725 (disagg.mode=leader) test/format key mismatch error
-
- Closed
-