-
Type:
Bug
-
Resolution: Fixed
-
Priority:
Major - P3
-
Affects Version/s: None
-
Component/s: Test Python
-
Storage Engines, Storage Engines - Transactions
-
SE Transactions - 2025-09-26, SE Transactions - 2025-10-10
-
5
Not sure whether rollback_to_stable is covered by Persistence or Transactions; feel free to reassign. I think rollback_to_stable is available in disagg (??). If that's not true, let's just disable these tests via explicit use of @wttest.skip_for_hook("disagg", "reason...").
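For illustration only, the skip-by-hook idea can be sketched with plain unittest. This is a hypothetical stand-in, not the wttest implementation: the ACTIVE_HOOKS set, the local skip_for_hook function, and the test class are all assumed names. In the real suite the test would simply carry the @wttest.skip_for_hook("disagg", "reason...") decorator mentioned above.

```python
import functools
import unittest

# Assumption: a stand-in for however the test runner tracks active hooks.
ACTIVE_HOOKS = {"disagg"}


def skip_for_hook(hook_name, reason):
    """Skip the decorated test when hook_name is active (illustrative stand-in)."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(self, *args, **kwargs):
            if hook_name in ACTIVE_HOOKS:
                # skipTest raises unittest.SkipTest, so the body never runs.
                self.skipTest(f"hook '{hook_name}': {reason}")
            return test_func(self, *args, **kwargs)
        return wrapper
    return decorator


class TestRollbackToStable(unittest.TestCase):
    # Hypothetical test: the body stands in for a call that aborts under disagg.
    @skip_for_hook("disagg", "rollback_to_stable aborts under the disagg hook")
    def test_rts(self):
        self.fail("would call rollback_to_stable here")

    def test_other(self):
        self.assertTrue(True)


def run_suite():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRollbackToStable)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("skipped:", [reason for _, reason in result.skipped])
```

With "disagg" in the assumed hook set, test_rts is reported as skipped rather than failed, while test_other still runs.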
Running any of these tests under the disagg hook gives a crash, for example:
gdb --args python3 ../test/suite/run.py -v 2 --hook disagg test_bug033.py
Thread 1 "python3" received signal SIGABRT, Aborted.
0x0000fffff769f200 in ?? () from /lib/aarch64-linux-gnu/libc.so.6
(gdb) bt
#0 0x0000fffff769f200 in ?? () from /lib/aarch64-linux-gnu/libc.so.6
#1 0x0000fffff765a67c in raise () from /lib/aarch64-linux-gnu/libc.so.6
#2 0x0000fffff7647130 in abort () from /lib/aarch64-linux-gnu/libc.so.6
#3 0x0000fffff6bdaeac in __wt_abort (session=0x8f81c0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/os_common/os_abort.c:31
#4 0x0000fffff6a27aa8 in __wt_tree_modify_set (session=0x8f81c0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/include/btree_inline.h:797
#5 0x0000fffff6a29fe8 in __wt_sync_file (session=0x8f81c0, syncop=WT_SYNC_CHECKPOINT) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/btree/bt_sync.c:387
#6 0x0000fffff6a8611c in __checkpoint_tree (session=0x8f81c0, is_checkpoint=true, cfg=0xffffffffcaa0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/checkpoint/checkpoint_txn.c:2547
#7 0x0000fffff6a86c04 in __wt_checkpoint_file (session=0x8f81c0, cfg=0xffffffffcaa0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/checkpoint/checkpoint_txn.c:2742
#8 0x0000fffff6a82fb4 in __checkpoint_db_internal (session=0x8f81c0, cfg=0xffffffffcaa0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/checkpoint/checkpoint_txn.c:1548
#9 0x0000fffff6a83b30 in __checkpoint_db_wrapper (session=0x8f81c0, cfg=0xffffffffcaa0) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/checkpoint/checkpoint_txn.c:1721
#10 0x0000fffff6a83dc8 in __wt_checkpoint_db (session=0x8f81c0, cfg=0xffffffffcaa0, waiting=true) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/checkpoint/checkpoint_txn.c:1800
#11 0x0000fffff6ca1aa4 in __session_checkpoint (wt_session=0x8f81c0, config=0xfffff6e00e00 "force=1") at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/session/session_api.c:2363
#12 0x0000fffff6cafa4c in __rollback_to_stable_int (session=0x8f81c0, no_ckpt=false) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/rollback_to_stable/rts_api.c:188
#13 0x0000fffff6cb0444 in __rollback_to_stable (session=0x8f81c0, cfg=0xffffffffd8b0, no_ckpt=false) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/rollback_to_stable/rts_api.c:312
#14 0x0000fffff6aa58d4 in __conn_rollback_to_stable (wt_conn=0x7d7970, config=0xffffffffd8c8 "threads=4") at /home/dda/wt/git/wt-15262-python-tests-stability-triage/src/conn/conn_api.c:1461
#15 0x0000fffff6eeca84 in _wrap_Connection_rollback_to_stable (self=0xfffff70f8e50, args=0xfffff7042170) at /home/dda/wt/git/wt-15262-python-tests-stability-triage/build/lang/python/CMakeFiles/wiredtiger_python.dir/wiredtigerPYTHON_wrap.c:8818
#16 0x0000fffff79a06fc in cfunction_call (func=0xfffff713ae30, args=<optimized out>, kwargs=<optimized out>) at ../src/Python-3.10.4/Objects/methodobject.c:552
#17 0x0000fffff79983a0 in _PyObject_MakeTpCall (tstate=0x42f280, callable=0xfffff713ae30, args=<optimized out>, nargs=<optimized out>, keywords=0x0) at ../src/Python-3.10.4/Objects/call.c:215
Note: this ticket is primarily concerned with understanding and fixing the abort/crash. That alone may not fix the test, and it's okay to defer any regular logic failures to another ticket. However, if the fix allows the test to pass, please remove the test/tests from the test/hook_disagg.fail list so they will be tested regularly in Evergreen.
- is related to
-
WT-13746 Conflict between RTS and eviction regarding btree->rec_max_timestamp (take 2)
-
- Closed
-
-
WT-15575 Prevent the follower from writing checkpoint metadata
-
- Closed
-
-
WT-15563 Investigate making cache tolerant to change app step-wise eviction to incremental eviction
-
- Closed
-
-
WT-15568 Add the ability to dump the error log without the connection object
-
- Closed
-
-
WT-15576 Use EIO to indicate I/O errors in disagg
-
- Closed
-
-
WT-15410 test/format (disagg.mode=leader) Delta chain validation failed
-
- Closed
-
-
WT-15455 mirror mismatch (not disagg)
-
- Closed
-
-
WT-15450 test/format (disagg.mode=leader) verify failure unpacking addr
-
- Closed
-
-
WT-15550 layered38, 32 data mismatch error
-
- Closed
-
-
WT-15566 Remove table_id parameter from palm_handle_get_page_ids
-
- Closed
-
- related to
-
WT-15591 Review the code to ensure we also check the disagg shared metadata along with the local metadata
-
- Open
-
-
WT-15193 Add test case to ensure prefix suffix compression works for page deltas
-
- Closed
-
-
WT-15262 Disagg python testing: broad triage of apparent stability issues
-
- Closed
-