Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: Test Python
Storage Engines - Foundations
We have a Python test, test_layered90.py, that generates a large number of layered table states and creates one table for each state. Each table can have up to 5 keys, which results in 2911 unique layered tables.
We want to allow tables to have up to 6 keys, which raises the number of layered tables to 11011. With that change, the test aborts with "Cache stuck for too long" (original discussion):
$ python3 ../test/suite/run.py -v 2 test_layered91
[pid:99118]: None
...
[pid:99118]: test_layered91.test_layered91.test_layered91: starting
[pid:99118]: Generate unique situations for testing.
[pid:99118]: Generated 11011 unique situations for testing.
[pid:99118]: Create layered tables for each situation.
[pid:99118]: Done - Create layered tables for each situation.
[pid:99118]: Populate stable keys (S, B, R, X) on the leader.
zsh: abort python3 ../test/suite/run.py -v 2 test_layered91
[1775779846:518218][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_server(WT_SESSION_IMPL *, _Bool *), 543: Cache stuck for too long, giving up: Operation timed out
[1775779847:153301][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __wt_btcur_next(WT_CURSOR_BTREE *, _Bool), 795: Error at src/btree/bt_curnext.c:795: "WT_NOTFOUND" failed: WT_NOTFOUND: item not found
[1775779847:153318][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __curfile_next(WT_CURSOR *), 186: Error at src/cursor/cur_file.c:186: "ret" failed: WT_NOTFOUND: item not found
[1775779847:153328][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_ERROR_RETURNS][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 336: Error at src/evict/evict_lru.c:336: "ret" failed: Operation timed out
[1775779847:153336][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 359: eviction thread error: Operation timed out
[1775779847:153346][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: int __evict_thread_run(WT_SESSION_IMPL *, WT_THREAD *), 359: the process must exit and restart: WT_PANIC: WiredTiger library panic
[1775779847:153354][79855:0x16f88f000], test_layered91.test_layered91.test_layered91(nbatches=1), file:WiredTigerSharedHS.wt_stable, eviction-server: [WT_VERB_DEFAULT][ERROR]: void __wt_abort(WT_SESSION_IMPL *), 29: aborting WiredTiger library
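For triage, it can help to reduce the error output above to one (function, line, message) record per entry. The following is a minimal sketch, not part of the test suite: the helper names split_messages and summarize are hypothetical, and the message layout is inferred only from the log in this ticket, so the patterns may need adjusting for other WiredTiger output.

```python
import re

# Header of one message as it appears in the log above: "[secs:usecs][pid:tid], ".
# The layout is inferred from that log, not taken from WiredTiger documentation.
HEADER = re.compile(r"\[\d+:\d+\]\[\d+:0x[0-9a-fA-F]+\], ")

# Verbosity category, level, reporting function, source line, message text.
BODY = re.compile(r"\[(WT_VERB_\w+)\]\[(\w+)\]: .*?(__\w+)\(.*?\), (\d+): (.*)$")


def split_messages(log_text):
    """Split concatenated messages into one whitespace-normalized string each."""
    starts = [m.start() for m in HEADER.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    return [" ".join(log_text[s:e].split()) for s, e in zip(starts, ends)]


def summarize(message):
    """Return (category, level, function, line, text), or None if unparsable."""
    m = BODY.search(message)
    if m is None:
        return None
    category, level, function, line, text = m.groups()
    return category, level, function, int(line), text


# Two entries copied from the log in this ticket, run together the way the
# flattened output presents them (including the stray line break before
# __wt_abort).
SAMPLE = (
    "[1775779846:518218][79855:0x16f88f000], "
    "test_layered91.test_layered91.test_layered91(nbatches=1), "
    "file:WiredTigerSharedHS.wt_stable, eviction-server: "
    "[WT_VERB_DEFAULT][ERROR]: int __evict_server(WT_SESSION_IMPL *, _Bool *), "
    "543: Cache stuck for too long, giving up: Operation timed out "
    "[1775779847:153354][79855:0x16f88f000], "
    "test_layered91.test_layered91.test_layered91(nbatches=1), "
    "file:WiredTigerSharedHS.wt_stable, eviction-server: "
    "[WT_VERB_DEFAULT][ERROR]: void \n__wt_abort(WT_SESSION_IMPL *), "
    "29: aborting WiredTiger library"
)

for message in split_messages(SAMPLE):
    print(summarize(message))
```

Sorting or grouping the resulting records by function makes it easier to see that the first failure is reported by __evict_server and that the later entries are the same error propagating up through the eviction thread.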
This ticket is to investigate why this occurs and, if possible, to fix the issue and raise the max_len of the tables in test_layered90.py from 5 to 6.