-
Type: Task
-
Resolution: Done
-
None
-
Affects Version/s: None
-
Component/s: None
I saw a stress test failure a few days ago and it seems to reproduce relatively quickly on pixiebob. Here's the CONFIG:
############################################ # RUN PARAMETERS ############################################ auto_throttle=1 firstfit=1 bitcnt=3 bloom=1 bloom_bit_count=10 bloom_hash_count=24 bloom_oldest=1 cache=15 checkpoints=1 checksum=uncompressed chunk_size=7 compaction=0 compression=zlib data_extend=0 data_source=file delete_pct=22 dictionary=0 evict_max=0 file_type=row-store backups=0 huffman_key=0 huffman_value=0 insert_pct=13 internal_key_truncation=1 internal_page_max=9 isolation=read-committed key_gap=14 key_max=256 key_min=256 leak_memory=0 leaf_page_max=9 logging=0 merge_max=4 merge_threads=3 mmap=1 ops=100000 prefix_compression=1 prefix_compression_min=3 repeat_data_pct=7 reverse=0 rows=100000 runs=100 split_pct=72 statistics=0 threads=13 value_max=2638 value_min=10 wiredtiger_config= write_pct=36 ############################################
Looking at the threads:
T 12: main, sleeping
T 11: eviction server, the failing thread
T 10: sweep server, sleeping
T 3, 4, 5, 6, 7, 8, 9: worker threads, trying to do a named checkpoint and waiting on format's hot-backup lock
T 2: doing a named checkpoint and trying to clear the tree's current eviction point
T 1: starting a transaction and doing a cache-full check, waiting to find a page on the eviction queue to evict.
Thread 11's stack:
WT-4 0x00000000004da528 in __wt_abort (session=0x803035280) at src/os_posix/os_abort.c:21 WT-5 0x0000000000459e3c in __wt_assert (session=0x803035280, error=0, file_name=0x531580 "src/btree/bt_discard.c", line_number=56, fmt=0x522129 "%s") at src/support/err.c:452 WT-6 0x00000000004f5077 in __wt_page_out (session=0x803035280, pagep=0x807f293a0) at src/btree/bt_discard.c:56 WT-7 0x00000000004f4fff in __wt_ref_out (session=0x803035280, ref=0x807f293a0) at src/btree/bt_discard.c:32 WT-8 0x0000000000489898 in __rec_page_dirty_update (session=0x803035280, ref=0x807f293a0, exclusive=0) at src/btree/rec_evict.c:164 WT-9 0x0000000000488c0e in __wt_rec_evict (session=0x803035280, ref=0x807f293a0, exclusive=0) at src/btree/rec_evict.c:88 WT-10 0x0000000000473e6c in __wt_evict_page (session=0x803035280, ref=0x807f293a0) at src/btree/bt_evict.c:616 WT-11 0x0000000000474403 in __wt_evict_lru_page (session=0x803035280, is_app=0) at src/btree/bt_evict.c:1218 WT-12 0x0000000000474e12 in __evict_lru_pages (session=0x803035280, is_app=0) at src/btree/bt_evict.c:701 WT-13 0x0000000000475879 in __evict_lru (session=0x803035280, flags=3) at src/btree/bt_evict.c:794 WT-14 0x000000000047514c in __evict_pass (session=0x803035280) at src/btree/bt_evict.c:484 WT-15 0x000000000047352f in __evict_server (arg=0x803035280) at src/btree/bt_evict.c:162 and we're attempting to discard a dirty page: 52 /* 53 * We should never discard a dirty page, the file's current eviction 54 * point or a page queued for LRU eviction. 55 */ 56 WT_ASSERT(session, !__wt_page_is_modified(page)); 57 WT_ASSERT(session, !F_ISSET_ATOMIC(page, WT_PAGE_EVICT_LRU)); 58 WT_ASSERT(session, !F_ISSET_ATOMIC(page, WT_PAGE_SPLITTING)); and it's empty: (gdb) p *page $4 = {u = {intl = {recno = 34466205048, parent_ref = 0x8065c10e8, __index = 0x0}, row = {d = 0x806588d78, ins = 0x8065c10e8, upd = 0x0, entries = 0}, col_fix = {recno = 34466205048, bitf = 0x8065c10e8 "", entries = 0}, col_var = { recno = 34466205048, d = 0x8065c10e8, repeats = 0x0, nrepeats = 0, entries = 0}}, dsk = 0x0, modify = 0x806597520, read_gen = 130743, memory_footprint = 3195, type = 7 '\a', flags_atomic = 0 '\0'} (gdb) p *page->modify $6 = {disk_snap_min = 0, rec_max_txn = 0, first_dirty_txn = 91156, update_txn = 91156, bytes_dirty = 3195, u1 = {replace = { addr = 0x0, size = 0 '\0', type = 2 '\002', reuse = 0 '\0'}, m = {multi = 0x0, multi_entries = 512}}, u2 = {intl = { root_split = 0x0, split_txn = 0}, leaf = {append = 0x0, update = 0x0}}, ovfl_track = 0x807704b00, write_gen = 1, page_lock = 2 '\002', flags = 1 '\001'} but there's one element on the page's insert list: 0x806588d30: row-store leaf disk 0x0, entries 0, dirty, empty, write generation=1 insert {0000078100.00/opqrstuvwxyzabcdefghijklmnop ... } value {0000078100/LMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQR ... } (gdb) skip_insert_single page->u.row.ins 0 ... printing level 0 $12 = {upd = 0x8067e2800, u = {recno = 1099511627800, key = {offset = 24, size = 256}}, next = 0x80670d990} (gdb) p *$12.upd $13 = {txnid = 91156, next = 0x0, size = 2407}
This page wasn't a child page of another page being evicted, this page was selected for eviction.
- related to
-
WT-4 Flexible cursor traversals
- Closed
-
WT-5 How does pget work: is it necessary?
- Closed
-
WT-6 Complex schema example
- Closed
-
WT-7 Do we need the handle->err/errx methods?
- Closed
-
WT-8 Do we need table load, bulk-load and/or dump methods?
- Closed
-
WT-9 Does adding schema need to be transactional?
- Closed
-
WT-10 Basic "getting started" tutorial
- Closed
-
WT-11 placeholder #11
- Closed
-
WT-12 Write more examples
- Closed
-
WT-13 Define supported platforms
- Closed
-
WT-14 Windows build
- Closed
-
WT-15 Automated build/test infrastructure
- Closed