Description
From http://mjc.homeunix.org:8180/job/wiredtiger-test-format-stress/4464/console
The test/format CONFIG is:
############################################
|
# RUN PARAMETERS
|
############################################
|
auto_throttle=0
|
firstfit=1
|
# bitcnt not applicable to this run
|
bloom=1
|
bloom_bit_count=32
|
bloom_hash_count=8
|
bloom_oldest=0
|
cache=1
|
checkpoints=1
|
checksum=uncompressed
|
chunk_size=2
|
compaction=0
|
compression=zlib
|
data_extend=0
|
data_source=table
|
delete_pct=1
|
dictionary=0
|
file_type=row-store
|
hot_backups=0
|
huffman_key=0
|
huffman_value=0
|
insert_pct=25
|
internal_key_truncation=1
|
internal_page_max=10
|
isolation=snapshot
|
key_gap=15
|
key_max=37
|
key_min=30
|
leaf_page_max=15
|
logging=0
|
merge_max=14
|
merge_threads=2
|
mmap=1
|
ops=100000
|
prefix_compression=1
|
prefix_compression_min=8
|
repeat_data_pct=60
|
reverse=0
|
rows=100000
|
runs=100
|
split_pct=76
|
statistics=1
|
threads=9
|
value_max=3013
|
value_min=1
|
# wiredtiger_config not applicable to this run
|
write_pct=32
|
############################################
|
The recent change here is to run at snapshot isolation, but I don't immediately see why that would matter.
The stack is:
WT-3 0x0000000000446f54 in __wt_illegal_value (session=0x1755340, name=0x0) at ../src/support/err.c:497
|
WT-4 0x0000000000460b01 in __wt_page_in_func (session=0x1755340, ref=0x7f3f4c060230, flags=0, file=0x672db0 "../src/btree/row_srch.c", line=284) at ../src/btree/bt_page.c:110
|
WT-5 0x000000000048453c in __wt_page_swap_func (session=0x1755340, held=0x1783fb8, want=0x7f3f4c060230, flags=0, file=0x672db0 "../src/btree/row_srch.c", line=284)
|
at ../src/include/btree.i:752
|
WT-6 0x00000000004851f0 in __wt_row_search (session=0x1755340, srch_key=0x7f3f1c007728, leaf=0x0, cbt=0x7f3f1c007660, insert=0) at ../src/btree/row_srch.c:284
|
WT-7 0x00000000004c1a18 in __cursor_row_search (session=0x1755340, cbt=0x7f3f1c007660, insert=0) at ../src/btree/bt_cursor.c:158
|
WT-8 0x00000000004c1c7f in __wt_btcur_search (cbt=0x7f3f1c007660) at ../src/btree/bt_cursor.c:230
|
WT-9 0x0000000000491265 in __curfile_search (cursor=0x7f3f1c007660) at ../src/cursor/cur_file.c:177
|
WT-10 0x00000000004107eb in read_row (cursor=0x7f3f1c007660, key=0x7f3f48df9e10, keyno=14773) at ../../../test/format/ops.c:527
|
And the ref in question has been freed:
(gdb) print *ref
|
$1 = {page = 0x7f3f4c0602a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
|
ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
|
At frame 5, in __wt_row_search, many of the entries in the index have been freed:
(gdb) print base
|
$5 = 55
|
(gdb) print cmp
|
$6 = 1
|
(gdb) print *pindex
|
$7 = {entries = 660, index = 0x7f3f1c098620}
|
(gdb) print child
|
$8 = (WT_REF *) 0x7f3f4c060230
|
(gdb) print pindex->index[54]
|
$9 = (WT_REF *) 0x7f3f4c060230
|
(gdb) print *pindex->index[53]
|
$10 = {page = 0x7f3f4c10a1a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
|
ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
|
(gdb) print *pindex->index[54]
|
$11 = {page = 0x7f3f4c0602a0, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
|
ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
|
(gdb) print *pindex->index[55]
|
$12 = {page = 0x7f3f4c060220, home = 0xabababababababab, ref_hint = 2880154539, state = 2880154539, addr = 0xabababababababab, key = {recno = 12370169555311111083,
|
ikey = 0xabababababababab, pkey = 12370169555311111083}, page_del = 0xabababababababab}
|
The page has been split since the search started, because pindex is not equal to page->u.intl.__index. The reading session has a valid snapshot, and has published a snap_min, all of which looks correct.
What I think must be going on is that *wt_session_fotxn_discard is freeing the split index too early. That is, for some reason the transaction ID we choose becomes visible to all readers during the *wt_row_search. I can't see why at the moment.
Attachments
Issue Links
- related to
-
WT-3 What file formats are required?
- Closed
-
WT-4 Flexible cursor traversals
- Closed
-
WT-5 How does pget work: is it necessary?
- Closed
-
WT-6 Complex schema example
- Closed
-
WT-7 Do we need the handle->err/errx methods?
- Closed
-
WT-8 Do we need table load, bulk-load and/or dump methods?
- Closed
-
WT-9 Does adding schema need to be transactional?
- Closed
-
WT-10 Basic "getting started" tutorial
- Closed
-
WT-1056 Don't free created internal pages from deepen-split until it's safe.
- Closed