Details
Type: Bug
Status: Closed
Priority: Major - P3
Resolution: Done
Description
There's an in-memory stress test failure I've seen lately:
t, eviction-server: cache eviction server error: WT_RESTART: restart the operation (internal)
I thought for a while it was the same as WT-2576 (it's another heavily-threaded, in-memory variable-length column-store failure), but I've seen it after that fix was merged.
It may be related: in short, when rewriting pages in memory, if the page is corrupted, we can fail to insert saved-update records because our "is there a race?" tests in serial.i fail, returning WT_RESTART. Of course, there should never be a race when rewriting pages in memory, because we have exclusive access to the page.
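To make the failure mode concrete, here's a minimal sketch of the general "check for a race, otherwise restart" pattern. It is not the actual serial.i code: `sketch_upd`, `sketch_insert` and `SKETCH_RESTART` are hypothetical names invented for illustration, and a single compare-and-swap stands in for whatever checks the real code performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_RESTART (-1) /* Hypothetical stand-in for WT_RESTART. */

/* Hypothetical update record, linked newest-first. */
struct sketch_upd {
    int value;
    struct sketch_upd *next;
};

/*
 * Prepend upd to the list at *headp, but only if the head is still the
 * value the caller observed earlier. A mismatch is treated as a lost race
 * and reported to the caller as a restart.
 */
static int
sketch_insert(struct sketch_upd *_Atomic *headp,
    struct sketch_upd *expected, struct sketch_upd *upd)
{
    upd->next = expected;
    if (!atomic_compare_exchange_strong(headp, &expected, upd))
        return (SKETCH_RESTART);
    return (0);
}
```

In this sketch, a caller with exclusive access should always pass an `expected` value that matches the current head, so a restart would mean the caller's view of the list was wrong (for example, because the page contents are bad), not that another thread got there first.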
Here's the CONFIG from a recent PPC stress test failure:
############################################
# RUN PARAMETERS
############################################
abort=0
auto_throttle=1
backups=0
bitcnt=8
bloom=1
bloom_bit_count=54
bloom_hash_count=20
bloom_oldest=0
cache=34
checkpoints=0
checksum=uncompressed
chunk_size=2
compaction=0
compression=none
data_extend=0
data_source=table
delete_pct=16
dictionary=0
direct_io=0
encryption=none
evict_max=1
file_type=variable-length column-store
firstfit=0
huffman_key=0
huffman_value=0
in_memory=1
insert_pct=15
internal_key_truncation=1
internal_page_max=12
isolation=random
key_gap=8
key_max=32
key_min=10
leaf_page_max=17
leak_memory=0
logging=0
logging_archive=1
logging_compression=lz4
logging_prealloc=1
long_running_txn=0
lsm_worker_threads=3
merge_max=17
mmap=1
ops=100000
prefix_compression=0
prefix_compression_min=2
quiet=1
repeat_data_pct=50
reverse=0
rows=100000
runs=100
rebalance=0
salvage=0
split_pct=77
statistics=0
statistics_server=0
threads=24
timer=20
transaction-frequency=99
value_max=80
value_min=17
verify=0
wiredtiger_config=
write_pct=67
############################################