Details
Description
We are seeing a lot of failed stress test runs at the moment because enabling timestamps in test/format often results in the cache being pinned full.
We need to investigate whether we can reasonably avoid the cache being pinned full, and if not update test/format to avoid encountering the condition.
An example of a failure:
[1512735814:255349][22179:0x7f3b7d224700], t, eviction-server: Cache stuck for too long, giving up: Connection timed out
|
0xe341c0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|
0xe341c0:transaction state dump
|
0xe341c0:current ID: 72453
|
0xe341c0:last running ID: 72453
|
0xe341c0:oldest ID: 72453
|
0xe341c0:commit timestamp: aed7c
|
0xe341c0:oldest timestamp: ac902
|
0xe341c0:pinned timestamp: ac902
|
0xe341c0:stable timestamp: 0
|
0xe341c0:has_commit_timestamp: yes
|
0xe341c0:has_oldest_timestamp: yes
|
0xe341c0:has_pinned_timestamp: yes
|
0xe341c0:has_stable_timestamp: no
|
0xe341c0:oldest_is_pinned: yes
|
0xe341c0:stable_is_pinned: no
|
0xe341c0:checkpoint running: no
|
0xe341c0:checkpoint generation: 228
|
0xe341c0:checkpoint pinned ID: 0
|
0xe341c0:checkpoint txn ID: 0
|
0xe341c0:oldest named snapshot ID: 0
|
0xe341c0:session count: 35
|
0xe341c0:Transaction state of active sessions:
|
0xe341c0:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|
0xe341c0:cache dump
|
0xe341c0:cache full: no
|
0xe341c0:cache clean check: no (0%)
|
0xe341c0:cache dirty check: yes (50%)
|
0xe341c0:file:wt(<live>):
|
0xe341c0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0xe341c0:leaf: 310 pages, 15MB, 168/142 clean/dirty pages, 9/6 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0xe341c0:file:WiredTigerLAS.wt(<live>):
|
0xe341c0:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0xe341c0:file:WiredTiger.wt(<live>):
|
0xe341c0:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0xe341c0:leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0xe341c0:cache dump: total found: 17MB vs tracked inuse 18MB
|
0xe341c0:total dirty bytes: 6MB
|
[1512735814:255734][22179:0x7f3b7d224700], t, eviction-server: cache eviction thread error: Connection timed out
|
[1512735814:255741][22179:0x7f3b7d224700], t, eviction-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
|
[1512735814:255746][22179:0x7f3b7d224700], t, eviction-server: aborting WiredTiger library
|
The configuration:
############################################
|
# RUN PARAMETERS
|
############################################
|
abort=0
|
alter=0
|
auto_throttle=1
|
backups=0
|
bitcnt=7
|
bloom=1
|
bloom_bit_count=27
|
bloom_hash_count=29
|
bloom_oldest=0
|
cache=37
|
cache_minimum=20
|
checkpoints=on
|
checkpoint_log_size=59
|
checkpoint_wait=94
|
checksum=uncompressed
|
chunk_size=10
|
compaction=0
|
compression=none
|
data_extend=0
|
data_source=file
|
delete_pct=7
|
dictionary=0
|
direct_io=0
|
encryption=none
|
evict_max=5
|
file_type=row-store
|
firstfit=0
|
huffman_key=0
|
huffman_value=0
|
independent_thread_rng=1
|
in_memory=0
|
insert_pct=12
|
internal_key_truncation=1
|
internal_page_max=16
|
isolation=random
|
key_gap=6
|
key_max=115
|
key_min=21
|
leaf_page_max=17
|
leak_memory=0
|
logging=0
|
logging_archive=0
|
logging_compression=none
|
logging_file_max=411512
|
logging_prealloc=1
|
long_running_txn=0
|
lsm_worker_threads=3
|
merge_max=6
|
mmap=1
|
modify_pct=63
|
ops=100000
|
prefix_compression=1
|
prefix_compression_min=3
|
quiet=1
|
read_pct=5
|
rebalance=1
|
repeat_data_pct=38
|
reverse=0
|
rows=100000
|
runs=100
|
salvage=1
|
split_pct=82
|
statistics=0
|
statistics_server=0
|
threads=22
|
timer=360
|
transaction_timestamps=1
|
transaction-frequency=82
|
value_max=926
|
value_min=19
|
verify=1
|
wiredtiger_config=
|
write_pct=13
|
############################################
|
The original failure:
http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress/59888