Details
-
Bug
-
Status: Closed
-
Major - P3
-
Resolution: Fixed
-
None
-
None
-
8
-
Storage Engines 2019-09-09, Storage Engines 2019-09-23, Storage Engines 2019-10-07
Description
We encountered a cache stuck with clean page failure in the test/format stress sanitiser job. Please note timestamp was disabled in the test. We had WT-4608 and WT-3694 created to track LSM cache stuck clean failures. This case is likely the 1st time we see the btree cache stuck with clean pages failure.
http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress-sanitizer/23878/
[1554809849:456697][2330:0x7fb095f0b700], t, eviction-server: __evict_server, 451: Cache stuck for too long, giving up: Connection timed out
|
t: process 2330
|
0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|
0x7fb09f590488:transaction state dump
|
0x7fb09f590488:current ID: 4647
|
0x7fb09f590488:last running ID: 4647
|
0x7fb09f590488:metadata_pinned ID: 4370
|
0x7fb09f590488:oldest ID: 4370
|
0x7fb09f590488:commit timestamp: (0,0)
|
0x7fb09f590488:oldest timestamp: (0,0)
|
0x7fb09f590488:pinned timestamp: (0,0)
|
0x7fb09f590488:stable timestamp: (0,0)
|
0x7fb09f590488:has_commit_timestamp: no
|
0x7fb09f590488:has_oldest_timestamp: no
|
0x7fb09f590488:has_pinned_timestamp: no
|
0x7fb09f590488:has_stable_timestamp: no
|
0x7fb09f590488:oldest_is_pinned: no
|
0x7fb09f590488:stable_is_pinned: no
|
0x7fb09f590488:checkpoint running: no
|
0x7fb09f590488:checkpoint generation: 18
|
0x7fb09f590488:checkpoint pinned ID: 0
|
0x7fb09f590488:checkpoint txn ID: 0
|
0x7fb09f590488:oldest named snapshot ID: 0
|
0x7fb09f590488:session count: 36
|
0x7fb09f590488:Transaction state of active sessions:
|
0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev
|
0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4495, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT
|
0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev
|
0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
|
0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
|
0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4504, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
|
0x7fb09f590488:ID: 0, pinned ID: 4370, metadata pinned ID: 0, name: WT_CURSOR.search
|
0x7fb09f590488:mod count: 0, snap min: 4370, snap max: 4375, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT
|
0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
|
0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
|
0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next
|
0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED
|
0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
|
0x7fb09f590488:cache dump
|
0x7fb09f590488:cache full: no
|
0x7fb09f590488:cache clean check: yes (97.548%)
|
0x7fb09f590488:cache dirty check: no (0.091%)
|
0x7fb09f590488:file:wt.wt(checkpoint=WiredTigerCheckpoint.1):
|
0x7fb09f590488:internal: 3 pages, 0MB, 3/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:file:wt.wt(<live>):
|
0x7fb09f590488:internal: 64 pages, 24MB, 64/0 clean/dirty pages, 24/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:leaf: 4 pages, 0MB, 4/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:file:WiredTigerLAS.wt(<live>):
|
0x7fb09f590488:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:leaf: 0 pages
|
0x7fb09f590488:file:WiredTiger.wt(<live>):
|
0x7fb09f590488:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:leaf: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page
|
0x7fb09f590488:cache dump: total found: 27MB vs tracked inuse 27MB
|
0x7fb09f590488:total dirty bytes: 0MB
|
[1554809849:462413][2330:0x7fb095f0b700], t, eviction-server: __wt_evict_thread_run, 321: cache eviction thread error: Connection timed out
|
[1554809849:462425][2330:0x7fb095f0b700], t, eviction-server: __wt_panic, 520: the process must exit and restart: WT_PANIC: WiredTiger library panic
|
[1554809849:462430][2330:0x7fb095f0b700], t, eviction-server: __wt_abort, 30: aborting WiredTiger library
|
/tmp/jenkins5083747742990501913.sh: line 24: 2330 Aborted nice ./t -1 -c ../../../test/format/CONFIG.stress
|
+ cleanup
|
+ status=134
|
The configuration:
############################################
|
# RUN PARAMETERS
|
############################################
|
abort=0
|
alter=0
|
auto_throttle=1
|
backups=1
|
bitcnt=8
|
bloom=1
|
bloom_bit_count=4
|
bloom_hash_count=32
|
bloom_oldest=0
|
cache=28
|
cache_minimum=20
|
checkpoints=on
|
checkpoint_log_size=44
|
checkpoint_wait=100
|
checksum=uncompressed
|
chunk_size=4
|
compaction=0
|
compression=none
|
data_extend=0
|
data_source=table
|
delete_pct=37
|
dictionary=0
|
direct_io=0
|
encryption=rotn-7
|
evict_max=0
|
file_type=row-store
|
firstfit=0
|
huffman_key=0
|
huffman_value=0
|
independent_thread_rng=1
|
in_memory=0
|
insert_pct=0
|
internal_key_truncation=1
|
internal_page_max=16
|
isolation=snapshot
|
key_gap=16
|
key_max=79
|
key_min=18
|
leaf_page_max=9
|
leak_memory=0
|
logging=1
|
logging_archive=1
|
logging_compression=zlib
|
logging_file_max=416851
|
logging_prealloc=1
|
long_running_txn=0
|
lsm_worker_threads=4
|
memory_page_max=1
|
merge_max=10
|
mmap=1
|
modify_pct=17
|
ops=0
|
prefix_compression=1
|
prefix_compression_min=7
|
prepare=0
|
quiet=1
|
read_pct=45
|
rebalance=1
|
repeat_data_pct=40
|
reverse=0
|
rows=1000000
|
runs=1
|
salvage=1
|
split_pct=75
|
statistics=0
|
statistics_server=1
|
threads=14
|
timer=4
|
timing_stress_aggressive_sweep=0
|
timing_stress_checkpoint=0
|
timing_stress_lookaside_sweep=0
|
timing_stress_split_1=0
|
timing_stress_split_2=0
|
timing_stress_split_3=0
|
timing_stress_split_4=0
|
timing_stress_split_5=0
|
timing_stress_split_6=0
|
timing_stress_split_7=0
|
timing_stress_split_8=0
|
transaction_timestamps=0
|
transaction-frequency=94
|
truncate=1
|
value_max=1148
|
value_min=17
|
verify=1
|
wiredtiger_config=
|
write_pct=1
|
############################################
|
Attachments
Issue Links
- causes
-
WT-5082 Application threads are tasked with eviction even when pinning the oldest transaction ID
-
- Closed
-
-
WT-5149 Clear the debugging field value so it cannot get stale
-
- Closed
-
- depends on
-
WT-4949 Spike: Cache stuck with clean pages with timestamp disabled format test
-
- Closed
-
- is related to
-
WT-5043 Add debugging to aid in test/format hang
-
- Closed
-
- related to
-
WT-5142 Don't create huge root pages when rebalancing or bulk-loading objects with overflow keys
-
- Closed
-
-
WT-5330 Read-only transactions are being rolled back in MongoDB
-
- Closed
-
-
WT-5266 application threads are tasked with eviction even when pinning the oldest transaction ID
-
- Closed
-