-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
8
-
Storage Engines 2020-01-27
-
v4.2, v4.0
Summary:
The tree walk code has neither a hazard pointer nor a lock to block eviction; change the walk code to lock the WT_REF structure before reading the WT_REF.addr field.
The race-condition test/format sanitizer job captured a heap-use-after-free during a btree walk with the next_random configuration:
http://build.wiredtiger.com:8080/job/wiredtiger-test-race-condition-stress-sanitizer/33913/
==63102==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600066f2e8 at pc 0x000000a2007a bp 0x7fa5370efab0 sp 0x7fa5370efaa8
READ of size 8 at 0x60600066f2e8 thread T79
    #0 0xa20079 in __wt_ref_info /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/include/btree.i:1073:24
    #1 0xa1fbe6 in __ref_is_leaf /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/btree/bt_walk.c:90:5
    #2 0xa1965a in __tree_walk_skip_count_callback /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/btree/bt_walk.c:607:35
    #3 0xa188e7 in __tree_walk_internal /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/btree/bt_walk.c:482:17
    #4 0xa1941a in __wt_tree_walk_skip /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/btree/bt_walk.c:631:9
    #5 0x970763 in __wt_btcur_next_random /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/btree/bt_random.c:580:9
    #6 0xaca00f in __wt_curfile_next_random /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/cursor/cur_file.c:118:5
    #7 0x52ce91 in random_kv /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/random.c:73:27
    #8 0x4dde52 in __asan::AsanThread::ThreadStart(unsigned long, __sanitizer::atomic_uintptr_t*) (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x4dde52)
    #9 0x7fa54b8e836c in start_thread (/lib64/libpthread.so.0+0x736c)
    #10 0x7fa54aabcb4e in __GI___clone (/lib64/libc.so.6+0x110b4e)

0x60600066f2e8 is located 40 bytes inside of 56-byte region [0x60600066f2c0,0x60600066f2f8)
freed by thread T71 here:
    #0 0x4d01e8 in __interceptor_free.localalias.0 (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x4d01e8)
    #1 0x6e2299 in __wt_free_int /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/os_common/os_alloc.c:301:5
    #2 0x757922 in __wt_ref_addr_free /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/include/btree.i:640:9
    #3 0x75557c in __wt_ref_block_free /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/include/btree.i:1116:5
    #4 0x7457e7 in __rec_write_wrapup /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/reconcile/rec_write.c:2194:9
    #5 0x735ff6 in __reconcile /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/reconcile/rec_write.c:209:28
    #6 0x734d2c in __wt_reconcile /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/reconcile/rec_write.c:102:11
    #7 0x655f40 in __evict_review /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_page.c:671:11
    #8 0x652249 in __wt_evict /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_page.c:149:5
    #9 0x63552e in __evict_page /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_lru.c:2214:5
    #10 0x63256f in __wt_cache_eviction_worker /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_lru.c:2304:23
    #11 0x84e5ea in __wt_cache_eviction_check /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/include/cache.i:428:13
    #12 0x84f848 in __wt_txn_rollback /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/txn/txn.c:1278:9
    #13 0x7bb28e in __session_rollback_transaction /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/session/session_api.c:1742:5
    #14 0x52b8c2 in rollback_transaction /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:452:5
    #15 0x51f27c in ops /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:978:13
    #16 0x4dde52 in __asan::AsanThread::ThreadStart(unsigned long, __sanitizer::atomic_uintptr_t*) (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x4dde52)

previously allocated by thread T75 here:
    #0 0x4d05a8 in calloc (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x4d05a8)
    #1 0x6e0ae0 in __wt_calloc /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/os_common/os_alloc.c:50:14
    #2 0x657bae in __evict_page_dirty_update /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_page.c:393:13
    #3 0x652c3e in __wt_evict /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_page.c:192:9
    #4 0x63552e in __evict_page /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_lru.c:2214:5
    #5 0x63256f in __wt_cache_eviction_worker /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/evict/evict_lru.c:2304:23
    #6 0x84e5ea in __wt_cache_eviction_check /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/include/cache.i:428:13
    #7 0x84f848 in __wt_txn_rollback /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/txn/txn.c:1278:9
    #8 0x7bb28e in __session_rollback_transaction /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/session/session_api.c:1742:5
    #9 0x52b8c2 in rollback_transaction /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:452:5
    #10 0x51f27c in ops /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:978:13
    #11 0x4dde52 in __asan::AsanThread::ThreadStart(unsigned long, __sanitizer::atomic_uintptr_t*) (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x4dde52)

Thread T79 created by T0 here:
    #0 0x433890 in pthread_create (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x433890)
    #1 0x6ff760 in __wt_thread_create /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/os_posix/os_thread.c:28:5
    #2 0x519ed3 in wts_ops /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:186:9
    #3 0x5355bf in main /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/t.c:212:17
    #4 0x7fa54a9cc889 in __libc_start_main (/lib64/libc.so.6+0x20889)

Thread T71 created by T0 here:
    #0 0x433890 in pthread_create (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x433890)
    #1 0x6ff760 in __wt_thread_create /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/os_posix/os_thread.c:28:5
    #2 0x519896 in wts_ops /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:169:9
    #3 0x5355bf in main /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/t.c:212:17
    #4 0x7fa54a9cc889 in __libc_start_main (/lib64/libc.so.6+0x20889)

Thread T75 created by T0 here:
    #0 0x433890 in pthread_create (/work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/t+0x433890)
    #1 0x6ff760 in __wt_thread_create /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/../src/os_posix/os_thread.c:28:5
    #2 0x519896 in wts_ops /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/ops.c:169:9
    #3 0x5355bf in main /work/jenkins/workspace/wiredtiger-test-race-condition-stress-sanitizer/build_posix/test/format/../../../test/format/t.c:212:17
    #4 0x7fa54a9cc889 in __libc_start_main (/lib64/libc.so.6+0x20889)
The configuration:
############################################
# RUN PARAMETERS
############################################
abort=0
alter=0
assert_commit_timestamp=0
assert_read_timestamp=0
auto_throttle=1
backups=0
bitcnt=4
bloom=1
bloom_bit_count=59
bloom_hash_count=11
bloom_oldest=0
cache=440
cache_minimum=20
checkpoints=on
checkpoint_log_size=64
checkpoint_wait=36
checksum=uncompressed
chunk_size=3
compaction=0
compression=snappy
data_extend=0
data_source=table
delete_pct=8
dictionary=1
direct_io=0
encryption=none
evict_max=1
file_type=row-store
firstfit=0
huffman_key=0
huffman_value=0
independent_thread_rng=1
in_memory=0
insert_pct=46
internal_key_truncation=0
internal_page_max=13
isolation=snapshot
key_gap=5
key_max=113
key_min=26
leaf_page_max=13
leak_memory=0
logging=1
logging_archive=1
logging_compression=snappy
logging_file_max=411791
logging_prealloc=1
long_running_txn=0
lsm_worker_threads=4
memory_page_max=10
merge_max=6
mmap=1
modify_pct=42
ops=0
prefix_compression=0
prefix_compression_min=4
prepare=0
quiet=1
random_cursor=1
read_pct=3
rebalance=1
repeat_data_pct=56
reverse=0
rows=1000000
runs=1
salvage=1
split_pct=67
statistics=0
statistics_server=0
threads=22
timer=4
timing_stress_aggressive_sweep=0
timing_stress_checkpoint=0
timing_stress_lookaside_sweep=0
timing_stress_split_1=1
timing_stress_split_2=0
timing_stress_split_3=1
timing_stress_split_4=0
timing_stress_split_5=0
timing_stress_split_6=1
timing_stress_split_7=1
timing_stress_split_8=0
transaction_timestamps=1
transaction-frequency=100
truncate=1
value_max=3782
value_min=17
verify=1
wiredtiger_config=
write_pct=1
############################################
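For readers unfamiliar with the pattern named in the summary, the following is a minimal sketch of "lock the ref before reading its addr field". It is not WiredTiger's code: the `ref`, `addr`, `ref_try_lock`, `ref_unlock`, `ref_is_leaf` and `ref_replace_addr` names, the three states, and the use of C11 atomics are all assumptions made for illustration. It also assumes the freeing side takes the same lock, which the summary does not state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not WiredTiger's actual definitions. */
enum ref_state { REF_DISK, REF_MEM, REF_LOCKED };

struct addr {
    bool is_leaf; /* the fact the walk wants to learn */
};

struct ref {
    _Atomic int state;
    struct addr *addr; /* may be freed and replaced by another thread */
};

/* Swap the current state for REF_LOCKED; fail if another thread holds the lock. */
static bool
ref_try_lock(struct ref *ref, int *previous)
{
    int cur = atomic_load(&ref->state);

    if (cur == REF_LOCKED)
        return false;
    if (!atomic_compare_exchange_strong(&ref->state, &cur, REF_LOCKED))
        return false;
    *previous = cur;
    return true;
}

/* Restore the state that was saved when the lock was taken. */
static void
ref_unlock(struct ref *ref, int previous)
{
    assert(atomic_load(&ref->state) == REF_LOCKED);
    atomic_store(&ref->state, previous);
}

/* Walk side: copy what is needed out of addr while holding the lock. */
static bool
ref_is_leaf(struct ref *ref, bool *is_leaf)
{
    int previous;

    if (!ref_try_lock(ref, &previous))
        return false; /* caller retries or skips this ref */
    *is_leaf = ref->addr != NULL && ref->addr->is_leaf;
    ref_unlock(ref, previous);
    return true;
}

/* Freeing side (assumed): replace addr under the same lock. */
static bool
ref_replace_addr(struct ref *ref, struct addr *new_addr)
{
    int previous;

    if (!ref_try_lock(ref, &previous))
        return false;
    free(ref->addr);
    ref->addr = new_addr;
    ref_unlock(ref, previous);
    return true;
}
```

The reader copies the value it needs while the lock is held and never keeps the `addr` pointer afterwards, so a later free by another thread cannot be observed. A failed `ref_try_lock` is reported to the caller rather than spun on; whether the real walk retries or skips is not something this ticket specifies.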
- causes
-
WT-5481 DIAGNOSTIC split code assert can race with WT_REF locking
- Closed
-
WT-5489 page-read can race with threads locking in-memory page structures
- Closed
-
WT-5518 Split-parent code can race with other threads when checking the WT_REF.state
- Closed
-
WT-5557 Fix the wrong page type returned when checking on-page cell
- Closed
-
WT-7049 test/format heap use after free on 4.2 branch
- Closed
-
WT-5525 Free up 3B in the WT_REF structure
- Closed
-
WT-5647 replace the WT_REF structure's WT_REF_READING state with a flag
- Closed
-
WT-5648 Add a leaf or internal page type flag to the WT_REF structure
- Closed
-
WT-5649 Refactor WT_REF locking, review all WT_REF.addr reads for locking issues
- Closed
- is depended on by
-
WT-5372 Skip known errors for long-running format stress sanitizer tasks
- Closed
- is duplicated by
-
WT-5198 "Invalid read of size" error captured by valgrind in a test/format run
- Closed
- related to
-
WT-9156 Fix use-after-free failure in v4.2
- Closed
-
WT-5552 Checkpoint reconciliation and page splits free the WT_REF.addr field without locking
- Closed