- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Concurrency
- None
- Storage Engines - Foundations
- 13.012
- None
- None
Problem
ThreadSanitizer reported three data races in the switch-mode / layered-storage TSan variant of the format test. In each case a wt_shared field that is written with an atomic primitive is read with a plain (non-atomic) load, which TSan treats as a data race.
Race 1 – btree->original in __wt_btree_disable_bulk
btree->original is a wt_shared uint8_t. The early-exit guard in btree_inline.h used a plain read while another thread concurrently performed __wt_atomic_cas_uint8 on the same byte (the one-time CAS that disables bulk-load mode). Two layered insert threads racing on the same freshly-opened B-Tree trigger this.
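The pattern behind Race 1 can be sketched with C11 `<stdatomic.h>`, used here only as a stand-in for WiredTiger's own `__wt_atomic_*` wrappers. The names `demo_btree` and `demo_disable_bulk` are hypothetical and not taken from the source tree. The sketch shows a relaxed atomic load as the early-exit guard, followed by the one-time CAS.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the B-Tree handle; only the shared flag is modeled. */
struct demo_btree {
    _Atomic uint8_t original; /* 1 while the tree is still freshly opened */
};

/*
 * Hypothetical helper: returns true only for the single caller that performs
 * the 1 -> 0 transition, false for every other caller.
 */
static bool
demo_disable_bulk(struct demo_btree *btree)
{
    uint8_t expected = 1;

    /* Early-exit guard: a relaxed atomic load instead of a plain read. */
    if (!atomic_load_explicit(&btree->original, memory_order_relaxed))
        return false;

    /* One-time transition: only one thread can win this compare-and-swap. */
    return atomic_compare_exchange_strong(&btree->original, &expected, 0);
}
```

If two threads call `demo_disable_bulk` on the same handle, both may pass the guard, but only one wins the CAS. The guard needs to be an atomic access only so that it does not conflict with the concurrent CAS on the same byte.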
Race 2 – conn->version_cursor_count read in __wt_txn_pinned_timestamp
version_cursor_count is a wt_shared uint32_t that is incremented atomically in __wt_curversion_open (__wt_atomic_add_uint32) and decremented atomically in __curversion_close (__wt_atomic_sub_uint32). The fast-path check in txn_inline.h used a plain load while a drain-worker thread performed an atomic add/sub concurrently.
Race 3 – conn->version_cursor_count read in __wt_curversion_open
Same field as Race 2. The check inside the txn-global write-lock section of __wt_curversion_open used a plain load, but __curversion_close decrements the counter atomically without acquiring that lock, so the lock does not prevent the race.
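The counter pattern shared by Races 2 and 3 can be sketched the same way. This is a minimal illustration using C11 `<stdatomic.h>` rather than WiredTiger's wrappers, and `demo_conn`, `demo_cursor_open`, `demo_cursor_close` and `demo_has_version_cursors` are hypothetical names. The point is that once the writers use atomic add/sub, every reader must use an atomic load as well, whether or not it holds a lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection; only the shared counter is modeled. */
struct demo_conn {
    _Atomic uint32_t version_cursor_count;
};

/* Writer side: atomic increment when a version cursor is opened. */
static void
demo_cursor_open(struct demo_conn *conn)
{
    atomic_fetch_add(&conn->version_cursor_count, 1);
}

/* Writer side: atomic decrement on close, performed without any lock. */
static void
demo_cursor_close(struct demo_conn *conn)
{
    atomic_fetch_sub(&conn->version_cursor_count, 1);
}

/* Reader side: advisory check using a relaxed atomic load, not a plain read. */
static bool
demo_has_version_cursors(struct demo_conn *conn)
{
    return atomic_load_explicit(&conn->version_cursor_count, memory_order_relaxed) > 0;
}
```

Because the close path never takes a lock in this sketch, holding a lock around the reader would not order it against the decrement. Only making the read itself atomic removes the conflict.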
Fix
Replace each plain read with the matching __wt_atomic_load_*_relaxed call:
| File | Old | New |
|---|---|---|
| src/include/btree_inline.h | !btree->original | !__wt_atomic_load_uint8_relaxed(&btree->original) |
| src/include/txn_inline.h | S2C(session)->version_cursor_count > 0 | __wt_atomic_load_uint32_relaxed(&S2C(session)->version_cursor_count) > 0 |
| src/cursor/cur_version.c | conn->version_cursor_count == 0 | __wt_atomic_load_uint32_relaxed(&conn->version_cursor_count) == 0 |
Relaxed ordering is sufficient in all three cases: the checks are advisory fast-paths or one-shot transition guards, and the subsequent CAS / rwlock operations provide the necessary acquire/release ordering for dependent side-effects.