Type: Bug
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: Sweep Server
Storage Engines - Foundations
Evergreen patch: EVG Patch
Affected Tests (10 of 57):
1. unsupported_read_write_concerns.js
2. mongod_waits_for_cms.js
3. read_concern_target_time_wait.js
4. simple_replica_set.js
5. simple_one_node_restart.js
6. change_stream_can_read_from_secondary.js
7. large_oplog_batch_application.js
8. establish_connection_during_stepup.js
9. transaction_table_oplog_replay.js
10. replSetInitiate_no_config.js
Details:
During StorageEngineImpl::cleanShutdown(), the spill WiredTiger engine's shutdown path closes the WiredTiger connection via wiredtiger_close(). Inside that call, __wti_prefetch_destroy() tears down the prefetch thread group by calling __wt_thread_group_destroy() →
__thread_group_shrink() → __wt_session_close_internal(), which performs a memset (write) over a WiredTiger session struct in place. Concurrently, WiredTiger's internal sweep server thread (__sweep_server) is still running and calling __wt_session_array_walk(), which
performs an atomic read of the same session struct memory, so the two accesses race.
Thread A — shutdown:
memset ← WRITE (8 bytes)
__wt_session_close_internal
__thread_group_shrink
__wt_thread_group_destroy
__wti_prefetch_destroy
__conn_close
SpillWiredTigerKVEngine::cleanShutdown()
StorageEngineImpl::cleanShutdown()
Thread B — sweep server:
__wt_session_array_walk ← ATOMIC READ (same address)
__sweep_server
Root Cause:
WiredTiger's __conn_close shuts down the prefetch thread group — and closes those threads' sessions — before stopping the sweep server thread. The sweep server's __wt_session_array_walk continues reading the session array during this window, overlapping with the
memset that zeros out the session struct being freed. This is a WiredTiger-internal shutdown ordering bug: the sweep server should be stopped before any sessions are destroyed.
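Illustration of the proposed ordering:
The sketch below models the suggested fix in standalone C++; it is not WiredTiger code. Session, Connection, sweep_server, conn_open and conn_close are hypothetical stand-ins for the session array, __sweep_server and __conn_close. The sketch assumes the sweep thread can simply be signalled and joined before any session slot is wiped. The real fix may need to account for other internal threads that also walk the session array.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for one slot in the session array
// (not the real WT_SESSION_IMPL layout).
struct Session {
    std::atomic<int> active{0};  // flag the walker reads atomically
    int payload = 0;             // plain data wiped at close time
};

// Hypothetical stand-in for the connection that owns the sessions
// and the sweep thread.
struct Connection {
    std::array<Session, 8> sessions;
    std::atomic<bool> sweep_run{false};
    std::atomic<long> walks{0};
    std::thread sweep;
};

// Models the sweep server: walk the session array until told to stop.
void sweep_server(Connection& conn) {
    while (conn.sweep_run.load(std::memory_order_acquire)) {
        for (auto& s : conn.sessions) {
            (void)s.active.load(std::memory_order_acquire);
        }
        conn.walks.fetch_add(1, std::memory_order_relaxed);
        std::this_thread::yield();
    }
}

// Marks every session as open, then starts the sweep thread.
void conn_open(Connection& conn) {
    for (auto& s : conn.sessions) {
        s.payload = 42;
        s.active.store(1, std::memory_order_release);
    }
    conn.sweep_run.store(true, std::memory_order_release);
    conn.sweep = std::thread([&conn] { sweep_server(conn); });
}

// Proposed ordering: stop and join the walker first, then wipe sessions.
void conn_close(Connection& conn) {
    conn.sweep_run.store(false, std::memory_order_release);
    if (conn.sweep.joinable()) {
        conn.sweep.join();  // after this, no thread walks the array
    }
    // Safe to wipe: the join above orders these writes after every read
    // made by the sweep thread. The real code uses memset; fields are
    // reset individually here because std::atomic must not be memset.
    for (auto& s : conn.sessions) {
        s.payload = 0;
        s.active.store(0, std::memory_order_relaxed);
    }
}
```

Calling conn_open() followed by conn_close() on a Connection exercises the ordering. Moving the session wipe in conn_close() ahead of the join would reintroduce the overlapping write and read described above.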