[SERVER-49714] Oplog visibility thread may read from unowned memory when multiple oplog collections present Created: 17/Jul/20 Updated: 29/Oct/23 Resolved: 27/Jul/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 4.5.1 |
| Fix Version/s: | 4.4.8 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ian Boros | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | techdebt | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Sprint: | Execution Team 2021-07-12, Execution Team 2021-07-26, Execution Team 2021-08-09 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 21 | ||||||||||||
| Description |
|
The WiredTigerKVEngine maintains a counter for how many WiredTigerRecordStores are "oplog-like" (namespace of the form local.oplog.*). When a record store with an oplog-like namespace is created, the counter is bumped. Similarly, when the record store for one of these collections is destroyed, the count is decreased. Only when the reference count hits zero do we join with the oplog visibility background thread.

~WTRecordStore()'s call to WiredTigerKVEngine::haltOplogManager() does not guarantee that the background oplog visibility thread has actually stopped. If there are other RecordStores with oplog-like namespaces, the oplog visibility thread will continue running. Unfortunately, this thread may hold a pointer to the WiredTigerRecordStore being destroyed, which means that in very rare circumstances, the background thread will read from freed memory.

If I am right, these events should trigger the bug:
4) Destroy WiredTigerRecordStore for local.oplog.a

As far as I can tell, the order in which we destroy in-memory state about collections on shutdown is not specified/guaranteed (based on my reading of this, which is just a loop over an unordered_map). My guess is that if there are multiple oplog-like collections, and the first one we destroy happens to be the one registered with the oplog visibility thread, there is a brief window during which the visibility thread can read unowned memory. |
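
The lifetime problem described above can be sketched in isolation. This is a minimal model under stated assumptions, not the server's code: FakeOplogManager, FakeRecordStore, and threadOutlivesRegisteredStore are hypothetical names invented for illustration, and the manager simply remembers the first store passed to start(). The background thread here never dereferences the stored pointer, so the sketch itself stays well-defined; it only shows that the thread can still be running, and still tracking the destroyed store's address, after that store's destructor returns.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <thread>

// Hypothetical stand-ins for illustration only; not MongoDB's actual classes.
class FakeRecordStore;

class FakeOplogManager {
public:
    // Called when an oplog-like record store is created.
    void start(const FakeRecordStore* store) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_refCount++ == 0) {
            _registered = store;  // only the first store is handed to the thread
            _shutdown = false;
            _thread = std::thread([this] { _loop(); });
        }
    }

    // Called from the record store destructor.
    void halt() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            assert(_refCount > 0);
            if (--_refCount > 0) {
                return;  // other oplog-like stores remain, so the thread keeps running
            }
            _shutdown = true;
        }
        _cv.notify_all();
        _thread.join();
        std::lock_guard<std::mutex> lk(_mutex);
        _registered = nullptr;
    }

    bool threadRunning() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _thread.joinable();
    }

    const FakeRecordStore* registered() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _registered;
    }

private:
    void _loop() {
        std::unique_lock<std::mutex> lk(_mutex);
        // A real visibility thread would dereference _registered while it runs.
        // This sketch deliberately never does, so it has no undefined behavior.
        _cv.wait(lk, [this] { return _shutdown; });
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::thread _thread;
    const FakeRecordStore* _registered = nullptr;
    int _refCount = 0;
    bool _shutdown = false;
};

class FakeRecordStore {
public:
    explicit FakeRecordStore(FakeOplogManager& mgr) : _mgr(mgr) { _mgr.start(this); }
    FakeRecordStore(const FakeRecordStore&) = delete;
    ~FakeRecordStore() { _mgr.halt(); }

private:
    FakeOplogManager& _mgr;
};

// Returns true if the thread is still running and still tracking the address
// of store "a" after "a" has been destroyed.
bool threadOutlivesRegisteredStore(bool destroyRegisteredFirst) {
    FakeOplogManager mgr;
    std::optional<FakeRecordStore> a, b;
    a.emplace(mgr);  // stands in for local.oplog.a; registered with the thread
    b.emplace(mgr);  // a second oplog-like store; only bumps the count
    const FakeRecordStore* addrA = &*a;
    (destroyRegisteredFirst ? a : b).reset();  // count drops from 2 to 1, no join
    bool stale = !a.has_value() && mgr.threadRunning() && mgr.registered() == addrA;
    a.reset();  // destroy whichever store is left; the count reaches 0 and
    b.reset();  // the thread is joined (reset on an empty optional is a no-op)
    return stale;
}
```

In this model, threadOutlivesRegisteredStore(true) returns true and threadOutlivesRegisteredStore(false) returns false, which mirrors the dependence on destruction order: the window only opens when the store registered with the thread is destroyed while another oplog-like store keeps the count above zero.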
| Comments |
| Comment by Githook User [ 27/Jul/21 ] |
|
Author: Dianna Hohensee <dianna.hohensee@mongodb.com> (DiannaHohensee) Message: |
| Comment by Connie Chen [ 23/Jul/20 ] |
|
We're just going to pull out multiple oplog support.