[SERVER-49714] Oplog visibility thread may read from unowned memory when multiple oplog collections present Created: 17/Jul/20  Updated: 29/Oct/23  Resolved: 27/Jul/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 4.5.1
Fix Version/s: 4.4.8

Type: Bug Priority: Major - P3
Reporter: Ian Boros Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: techdebt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-47885 Have lookupCollectionBy{Namespace,UUI... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2021-07-12, Execution Team 2021-07-26, Execution Team 2021-08-09
Linked BF Score: 21

 Description   

The WiredTigerKVEngine maintains a counter for how many WiredTigerRecordStores are "oplog-like" (namespace of the form local.oplog.*). When a record store with an oplog-like namespace is created, the counter is bumped. Similarly, when the record store for one of these collections is destroyed, the count is decreased. Only when the reference count hits zero do we join with the oplog visibility background thread.

~WiredTigerRecordStore()'s call to WiredTigerKVEngine::haltOplogManager() does not guarantee that the background oplog visibility thread has actually stopped: if other record stores with oplog-like namespaces still exist, the thread keeps running. Unfortunately, that thread may hold a pointer to the WiredTigerRecordStore being destroyed, so in rare circumstances it can read from freed memory.

If I am right, these events should trigger the bug (a self-contained sketch of the sequence follows the list):
1) Create collection local.oplog.a
2) <Oplog visibility thread now holds a pointer to the WiredTigerRecordStore for local.oplog.a>
3) Create collection local.oplog.b
4) Destroy WiredTigerRecordStore for local.oplog.a
5) Oplog visibility thread continues running, and dereferences its pointer to the now-destroyed WiredTigerRecordStore
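
To make the sequence concrete, here is a minimal, self-contained sketch of the pattern described above, written as a toy program. The class and member names (KVEngine, OplogManager, onOplogStoreCreated, and so on) are illustrative stand-ins rather than the actual WiredTiger code; only haltOplogManager(), the role of WiredTigerRecordStore, and the local.oplog.* namespaces come from this report.

{code:cpp}
// Standalone demo of the sequence above (illustrative names, not MongoDB
// source): the visibility thread is pinned to the first oplog-like record
// store, and is only joined once the engine's oplog counter drops to zero.
#include <atomic>
#include <cassert>
#include <chrono>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>

struct RecordStore {
    explicit RecordStore(std::string name) : ns(std::move(name)) {}
    std::string ns;
};

class OplogManager {
public:
    void start(const RecordStore* rs) {
        _pinned = rs;         // the visibility thread works against this store
        _running = true;
        _thread = std::thread([this] {
            while (_running)  // the real thread would read *_pinned here
                std::this_thread::sleep_for(std::chrono::milliseconds(5));
        });
    }
    void halt() {
        _running = false;
        if (_thread.joinable())
            _thread.join();
    }
    bool running() const { return _running; }
    const RecordStore* pinned() const { return _pinned; }

private:
    std::atomic<bool> _running{false};
    const RecordStore* _pinned = nullptr;
    std::thread _thread;
};

class KVEngine {
public:
    void onOplogStoreCreated(const RecordStore* rs) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_oplogCount++ == 0)
            _manager.start(rs);  // thread is pinned to the *first* oplog-like store
    }
    void haltOplogManager() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_oplogCount > 0);
        if (--_oplogCount == 0)
            _manager.halt();     // join only when the last oplog-like store goes away
    }
    const OplogManager& manager() const { return _manager; }

private:
    std::mutex _mutex;
    int _oplogCount = 0;
    OplogManager _manager;
};

int main() {
    KVEngine engine;

    auto a = std::make_unique<RecordStore>("local.oplog.a");  // step 1
    engine.onOplogStoreCreated(a.get());                      // step 2: thread pins a
    auto b = std::make_unique<RecordStore>("local.oplog.b");  // step 3
    engine.onOplogStoreCreated(b.get());

    engine.haltOplogManager();  // step 4: a is being destroyed, but the count is 1...
    a.reset();                  // ...so the thread was not joined and a's memory is freed

    // Step 5: the thread is still alive and manager().pinned() still names the
    // freed store; any dereference of that pointer would be a use-after-free.
    std::cout << "visibility thread still running: " << std::boolalpha
              << engine.manager().running() << '\n';

    engine.haltOplogManager();  // last oplog-like store gone: now the thread joins
    return 0;
}
{code}

The point of the sketch is that only the final haltOplogManager() call joins the thread, so between step 4 and that last call the thread's captured pointer refers to freed memory.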


As far as I can tell, the order in which we destroy in-memory state about collections on shutdown is not specified or guaranteed (my reading of the teardown code is that it is just a loop over an unordered_map). My guess is that if there are multiple oplog-like collections and the first one we destroy happens to be the one registered with the oplog visibility thread, there is a brief window during which the visibility thread can read unowned memory.
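
For what it's worth, the unordered_map point is easy to see with a toy example (hypothetical map contents, not the real catalog code): iteration order, and therefore teardown order in a loop like the one described, is unspecified.

{code:cpp}
// Toy illustration (hypothetical map contents, not the real catalog code):
// unordered_map iteration order is unspecified, so a teardown loop like the
// one described gives no guarantee about which record store dies first.
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> catalog{
        {"local.oplog.a", 1}, {"local.oplog.b", 2}, {"test.other", 3}};

    // Which element is visited (and, in a real teardown loop, destroyed)
    // first depends on hashing and bucket layout, not on creation order.
    for (const auto& [ns, id] : catalog)
        std::cout << "tearing down " << ns << " (id " << id << ")\n";
    return 0;
}
{code}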



 Comments   
Comment by Githook User [ 27/Jul/21 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@mongodb.com', 'username': 'DiannaHohensee'}

Message: SERVER-49714 Specially support only a single oplog collection, local.oplog.rs, rather than local.oplog.*
Branch: v4.4
https://github.com/mongodb/mongo/commit/e21a59d53ffb6ebd0fe3194057d4850ff7f6c3f3

Comment by Connie Chen [ 23/Jul/20 ]

We're just going to pull out multiple oplog support
