[SERVER-32574] Repairing the local database can cause the WT oplog manager thread to permanently exit. Created: 07/Jan/18  Updated: 30/Oct/23  Resolved: 23/Jan/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.6.4, 3.7.2

Type: Bug Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Geert Bosch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.6
Sprint: Storage 2018-01-29
Participants:
Linked BF Score: 0

 Description   

The WiredTigerOplogManager's purpose is to advance oplog visibility. Without it, a node can stop responding to requests. The WiredTigerOplogManager is constructed only once, and its lifetime is bound to the WiredTigerKVEngine. However, the background thread that manages oplog visibility starts and stops as the oplog is created and destroyed. I'm not justifying these details, just stating them.

There was an assumption that the oplog manager would never be restarted in production. That assumption does not hold: repairing the local database recreates the oplog record store. When the record store is destroyed, the background thread is halted. When the record store is recreated, the thread is restarted. However, the restart is racy. The new thread may start and read _shuttingDown as true before the spawning thread resets it to false. This causes the newly spawned thread to shut down immediately, leaving no thread to advance oplog visibility.
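
To make the ordering problem concrete, here is a minimal sketch of one way to make such a restart safe. It is not the code from the actual fix. VisibilityManager, start(), halt(), waitUntilRunning() and isRunning() are hypothetical names used only for illustration; only the _shuttingDown flag comes from the description above. The idea is to reset the flag under the mutex before the thread is spawned, so the new thread can never observe a stale true value. The sketch assumes start() and halt() are called sequentially by one controlling thread, as in the destroy-then-recreate sequence during repair.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an oplog visibility manager; not MongoDB's class.
class VisibilityManager {
public:
    ~VisibilityManager() { halt(); }

    // Starts the background thread. Assumes the caller serializes start()/halt().
    void start() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_thread.joinable())
            return;  // already running
        // Reset the flag BEFORE spawning. The racy ordering would spawn first
        // and reset afterwards, letting the new thread see a stale 'true'.
        _shuttingDown = false;
        _thread = std::thread([this] { loop(); });
    }

    // Signals the background thread to exit and waits for it.
    void halt() {
        std::thread toJoin;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (!_thread.joinable())
                return;  // not running
            _shuttingDown = true;
            toJoin = std::move(_thread);
        }
        _cv.notify_all();
        toJoin.join();  // join outside the lock; loop() needs the mutex to exit
    }

    // Waits up to five seconds for the background loop to be entered.
    bool waitUntilRunning() {
        std::unique_lock<std::mutex> lk(_mutex);
        return _cv.wait_for(lk, std::chrono::seconds(5), [this] { return _running; });
    }

    bool isRunning() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _running;
    }

private:
    void loop() {
        // Blocks here until start() releases the mutex, so the flag is already reset.
        std::unique_lock<std::mutex> lk(_mutex);
        _running = true;
        _cv.notify_all();
        while (!_shuttingDown) {
            _cv.wait(lk);  // a real manager would do its visibility work here
        }
        _running = false;
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::thread _thread;
    bool _shuttingDown = false;
    bool _running = false;
};
```

With this ordering, repeated start()/halt() cycles on the same object should leave a live background thread after every start(), which is the property the repair path depends on.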



 Comments   
Comment by Githook User [ 28/Feb/18 ]

Author: Geert Bosch <geert@mongodb.com> (GeertBosch)

Message: SERVER-32574 Fix oplog thread restart race in local DB repair

(cherry picked from commit 7a800bc2edf646ff0df3fa2bb4975fd05bd41298)
Branch: v3.6
https://github.com/mongodb/mongo/commit/47a2ab3643cc9b3b1f00decfd3e8f3745283affc

Comment by Githook User [ 23/Jan/18 ]

Author: Geert Bosch <geert@mongodb.com> (GeertBosch)

Message: SERVER-32574 Fix oplog thread restart race in local DB repair
Branch: master
https://github.com/mongodb/mongo/commit/7a800bc2edf646ff0df3fa2bb4975fd05bd41298

Generated at Thu Feb 08 04:30:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.