Core Server / SERVER-32574

Repairing the local database can cause the WT oplog manager thread to permanently exit.

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.4, 3.7.2
    • Affects Version/s: None
    • Component/s: Storage
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.6
    • Sprint: Storage 2018-01-29
    • 0

      The purpose of the WiredTigerOplogManager is to advance oplog visibility. Without it, a node can stop responding to requests. The WiredTigerOplogManager is constructed only once, and its lifetime is bound to that of the WiredTigerKVEngine. However, the background thread that manages oplog visibility starts and stops as the oplog is created and destroyed. I'm not justifying these details, just stating them.

      There was an assumption that the oplog manager would never be restarted in production. That assumption was wrong. Repairing the local database recreates the oplog record store. When the record store is destroyed, the background thread is halted. When the record store is recreated, the thread is restarted. However, the restart has a race: the new thread may start and read _shuttingDown as true before the spawning thread resets it to false. This causes the newly spawned thread to shut down immediately, leaving no thread to advance oplog visibility.
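
      The race can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `BackgroundManager`, `start()`, `halt()`, and `survivedStartup()` are stand-ins invented for illustration, not the actual WiredTigerOplogManager code, and the ordering shown is only one way to avoid the race, not necessarily the fix that was committed. The idea is to reset the shutdown flag under the mutex before the thread is spawned, so the new thread can never observe the stale `true` left by the previous halt.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a manager whose background thread is started
// and halted repeatedly. This is NOT the real WiredTigerOplogManager.
class BackgroundManager {
public:
    ~BackgroundManager() { halt(); }

    void start() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(!_thread.joinable());  // must be halted before restarting
        // Reset the flag BEFORE spawning. The racy ordering spawns first
        // and resets afterwards, so the new thread may read the stale
        // `true` from the previous halt() and exit immediately.
        _shuttingDown = false;
        _checked = false;
        _alive = false;
        _thread = std::thread([this] { _loop(); });
    }

    void halt() {
        std::thread t;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (!_thread.joinable()) return;  // nothing to stop
            _shuttingDown = true;
            t = std::move(_thread);
        }
        _cv.notify_all();
        t.join();  // join outside the lock so the thread can wake up
    }

    // Blocks until the thread from the latest start() has taken its first
    // look at the flag; returns true if it decided to keep running.
    bool survivedStartup() {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait(lk, [this] { return _checked; });
        return _alive;
    }

private:
    void _loop() {
        std::unique_lock<std::mutex> lk(_mutex);
        _alive = !_shuttingDown;  // first read of the flag
        _checked = true;
        _cv.notify_all();
        // Real work would happen here; the sketch just sleeps until halt().
        _cv.wait(lk, [this] { return _shuttingDown; });
        _alive = false;
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::thread _thread;
    bool _shuttingDown = false;
    bool _checked = false;
    bool _alive = false;
};
```

      Because `start()` holds the mutex while it resets the flag and creates the thread, the new thread's first read of `_shuttingDown` is ordered after the reset. A start, halt, start sequence (the pattern a repair of the local database produces) therefore leaves the second thread running.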

            geert.bosch@mongodb.com Geert Bosch
            daniel.gottlieb@mongodb.com Daniel Gottlieb (Inactive)