Core Server / SERVER-48279

Race in WiredTigerRecordStore::OplogStones::awaitHasExcessStonesOrDead


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.4.0-rc10, 4.7.0
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.4
    • Sprint: Execution Team 2020-06-15
    • Linked BF Score: 14

      Description

      WiredTigerRecordStore::OplogStones uses two mutexes for synchronization: _oplogReclaimMutex (outer) and _mutex (inner).

      Code that notifies the OplogCapMaintainerThread locks only the inner mutex and signals the condition variable to wake the thread. Example here: https://github.com/mongodb/mongo/blob/20de257ec7f9f1def474e7a62375df364ae85f4b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp#L199-L220

      But there is a window here https://github.com/mongodb/mongo/blob/20de257ec7f9f1def474e7a62375df364ae85f4b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp#L258-L259 where the OplogCapMaintainerThread has not yet started waiting on the condition variable, so the notify call does nothing. The thread then waits forever until something else happens to issue a _pokeReclaimThreadIfNeeded() call. Tests that do nothing else will eventually time out.
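
      A minimal sketch of the lost-wakeup window, using simplified hypothetical names (the real members live on WiredTigerRecordStore::OplogStones; this is an illustration, not the actual implementation):

          #include <condition_variable>
          #include <mutex>

          std::mutex _oplogReclaimMutex;            // outer mutex
          std::mutex _mutex;                        // inner mutex
          std::condition_variable _oplogReclaimCv;
          bool _excessStones = false;               // stand-in for the "has work" state

          // Notifier path: takes only the inner mutex, as in the linked code.
          void pokeReclaimThread() {
              std::lock_guard<std::mutex> lk(_mutex);
              _excessStones = true;
              _oplogReclaimCv.notify_one();         // a no-op if nobody is waiting yet
          }

          // OplogCapMaintainerThread: waits on the CV under the outer mutex.
          void awaitExcessStones() {
              std::unique_lock<std::mutex> reclaimLk(_oplogReclaimMutex);
              while (true) {
                  {
                      std::lock_guard<std::mutex> lk(_mutex);
                      if (_excessStones)
                          return;
                  }
                  // WINDOW: _mutex is released but this thread is not yet
                  // blocked on the CV. A pokeReclaimThread() call landing here
                  // sets the flag and notifies into the void; the wait() below
                  // then sleeps until someone pokes again.
                  _oplogReclaimCv.wait(reclaimLk);
              }
          }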

      To fix this, I propose that we always take both mutexes (in the same order) when notifying, eliminating the window. The outer _oplogReclaimMutex should not be contended, so this should be safe to do.
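
      A sketch of the proposed fix under the same simplified model. The waiter holds _oplogReclaimMutex continuously from its predicate check until it is blocked inside wait() (which releases the lock atomically), so a notifier that also takes _oplogReclaimMutex first can no longer fire inside the window:

          // Fixed notifier: lock the outer mutex first, then the inner one,
          // in the same order as the waiter.
          void pokeReclaimThread() {
              std::lock_guard<std::mutex> reclaimLk(_oplogReclaimMutex);
              std::lock_guard<std::mutex> lk(_mutex);
              _excessStones = true;
              _oplogReclaimCv.notify_one();  // the waiter is either blocked in
                                             // wait() or will see _excessStones
                                             // before it waits
          }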

      An alternative solution would be to wait for a set amount of time and then check whether any work needs to be done. But that would mean taking the inner _mutex unnecessarily on every wakeup.
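
      A sketch of the timed-wait alternative, again with the hypothetical names from the first sketch (the ten-second interval is illustrative only). A lost notify now costs at most one interval, but the inner _mutex is taken on every periodic wakeup even when there is no work:

          #include <chrono>  // in addition to the headers in the first sketch

          void awaitExcessStonesPolling() {
              std::unique_lock<std::mutex> reclaimLk(_oplogReclaimMutex);
              while (true) {
                  {
                      std::lock_guard<std::mutex> lk(_mutex);  // taken on every wakeup
                      if (_excessStones)
                          return;
                  }
                  _oplogReclaimCv.wait_for(reclaimLk, std::chrono::seconds(10));
              }
          }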



            People

            Assignee: Gregory Wlodarek (gregory.wlodarek)
            Reporter: Henrik Edin (henrik.edin)
            Participants:
            Votes: 0
            Watchers: 3
