Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-62884

Simplify synchronization semantics of `PrimaryOnlyService`

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major - P3
    • Resolution: Unresolved
    • None
    • None
    • Internal Code
    • None
    • Service Arch

    Description

      The main intention here is to simplify the waiter/notifier pattern around PrimaryOnlyService::_rebuildInstances (defined here). In particular, callers to PrimaryOnlyService::getOrCreateInstance, PrimaryOnlyService::lookupInstance, and PrimaryOnlyService::getAllInstances block on _rebuildCV using the following:

      opCtx->waitForConditionOrInterrupt(_rebuildCV, lk, [this]() { return _state != State::kRebuilding; });
      

      However, PrimaryOnlyService::_rebuildInstances may call notify_all on this condition variable (i.e., _rebuildCV) even if there's a change in term (example):

      ...
      stdx::lock_guard lk(_mutex);
      if (_state != State::kRebuilding || _term != term) {
          _rebuildCV.notify_all();
          return;
      }
      ...
      

      We should simplify/clarify this code and the logic around notifying threads that await completion of PrimaryOnlyService::_rebuildInstances.

      Acceptance criteria: clarify when a thread would block on _rebuildCV, what are the events that would stop this wait, and what's the expected behavior for each observed event. Then, modify the code to align with the findings.

      Attachments

        Issue Links

          Activity

            People

              backlog-server-servicearch Backlog - Service Architecture
              amirsaman.memaripour@mongodb.com Amirsaman Memaripour
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated: