  1. Core Server
  2. SERVER-62884

Simplify synchronization semantics of `PrimaryOnlyService`

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Internal Code
    • Labels:
      None
    • Service Arch

      The main intention here is to simplify the waiter/notifier pattern around PrimaryOnlyService::_rebuildInstances. In particular, callers of PrimaryOnlyService::getOrCreateInstance, PrimaryOnlyService::lookupInstance, and PrimaryOnlyService::getAllInstances block on _rebuildCV as follows:

      opCtx->waitForConditionOrInterrupt(_rebuildCV, lk, [this]() { return _state != State::kRebuilding; });
      

      However, PrimaryOnlyService::_rebuildInstances may call notify_all on this condition variable (i.e., _rebuildCV) even when the term has changed, for example:

      ...
      stdx::lock_guard lk(_mutex);
      if (_state != State::kRebuilding || _term != term) {
          _rebuildCV.notify_all();
          return;
      }
      ...
      

      We should simplify/clarify this code and the logic around notifying threads that await completion of PrimaryOnlyService::_rebuildInstances.
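
To make the interaction concrete, here is a minimal model of the waiter and notifier sides in standard C++. It is a sketch, not the server code: `RebuildGate`, `stepUp`, `rebuildEarlyExit`, `finishRebuild`, and `waiterMayProceed` are hypothetical names, `std::mutex` and `std::condition_variable` stand in for the `stdx` and `opCtx` facilities, and interruption is not modeled. It also assumes that a term change can leave `_state` at `kRebuilding` (for instance, because a newer step-up has already happened).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the rebuild-related state of PrimaryOnlyService.
class RebuildGate {
public:
    enum class State { kPaused, kRebuilding, kRunning };

    // Assumed step-up behavior: a new term begins and a rebuild is pending.
    void stepUp(long long term) {
        std::lock_guard<std::mutex> lk(_mutex);
        _term = term;
        _state = State::kRebuilding;
    }

    // Same check as the quoted early-return branch; returns true if it was taken.
    bool rebuildEarlyExit(long long term) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_state != State::kRebuilding || _term != term) {
            _rebuildCV.notify_all();
            return true;
        }
        return false;
    }

    // Assumed success path: change the state for the current term, then notify.
    void finishRebuild(long long term) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_state == State::kRebuilding && _term == term) {
            _state = State::kRunning;
            _rebuildCV.notify_all();
        }
    }

    // The waiters' predicate, evaluated under the mutex.
    bool waiterMayProceed() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _state != State::kRebuilding;
    }

    // Waiter side, without the interruption support that opCtx provides.
    State waitForRebuild() {
        std::unique_lock<std::mutex> lk(_mutex);
        _rebuildCV.wait(lk, [this] { return _state != State::kRebuilding; });
        return _state;
    }

private:
    std::mutex _mutex;
    std::condition_variable _rebuildCV;
    State _state = State::kPaused;
    long long _term = 0;
};

// Scenario: the rebuild for term 1 is overtaken by a step-up into term 2.
inline bool staleNotifyReleasesWaiters() {
    RebuildGate gate;
    gate.stepUp(1);
    gate.stepUp(2);
    bool exitedEarly = gate.rebuildEarlyExit(1);  // stale term: notifies and returns
    assert(exitedEarly);
    return gate.waiterMayProceed();  // reports whether the waiters' predicate now holds
}
```

Under these assumptions, the notification sent on the stale-term path wakes waiters only for them to re-check the predicate and block again, which is the kind of ambiguity this ticket asks to clarify.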

      Acceptance criteria: clarify when a thread blocks on _rebuildCV, which events end this wait, and what the expected behavior is for each such event. Then modify the code to align with the findings.
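
As one possible way to write down the outcome of that analysis, the wait could report which event ended it. The sketch below is purely illustrative and is not a proposed patch: `TermAwareGate`, `WaitOutcome`, `setState`, and the `termAtEntry` parameter are hypothetical, the set of states is an assumption, and interruption (which `waitForConditionOrInterrupt` handles in the real code) is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical states and wait outcomes; the real service may define others.
enum class State { kPaused, kRebuilding, kRunning };
enum class WaitOutcome { kRebuilt, kTermChanged, kNotPrimary };

class TermAwareGate {
public:
    // Every state or term transition notifies, so waiters re-evaluate their predicate.
    void setState(State state, long long term) {
        std::lock_guard<std::mutex> lk(_mutex);
        _state = state;
        _term = term;
        _rebuildCV.notify_all();
    }

    // Blocks while the rebuild for termAtEntry is in progress, then names the event
    // that ended the wait.
    WaitOutcome waitForRebuild(long long termAtEntry) {
        std::unique_lock<std::mutex> lk(_mutex);
        _rebuildCV.wait(lk, [&] {
            return _state != State::kRebuilding || _term != termAtEntry;
        });
        if (_term != termAtEntry) {
            return WaitOutcome::kTermChanged;
        }
        return _state == State::kRunning ? WaitOutcome::kRebuilt : WaitOutcome::kNotPrimary;
    }

private:
    std::mutex _mutex;
    std::condition_variable _rebuildCV;
    State _state = State::kPaused;
    long long _term = 0;
};

// Walks through the three outcomes without ever blocking.
inline void demonstrateOutcomes() {
    TermAwareGate gate;
    gate.setState(State::kRunning, 1);
    assert(gate.waitForRebuild(1) == WaitOutcome::kRebuilt);
    gate.setState(State::kRebuilding, 2);
    assert(gate.waitForRebuild(1) == WaitOutcome::kTermChanged);
    gate.setState(State::kPaused, 2);
    assert(gate.waitForRebuild(2) == WaitOutcome::kNotPrimary);
}
```

Including the term in the predicate is a design choice made only for this illustration: it lets a stale-term notification release the waiter with an explicit reason instead of leaving it blocked.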

            Assignee:
            backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter:
            amirsaman.memaripour@mongodb.com Amirsaman Memaripour
            Votes:
            0
            Watchers:
            3
