Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-60444

ThreadPool::waitForIdle's header comment should note that it can't be called concurrently with shutdown

    • Type: Icon: Bug Bug
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Service Arch
    • ALL
    • Service Arch 2021-10-18
    • 2

      In SERVER-53477, we made the behavior of ThreadPool::waitForIdle match its documentation by making it safe to call before shutdown. To do so, we have waitForIdle return when the ThreadPool is shutting down, but with the caveat that there is no guarantee that work does or does not remain pending on the ThreadPool when waitForIdle returns due to shutdown. 

       -- 

      When implementing that ticket, I expected that the idle-indicating condition variable would necessarily be notified by the ordinary mechanism  here during shutdown as all worker threads drain tasks from the pending work queue before stopping work here. However, because worker threads do not count themselves as idle when stopping due to shutdown, and therefore decrement ThreadPool::numIdleThreads before stopping, that condition may never obtain. The thread that first sees pendingTasks as empty after completing its task and re-aquiring the lock in ThreadPool::doOneTask can check this condition _after other worker threads have stopped and therefore decremented ThreadPool::_numIdleThreads, and therefore will not see _threads.size() == _numIdleThreads. More importantly, 

       -- 

      We need to notify poolIsIdle separately when the ThreadPool enters state joinRequired to ensure that people who called waitForIdle _before shutdown are woken up on shutdown, regardless of the current status of the pending work queue or the threads that may be helping to drain it. 

       

      update: after discussing with lingzhi.deng and schwerin  we realized providing waitForIdle with a safe-shutdown contract would have required more work and changes to the ThreadPool internals that were somewhat high-risk and unnecessary given the very few callers of waitForIdle. We've decided instead that the new caller of waitForIdle in the TenantOplogApplier should obey the same shutdown-pattern as the existing caller OplogApplier (see SERVER-60782). We'll now use this ticket to change the header comment for waitForIdle to indicate the safe usage of it requires that it is not called concurrently with shut-down or on an already shut-down pool 

            Assignee:
            backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter:
            george.wangensteen@mongodb.com George Wangensteen
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated: