-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Server Programmability
-
ALL
-
Service Arch 2021-10-18
-
2
In SERVER-53477, we made the behavior of ThreadPool::waitForIdle match its documentation by making it safe to call before shutdown. To do so, we have waitForIdle return when the ThreadPool is shutting down, but with the caveat that there is no guarantee that work does or does not remain pending on the ThreadPool when waitForIdle returns due to shutdown.
--
When implementing that ticket, I expected that the idle-indicating condition variable would necessarily be notified by the ordinary mechanism here during shutdown as all worker threads drain tasks from the pending work queue before stopping work here. However, because worker threads do not count themselves as idle when stopping due to shutdown, and therefore decrement ThreadPool::numIdleThreads before stopping, that condition may never obtain. The thread that first sees pendingTasks as empty after completing its task and re-aquiring the lock in ThreadPool::doOneTask can check this condition _after other worker threads have stopped and therefore decremented ThreadPool::_numIdleThreads, and therefore will not see _threads.size() == _numIdleThreads. More importantly,
--
We need to notify poolIsIdle separately when the ThreadPool enters state joinRequired to ensure that people who called waitForIdle _before shutdown are woken up on shutdown, regardless of the current status of the pending work queue or the threads that may be helping to drain it.
update: after discussing with lingzhi.deng and schwerin we realized providing waitForIdle with a safe-shutdown contract would have required more work and changes to the ThreadPool internals that were somewhat high-risk and unnecessary given the very few callers of waitForIdle. We've decided instead that the new caller of waitForIdle in the TenantOplogApplier should obey the same shutdown-pattern as the existing caller OplogApplier (see SERVER-60782). We'll now use this ticket to change the header comment for waitForIdle to indicate the safe usage of it requires that it is not called concurrently with shut-down or on an already shut-down pool
- is related to
-
SERVER-60782 ThreadPool::waitForIdle should not be called concurrently with shutdown in TenantOplogApplier shutdown
- Closed