Core Server / SERVER-91869

Lock order inversion between ConnectionPool::_mutex and PinnedConnectionTaskExecutor::_mutex

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Assigned Teams: Server Programmability
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Programmability 2024-08-05, Programmability 2024-08-19, Programmability 2024-09-02, Programmability 2024-09-16, Programmability 2024-09-30, Programmability 2024-10-14

      PinnedConnectionTaskExecutor::_doNetworking holds the PinnedConnectionTaskExecutor::_mutex while calling _ensureStream, which calls into ConnectionPool::lease and acquires the ConnectionPool::_mutex.

      ConnectionPool::SpecificPool::finishRefresh is called while holding the ConnectionPool::_mutex, and in turn calls ConnectionPool::SpecificPool::processFailure. This function readies promises for pending requests. Readying such a promise can result in a thenRunOn/schedule call on any ExecutorFuture chained to the promise. Because PinnedConnectionTaskExecutor uses a NetworkInterfaceThreadPool for the search subsystem, we can try to run tasks inline with the schedule call when we're on the networking reactor, as we are when running finishRefresh for the connection pool. If one of the scheduled tasks is PinnedConnectionTaskExecutor::_doNetworking, we can end up acquiring the PinnedConnectionTaskExecutor::_mutex while holding the ConnectionPool::_mutex.

      This is a potential deadlock and should be fixed. However, we have never observed the deadlock in the wild; we have only seen the lock order inversion reported by TSAN.
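
      The inversion described above can be illustrated with a minimal, single-threaded sketch. Nothing here is MongoDB code: LockOrderRecorder, TrackedMutex, doNetworkingLike, finishRefreshInline, finishRefreshDeferred and inversionDetected are hypothetical names chosen for illustration, and the recorder is only a toy stand-in for the kind of lock-order tracking a tool like TSAN performs. The "deferred" variant shows one common way to avoid this class of inversion (running continuations only after the pool mutex is released); it is an assumption for illustration and not necessarily the fix adopted for this ticket.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <mutex>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded recorder: remembers every "held -> acquired"
// pair and flags an acquisition whose reverse pair was seen earlier.
struct LockOrderRecorder {
    std::set<std::pair<std::string, std::string>> edges;
    std::vector<std::string> held;
    bool inversionSeen = false;

    void acquired(const std::string& name) {
        for (const auto& h : held) {
            if (edges.count({name, h})) inversionSeen = true;  // reverse order seen before
            edges.insert({h, name});
        }
        held.push_back(name);
    }
    void released(const std::string& name) {
        auto it = std::find(held.rbegin(), held.rend(), name);
        if (it != held.rend()) held.erase(std::next(it).base());
    }
};

// std::mutex wrapper that reports to the recorder; works with std::lock_guard.
class TrackedMutex {
public:
    TrackedMutex(std::string name, LockOrderRecorder& rec)
        : _name(std::move(name)), _rec(rec) {}
    void lock() { _mutex.lock(); _rec.acquired(_name); }
    void unlock() { _rec.released(_name); _mutex.unlock(); }
private:
    std::mutex _mutex;
    std::string _name;
    LockOrderRecorder& _rec;
};

// Path 1: executor mutex first, then pool mutex
// (stands in for _doNetworking -> _ensureStream -> ConnectionPool::lease).
void doNetworkingLike(TrackedMutex& executorMutex, TrackedMutex& poolMutex) {
    std::lock_guard<TrackedMutex> execLk(executorMutex);
    std::lock_guard<TrackedMutex> poolLk(poolMutex);
}

// Path 2 (problematic): the continuation runs inline while the pool mutex is held
// (stands in for finishRefresh -> processFailure readying a promise).
void finishRefreshInline(TrackedMutex& poolMutex, const std::function<void()>& continuation) {
    std::lock_guard<TrackedMutex> poolLk(poolMutex);
    continuation();
}

// Path 2 (one possible mitigation): capture the work under the lock, run it after unlocking.
void finishRefreshDeferred(TrackedMutex& poolMutex, const std::function<void()>& continuation) {
    std::function<void()> toRun;
    {
        std::lock_guard<TrackedMutex> poolLk(poolMutex);
        toRun = continuation;
    }
    toRun();
}

// Runs both paths sequentially on one thread (so nothing can actually deadlock)
// and reports whether the recorder saw an inverted acquisition order.
bool inversionDetected(bool deferContinuation) {
    LockOrderRecorder rec;
    TrackedMutex executorMutex("executor", rec);
    TrackedMutex poolMutex("pool", rec);

    doNetworkingLike(executorMutex, poolMutex);  // establishes executor -> pool

    // The continuation only takes the executor mutex, as the first step of
    // a _doNetworking-like task would.
    auto continuation = [&] { std::lock_guard<TrackedMutex> lk(executorMutex); };
    if (deferContinuation) {
        finishRefreshDeferred(poolMutex, continuation);
    } else {
        finishRefreshInline(poolMutex, continuation);  // pool -> executor
    }
    return rec.inversionSeen;
}
```

      In this sketch, inversionDetected(false) exercises the inline path and the recorder flags the pool -> executor acquisition as the reverse of the earlier executor -> pool order; inversionDetected(true) runs the continuation after the pool mutex is released, so no inverted pair is recorded. With two real threads, the inline path is the interleaving in which each thread could wait on the mutex the other holds.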

            Assignee: Ryan Berryhill (ryan.berryhill@mongodb.com)
            Reporter: George Wangensteen (george.wangensteen@mongodb.com)
            Votes: 0
            Watchers: 3
