Core Server / SERVER-59226

Deadlock when stepping down with a profile session marked as uninterruptible

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 4.4.11, 4.2.18, 5.0.4, 5.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.0, v4.4, v4.2, v4.0
    • Sprint: Repl 2021-08-23, Repl 2021-09-06

      During step down, a primary calls _stepDownFinish, which kills all sessions through invalidateSessionsForStepDown -> killSessionsAction -> checkOutSessionForKill. By this point the step-down thread has already acquired the RSTL (ReplicationStateTransitionLock) via AutoGetRstlForStepUpStepDown.

      If the step-down thread tries to kill a session that is already checked out, it must wait until that session is checked back in. This wait creates the potential for a deadlock.

      Normally, the operation that checked out the session is interrupted when it tries to acquire a lock such as the GlobalLock. However, if the session's opCtx is marked as uninterruptible, the operation can remain blocked on the GlobalLock while the step-down thread, which holds the RSTL, waits for the session to be checked back in. Neither thread can make progress, which results in a deadlock.

      This can happen when profiling. More generally, it may be possible with other uses of the UninterruptibleLockGuard.
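
The wait cycle described above can be illustrated with a small toy model. This is only a sketch and not MongoDB code: `ToyOperation`, `wait_for_lock`, and `step_down` are hypothetical names, a plain `threading.Lock` stands in for the RSTL, and the model assumes that the blocked lock acquisition cannot succeed while the step-down thread holds that lock. A timeout is used in place of a real hang so that the example terminates.

```python
import threading


class Interrupted(Exception):
    """Raised when a killed, interruptible operation gives up its lock wait."""


class ToyOperation:
    """Hypothetical stand-in for an operation that has checked out a session."""

    def __init__(self, interruptible):
        self.interruptible = interruptible
        self.killed = threading.Event()      # set by the step-down thread
        self.checked_in = threading.Event()  # set when the session is released

    def wait_for_lock(self, rstl):
        # Poll for the lock. Only an interruptible operation notices the kill.
        while True:
            if rstl.acquire(timeout=0.01):
                rstl.release()
                return
            if self.interruptible and self.killed.is_set():
                raise Interrupted()

    def run(self, rstl, started):
        started.set()
        try:
            self.wait_for_lock(rstl)
        except Interrupted:
            pass
        finally:
            # The session is checked back in only once the lock wait ends.
            self.checked_in.set()


def step_down(interruptible, timeout=0.5):
    """Return True if the session was checked back in before the timeout."""
    rstl = threading.Lock()
    rstl.acquire()  # the step-down thread holds the lock throughout
    op = ToyOperation(interruptible)
    started = threading.Event()
    worker = threading.Thread(target=op.run, args=(rstl, started))
    worker.start()
    started.wait()
    op.killed.set()  # "kill" the session's operation
    finished = op.checked_in.wait(timeout)  # wait for check-in
    rstl.release()  # break the cycle so the toy model can exit
    worker.join()
    return finished


if __name__ == "__main__":
    print("interruptible op checked in:", step_down(True))
    print("uninterruptible op checked in:", step_down(False))
```

In this model the interruptible operation observes the kill and checks its session back in, so the step-down wait completes. The uninterruptible one keeps waiting for the lock, and the step-down wait only ends because of the artificial timeout; without it, both threads would wait on each other indefinitely.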

            Wenbin Zhu (wenbin.zhu@mongodb.com)
            Vishnu Kaushik (vishnu.kaushik@mongodb.com)