Core Server / SERVER-59226

Deadlock when stepping down with a profile session marked as uninterruptible

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0.4, 4.2.18, 5.1.0-rc0, 4.4.11
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.0, v4.4, v4.2, v4.0
    • Sprint: Repl 2021-08-23, Repl 2021-09-06

      Description

      When a primary steps down, _stepDownFinish kills all sessions through invalidateSessionsForStepDown -> killSessionsAction -> checkOutSessionForKill. This happens after the step-down thread has acquired the RSTL via AutoGetRstlForStepUpStepDown.

      If the step-down thread tries to kill a session that is already checked out, it must wait until that session is checked back in. This wait creates the potential for a deadlock.

      Normally, the operation that has the session checked out is interrupted when it tries to acquire a lock such as the GlobalLock, and the session is then checked back in. However, if the session's opCtx is marked as uninterruptible, the operation can remain blocked on the GlobalLock (which it cannot acquire while the step-down thread holds the RSTL), while the step-down thread waits on the checked-out session. Neither side can make progress, so the result is a deadlock.

      This can happen when profiling, where the session's opCtx is marked as uninterruptible. More generally, it may be possible with other uses of UninterruptibleLockGuard.
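
      The wait-for cycle described above can be illustrated with a small model. This is only a sketch in Python, not server code: the Op class, the wait_for_edges and has_deadlock helpers, and the node names "stepdown" and "profiled_op" are all hypothetical, and the model assumes the step-down thread already holds the RSTL.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Op:
    """Hypothetical stand-in for an operation that has a session checked out."""
    name: str
    interruptible: bool = True


def wait_for_edges(op: Op) -> Dict[str, str]:
    """Return a wait-for graph (key waits on value).

    Assumes the step-down thread already holds the RSTL and is trying to
    check out the session for kill. All names are illustrative only.
    """
    # The step-down thread waits for the session to be checked back in.
    edges = {"stepdown": op.name}
    # The op is blocked on the GlobalLock, which it cannot get while the
    # step-down thread holds the RSTL. An interruptible op is interrupted out
    # of that wait and checks its session back in, so it has no outgoing edge.
    if not op.interruptible:
        edges[op.name] = "stepdown"
    return edges


def has_deadlock(edges: Dict[str, str], start: str = "stepdown") -> bool:
    """Follow wait-for edges from start; revisiting a node means a cycle."""
    seen = set()
    node = start
    while node in edges:
        if node in seen:
            return True
        seen.add(node)
        node = edges[node]
    return False


print(has_deadlock(wait_for_edges(Op("profiled_op", interruptible=True))))   # False
print(has_deadlock(wait_for_edges(Op("profiled_op", interruptible=False))))  # True
```

      In this model the only difference between the two cases is whether the blocked operation can be interrupted, which mirrors why an uninterruptible opCtx turns an ordinary wait into a deadlock.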

              People

              Assignee:
              wenbin.zhu Wenbin Zhu
              Reporter:
              vishnu.kaushik Vishnu Kaushik
              Votes:
              0
              Watchers:
              31
