  1. Core Server
  2. SERVER-46664

runCmdOnPrimaryAndAwaitResponse() should not run a DBDirectClient command with the RSTL lock held.

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 4.4.0-rc0, 4.7.0
    • None
    • Storage
    • None
    • Storage Execution
    • Fully Compatible
    • ALL
    • v4.4
    • Execution Team 2020-03-09
    • 50

    Description

      Currently, runCmdOnPrimaryAndAwaitResponse() takes the RSTL lock in IX mode and runs a DBDirectClient command using AlternativeClientRegion. If the DBDirectClient command takes the RSTL lock on the AlternativeClientRegion's opCtx, this can lead to a deadlock involving the stepdown thread and the thread that calls runCmdOnPrimaryAndAwaitResponse().
      The deadlock arises because the stepdown thread can't acquire the RSTL lock: the caller's original opCtx (including its locks) has been stashed and replaced with a new opCtx by runCmdOnPrimaryAndAwaitResponse() using the AlternativeClientRegion class, so stepdown has no way to interrupt the original opCtx and make it release its locks. As a result, the stepdown thread is blocked behind the RSTL lock held by runCmdOnPrimaryAndAwaitResponse() because of a lock conflict. The DBDirectClient command is blocked behind stepdown, because stepdown has enqueued its RSTL lock request in X mode. And runCmdOnPrimaryAndAwaitResponse() is waiting for the DBDirectClient command's response.
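
The three-way wait described above can be illustrated with a small toy model. This is only a sketch and not MongoDB code: ToyRstl, Request, replayScenario() and the owner names "caller", "stepdown" and "command" are hypothetical, and the FIFO granting rule (a request waits behind any earlier conflicting request, granted or queued) is an assumed simplification of the real lock manager.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; none of these types are MongoDB's.
enum class Mode { IX, X };

// IX is compatible with IX; X conflicts with every mode.
inline bool compatible(Mode a, Mode b) {
    return a == Mode::IX && b == Mode::IX;
}

struct Request {
    std::string owner;
    Mode mode;
    bool granted;
};

// A FIFO lock (assumed policy): a request is granted only if it is compatible
// with every earlier request, whether that request is granted or still queued.
class ToyRstl {
public:
    bool acquire(const std::string& owner, Mode mode) {
        bool granted = true;
        for (const Request& r : _requests) {
            if (!compatible(r.mode, mode)) {
                granted = false;
                _waitsFor[owner].insert(r.owner);  // owner waits for r.owner
            }
        }
        _requests.push_back({owner, mode, granted});
        return granted;
    }

    // Record a non-lock dependency, e.g. waiting for a command's response.
    void addWait(const std::string& waiter, const std::string& target) {
        _waitsFor[waiter].insert(target);
    }

    // True if a cycle in the wait-for graph is reachable from `start`.
    bool deadlocked(const std::string& start) const {
        std::set<std::string> onPath;
        return visit(start, onPath);
    }

private:
    bool visit(const std::string& node, std::set<std::string>& onPath) const {
        if (!onPath.insert(node).second) return true;  // node already on path: cycle
        auto it = _waitsFor.find(node);
        if (it != _waitsFor.end()) {
            for (const std::string& next : it->second) {
                if (visit(next, onPath)) return true;
            }
        }
        onPath.erase(node);
        return false;
    }

    std::vector<Request> _requests;
    std::map<std::string, std::set<std::string>> _waitsFor;
};

// Replays the sequence from the description and reports whether it deadlocks.
bool replayScenario() {
    ToyRstl rstl;
    bool callerGranted = rstl.acquire("caller", Mode::IX);     // stashed opCtx holds IX
    bool stepdownGranted = rstl.acquire("stepdown", Mode::X);  // conflicts with caller's IX
    bool commandGranted = rstl.acquire("command", Mode::IX);   // queued behind stepdown's X
    rstl.addWait("caller", "command");                         // caller awaits the response
    assert(callerGranted && !stepdownGranted && !commandGranted);
    return rstl.deadlocked("caller");
}
```

In this model replayScenario() reports a cycle: caller waits for command, command waits for stepdown, and stepdown waits for caller. If the caller does not hold the lock while the command runs, the command's IX request is granted first and the cycle cannot form, which is the behavior the ticket title asks for.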

          People

            backlog-server-execution Backlog - Storage Execution Team
            suganthi.mani@mongodb.com Suganthi Mani
