  1. Core Server
  2. SERVER-48641

Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.4.0-rc8
    • Fix Version/s: 4.5.1
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4
    • Sprint:
      Sharding 2020-07-13, Sharding 2020-07-27
    • Linked BF Score:
      40

      Description

      The MigrationDestinationManager checks out a session and then executes the recipient logic while that session remains checked out.

      At some point the execution logic reaches a call to waitForWriteConcern, which runs with the session still checked out.

      Because the JournalFlusher wait is non-interruptible (and also because SERVER-40081 prohibits calling waitForWriteConcern while a session is checked out), this causes a three-thread deadlock with the replication coordinator:

      • T1: MigrationDestinationManager has a session checked-out and is waiting on waitForWriteConcern, which in turn is blocked on JournalFlusher::waitForJournalFlush
      • T2: The JournalFlusher is waiting on a MODE_IX RSM lock, which is held in MODE_X by ReplCoord-3
      • T3: ReplCoord-3, while holding the RSM lock in MODE_X, is killing sessions by calling invalidateSessionsForStepdown and this is blocked on the session checked-out by T1
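
      The cycle described by T1, T2 and T3 can be checked with a small wait-for graph. The sketch below is only a model of the three bullets above, not server code: WaitForGraph, findCycle, ticketScenario and the resource labels ("journal flush", "RSM lock", "session") are hypothetical names chosen for illustration, and the journal flush is treated as a resource "held" by the JournalFlusher thread.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each thread is blocked on at most one resource,
// and each resource is held (or produced) by at most one thread.
struct WaitForGraph {
    std::map<std::string, std::string> waitsOn;  // thread -> resource it is blocked on
    std::map<std::string, std::string> heldBy;   // resource -> thread that must release it

    // Follows thread -> resource -> holder edges starting at `start`.
    // Returns the threads forming a cycle, or an empty vector if the
    // chain ends at a runnable thread or a free resource.
    std::vector<std::string> findCycle(const std::string& start) const {
        assert(!start.empty() && "thread name must be non-empty");
        std::vector<std::string> path;
        std::set<std::string> seen;
        std::string current = start;
        while (seen.insert(current).second) {
            path.push_back(current);
            auto wait = waitsOn.find(current);
            if (wait == waitsOn.end()) return {};    // thread is not blocked
            auto holder = heldBy.find(wait->second);
            if (holder == heldBy.end()) return {};   // resource is free
            current = holder->second;
        }
        // `current` was visited before, so the cycle starts at its first occurrence.
        std::vector<std::string> cycle;
        bool inCycle = false;
        for (const auto& thread : path) {
            if (thread == current) inCycle = true;
            if (inCycle) cycle.push_back(thread);
        }
        return cycle;
    }
};

// Encodes the three bullets from the description (labels are illustrative).
WaitForGraph ticketScenario() {
    WaitForGraph g;
    g.waitsOn = {{"T1", "journal flush"}, {"T2", "RSM lock"}, {"T3", "session"}};
    g.heldBy = {{"journal flush", "T2"}, {"RSM lock", "T3"}, {"session", "T1"}};
    return g;
}
```

      In this model, ticketScenario().findCycle("T1") walks T1, T2, T3 and arrives back at T1. Erasing the "session" entry from heldBy, which stands in for T1 no longer holding the session while it waits, leaves T3 unblocked and removes the cycle. That is one way to break the cycle within the model; it is not a statement of how the fix was implemented.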

              People

              Assignee:
              jack.mulrow Jack Mulrow
              Reporter:
              kaloian.manassiev Kaloian Manassiev