Core Server / SERVER-48641

Deadlock due to the MigrationDestinationManager waiting for write concern with the session checked-out

Details

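    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 4.4.0-rc8
    • Fix Version/s: 4.4.1, 4.7.0
    • Component/s: Sharding
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.4
    • Sprint: Sharding 2020-07-13, Sharding 2020-07-27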
    • 40

    Description

      The MigrationDestinationManager checks out a session and then executes the recipient logic while that session remains checked out.

      At some point this logic calls waitForWriteConcern, which therefore runs with the session still checked out.

      Because the JournalFlusher wait is non-interruptible (and also because SERVER-40081 prohibits calling waitForWriteConcern while a session is checked out), this causes a three-thread deadlock with the replication coordinator:

      • T1: MigrationDestinationManager has a session checked out and is waiting in waitForWriteConcern, which in turn is blocked in JournalFlusher::waitForJournalFlush
      • T2: The JournalFlusher is waiting for the RSM lock in MODE_IX, which ReplCoord-3 holds in MODE_X
      • T3: ReplCoord-3, while holding the RSM lock in MODE_X, is killing sessions by calling invalidateSessionsForStepdown, which is blocked on the session checked out by T1
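
      The three-way wait described above can be modelled as a wait-for graph, in which each thread points at the thread that holds what it is blocked on. The sketch below is illustrative only and is not server code: the names WAIT_FOR and find_cycle are hypothetical, and the graph is a simplification of the T1/T2/T3 list. It also shows, as an assumption suggested by the ticket title rather than a description of the actual fix, that the cycle disappears if T1 no longer holds the session while it waits.

```python
from typing import Dict, List, Optional

# Hypothetical wait-for graph for the three threads in the description.
# Each key is blocked on something owned by its value.
WAIT_FOR: Dict[str, str] = {
    "T1": "T2",  # MigrationDestinationManager waits for the journal flush
    "T2": "T3",  # JournalFlusher waits for the RSM lock held in MODE_X
    "T3": "T1",  # ReplCoord-3 waits for the session checked out by T1
}


def find_cycle(wait_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from `start`; return the cycle, or None."""
    path: List[str] = []
    node: Optional[str] = start
    while node is not None and node not in path:
        path.append(node)
        node = wait_for.get(node)  # None means this thread is not blocked
    if node is None:
        return None
    # `node` was already visited, so the path from it back to itself is a cycle.
    return path[path.index(node):] + [node]


# With the session checked out during the wait, the threads form a cycle.
deadlock = find_cycle(WAIT_FOR, "T1")

# Assumption: if T1 checked the session back in before waiting, T3 would no
# longer be blocked on T1, so that edge is dropped from the graph.
session_released = {t: w for t, w in WAIT_FOR.items() if t != "T3"}
no_deadlock = find_cycle(session_released, "T1")

print(deadlock)     # ['T1', 'T2', 'T3', 'T1']
print(no_deadlock)  # None
```

      Removing any one edge would break the cycle in this model. The session edge is used here only because it is the one named in the ticket title.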

            People

              jack.mulrow@mongodb.com Jack Mulrow
              kaloian.manassiev@mongodb.com Kaloian Manassiev