Core Server / SERVER-63169

Fix deadlock in balancer_defragmentation_merge_chunks

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 5.3.0
    • None
    • None
    • None
    • Fully Compatible
    • ALL
    • 42

    Description

      To avoid problems with step-down during defragmentation, user cancellation of defragmentation no longer removes the defragmentCollection flag from the collection. Instead, it checks whether a defragmentation state exists for the collection and, if so, changes the phase to kSplitChunks. This check requires acquiring a lock. If the test has already completed phase 1 when defragmentation is cancelled, the balancer thread is waiting at the failpoint while holding the mutex, so the cancellation cannot acquire it and the test deadlocks.

      This can be fixed by changing the failpoint so that it does not pause but instead makes phase transitions a no-op. The balancer thread then no longer holds the mutex while waiting, which removes the deadlock, and each balancer round retries the phase transition until the failpoint is cleared.
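
      The difference between a pausing failpoint and a no-op failpoint can be illustrated with a small toy model. This is only a sketch in Python, not the server code: the class ToyDefragmentationState, its methods, the placeholder phase names "phase1", "phase2" and "finished", and the lock timeout that stands in for a real deadlock are all assumptions made for illustration. Only kSplitChunks is taken from the description above.

```python
import threading


class ToyDefragmentationState:
    """Hypothetical stand-in for the balancer's defragmentation state."""

    def __init__(self, failpoint_pauses: bool):
        self.mutex = threading.Lock()
        self.phase = "phase1"                   # placeholder phase name
        self.failpoint_off = threading.Event()  # set once the test clears the failpoint
        self.at_failpoint = threading.Event()   # set when the balancer hits the failpoint
        self.failpoint_pauses = failpoint_pauses

    def balancer_round(self) -> bool:
        """Try one phase transition; return True if the phase advanced."""
        with self.mutex:
            if not self.failpoint_off.is_set():
                self.at_failpoint.set()
                if not self.failpoint_pauses:
                    # No-op failpoint: skip the transition, release the mutex,
                    # and let the next balancer round try again.
                    return False
                # Pausing failpoint: block here while still holding the mutex.
                self.failpoint_off.wait()
            self.phase = "phase2" if self.phase == "phase1" else "finished"
            return True

    def cancel(self, timeout: float) -> bool:
        """User cancellation; a lock timeout stands in for the deadlock."""
        if not self.mutex.acquire(timeout=timeout):
            return False
        try:
            self.phase = "kSplitChunks"
            return True
        finally:
            self.mutex.release()


def run_scenario(failpoint_pauses: bool) -> bool:
    """Return True if cancellation could take the mutex while the failpoint is on."""
    state = ToyDefragmentationState(failpoint_pauses)
    balancer = threading.Thread(target=state.balancer_round, daemon=True)
    balancer.start()
    state.at_failpoint.wait(timeout=1.0)   # wait until the balancer reaches the failpoint
    cancelled = state.cancel(timeout=0.2)
    state.failpoint_off.set()              # the test clears the failpoint afterwards
    balancer.join(timeout=1.0)
    return cancelled
```

      In this model, run_scenario(True) returns False because the cancellation times out waiting for the mutex, while run_scenario(False) returns True because the balancer round returned without blocking and released the mutex.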

          People

            allison.easton@mongodb.com Allison Easton