  1. Core Server
  2. SERVER-70888

ScopedRangeDeleterLock might lead to a deadlock on stepdown

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 6.2.0-rc0
    • 6.2.0-rc0
    • Sharding
    • None
    • Fully Compatible
    • ALL
    • Sharding EMEA 2022-10-31
    • 153

    Description

      SERVER-70094 added code to synchronize range deletion with stepdowns. Specifically, it stores the executor of the range deletion thread so that it can be joined when the service is stopped.

      This has an unintended consequence, though. If a stepdown command manages to grab the RSTL lock before the RangeDeleterService thread does, the stepdown gets stuck when trying to stop the service, because it is waiting for the range deleter service executor, while at the same time the range deleter service thread is waiting for the RSTL lock.

      So we have a thread that holds the RSTL lock while waiting for an executor, and that executor will finish only after its own thread grabs the RSTL lock.

      To solve this, we could capture the operation context in addition to the executor, and cancel it before waiting for the executor.
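
      The idea of cancelling before joining can be illustrated with a minimal, self-contained sketch. This is not the server code: OpContext, lockInterruptibly, RangeDeleterModel and simulateStepdown are hypothetical names, a std::timed_mutex stands in for the RSTL, and a polled atomic flag stands in for operation context interruption. The only point is the ordering in stop(): cancel first, then join.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an operation context that can be cancelled.
struct OpContext {
    std::atomic<bool> killed{false};
};

// Tries to take the lock, giving up as soon as the context is cancelled.
// Returns true if the lock was acquired, false if the wait was interrupted.
bool lockInterruptibly(std::timed_mutex& rstl, const OpContext& opCtx) {
    while (!opCtx.killed.load()) {
        if (rstl.try_lock_for(std::chrono::milliseconds(5))) {
            return true;
        }
    }
    return false;
}

// Hypothetical model of a service whose worker needs the RSTL stand-in.
class RangeDeleterModel {
public:
    explicit RangeDeleterModel(std::timed_mutex& rstl) : _rstl(rstl) {}

    void start() {
        assert(!_worker.joinable());  // guard against starting twice
        _worker = std::thread([this] {
            if (lockInterruptibly(_rstl, _opCtx)) {
                _acquired.store(true);
                _rstl.unlock();
            }
        });
    }

    // Cancel the captured context BEFORE joining. Joining first would
    // block forever if the caller already holds the lock.
    void stop() {
        _opCtx.killed.store(true);
        if (_worker.joinable()) {
            _worker.join();
        }
    }

    bool acquiredLock() const { return _acquired.load(); }

private:
    std::timed_mutex& _rstl;
    OpContext _opCtx;
    std::thread _worker;
    std::atomic<bool> _acquired{false};
};

// Models a stepdown that wins the race for the lock and then stops the
// service. Returns true if stop() returned with the worker interrupted.
bool simulateStepdown() {
    std::timed_mutex rstl;
    RangeDeleterModel service(rstl);
    std::lock_guard<std::timed_mutex> stepdownLock(rstl);  // stepdown holds the lock
    service.start();
    service.stop();
    return !service.acquiredLock();
}
```

      In this model, removing the killed.store(true) line from stop() reproduces the hang described above: the caller holds the lock and joins a worker that can never acquire it.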

          People

            tommaso.tocci@mongodb.com Tommaso Tocci
            marcos.grillo@mongodb.com Marcos José Grillo Ramirez
