Core Server / SERVER-62521

Distributed locks might not be released on definite error when using a DDL coordinator

Details

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.3, v5.2, v5.0
    • Sprint: Sharding EMEA 2022-02-07, Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07, Sharding EMEA 2022-03-21

    Description

      Some DDL coordinator implementations (such as movePrimary) are designed not to always make forward progress on retriable errors. These classes set the _completeOnError flag, which prevents the operation from being retried when a retriable error occurs.

      The purpose of this task is to ensure that the distributed locks are released when a retriable error (such as a stepdown on the config server) occurs in a DDLCoordinator implementation that has the _completeOnError flag set to true. The following scenario is an example:

      • A movePrimary command starts
      • There is a stepdown on the config server when committing

      This leaves the primary node of the primary shard still holding the distributed lock for the database. Only operations that try to acquire that database's distributed lock on the config server after this scenario has occurred are affected.
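
      The intended behavior can be sketched as follows. This is a minimal illustration in Python, not the server's C++ code: DistLockManager, RetriableError, run_coordinator, and the max_attempts bound are hypothetical names and simplifications introduced here, and only the complete_on_error parameter mirrors the _completeOnError flag from the description. The point it shows is that when the flag is set, a retriable error is surfaced instead of retried, and the lock is still released on the way out.

```python
class RetriableError(Exception):
    """Stand-in for a retriable error such as a config server stepdown."""


class DistLockManager:
    """Hypothetical in-memory lock table (not the server's lock manager)."""

    def __init__(self):
        self.held = set()

    def acquire(self, name):
        if name in self.held:
            raise RuntimeError(f"lock busy: {name}")
        self.held.add(name)

    def release(self, name):
        self.held.discard(name)


def run_coordinator(locks, db_name, commit, complete_on_error, max_attempts=3):
    """Run `commit` while holding the database lock.

    Assumption: retries are bounded by max_attempts to keep the sketch small.
    With complete_on_error=True, a retriable error is re-raised immediately
    rather than retried. The `finally` clause releases the lock either way.
    """
    locks.acquire(db_name)
    try:
        for attempt in range(1, max_attempts + 1):
            try:
                return commit()
            except RetriableError:
                if complete_on_error or attempt == max_attempts:
                    raise
    finally:
        locks.release(db_name)


# Scenario from the description: the commit step hits a stepdown.
attempts = []


def failing_commit():
    attempts.append(1)
    raise RetriableError("config server stepdown during commit")


locks = DistLockManager()
try:
    run_coordinator(locks, "testDB", failing_commit, complete_on_error=True)
except RetriableError:
    pass  # the error reaches the caller; the operation is not retried

print(len(attempts), locks.held)
```

      Without the `finally` clause, the early exit taken when complete_on_error is true would skip the release, which is the leak this ticket describes.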

            People

              marcos.grillo@mongodb.com Marcos José Grillo Ramirez