  Core Server / SERVER-62521

Distributed locks might not be released on definite error when using a DDL coordinator

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v5.3, v5.2, v5.0
    • Sprint: Sharding EMEA 2022-02-07, Sharding EMEA 2022-02-21, Sharding EMEA 2022-03-07, Sharding EMEA 2022-03-21

      Some implementations of the DDL coordinator (such as movePrimary) are designed not to always make forward progress on retriable errors. These classes set the _completeOnError flag, which prevents the operation from being retried when a retriable error occurs.
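      The retry policy described above can be sketched as follows. This is a minimal illustration in Python, not the server's C++ code: `RetriableError`, `run_coordinator`, `complete_on_error`, and the `max_attempts` bound are hypothetical names and assumptions introduced only for this example.

```python
class RetriableError(Exception):
    """Hypothetical stand-in for a retriable error such as a stepdown."""


def run_coordinator(operation, complete_on_error, max_attempts=3):
    """Run `operation`, retrying on RetriableError unless complete_on_error is set.

    Returns the number of attempts made on success and re-raises the error
    when giving up. The attempt bound is an assumption made for this sketch.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return attempt
        except RetriableError:
            # With the flag set, the coordinator completes with the error
            # instead of retrying the operation.
            if complete_on_error or attempt == max_attempts:
                raise
```

      With `complete_on_error=False` a transient failure is retried; with `complete_on_error=True` the first retriable error ends the run, which is why cleanup has to happen on that path.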

      The purpose of this task is to ensure that the distributed locks are released if a retriable error (such as a stepdown on the config server) occurs in a DDLCoordinator implementation that has the _completeOnError flag set to true. The following scenario is an example:

      • A movePrimary command starts
      • There is a stepdown on the config server when committing

      This leaves the primary node of the primary shard holding the distributed lock for the database. It only affects operations that try to acquire the database distributed lock on the config server after this scenario has occurred.
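      The intended behavior can be sketched as below. This is a toy Python model under stated assumptions, not the actual fix: `DistLockManager`, `run_move_primary`, and `RetriableError` are hypothetical names, and the in-memory set stands in for the real distributed lock state.

```python
class RetriableError(Exception):
    """Hypothetical stand-in for a retriable error such as a config server stepdown."""


class DistLockManager:
    """Toy in-memory lock table; hypothetical, not the server's lock manager."""

    def __init__(self):
        self.held = set()

    def acquire(self, name):
        if name in self.held:
            raise RuntimeError(f"lock {name!r} is already held")
        self.held.add(name)

    def release(self, name):
        self.held.discard(name)


def run_move_primary(locks, db_name, commit, complete_on_error=True):
    """Run a movePrimary-like operation under the database distributed lock."""
    locks.acquire(db_name)
    will_retry = False
    try:
        commit()
    except RetriableError:
        # A coordinator that will retry keeps its lock for the next attempt;
        # one that completes on error must not keep it.
        will_retry = not complete_on_error
        raise
    finally:
        # Release on success, on non-retriable errors, and on retriable
        # errors when the coordinator is not going to retry.
        if not will_retry:
            locks.release(db_name)
```

      In this model, a stepdown during `commit` with `complete_on_error=True` still propagates the error, but a later operation can acquire the same database lock.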

            marcos.grillo@mongodb.com Marcos José Grillo Ramirez