  1. Core Server
  2. SERVER-47422

Use NamespaceSerializer when taking distributed locks for refineCollectionShardKey and migrations

    • Fully Compatible
    • ALL
    • v4.4
    • Sharding 2020-04-20, Sharding 2020-05-04, Sharding 2020-05-18
    • 35

      Before executing, _configsvrRefineCollectionShardKey takes distributed locks on both the sharded collection's database and its full namespace. The namespace lock conflicts with the distributed lock that a migration takes on the same collection namespace. By default, an attempt to take a distributed lock times out after 20 seconds, and the lock has no fairness policy, so in the presence of many concurrent migrations a shard key refine can time out waiting for the distributed lock and fail with LockBusy. To handle a similar problem, some config server DDL commands (e.g. _configsvrDropCollection) take a NamespaceSerializer lock before taking the distributed lock. We should be able to use the NamespaceSerializer here as well, for both refineCollectionShardKey and migrations, to avoid distributed lock timeouts.
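The ordering described above (queue on a local per-namespace lock first, then attempt the distributed lock) can be illustrated with a toy model. This is only a sketch in Python, not the server's C++ implementation: `ToyDistLockManager`, `ToyNamespaceSerializer`, `run_serialized`, and `demo` are hypothetical names, and the model assumes the serializer behaves like a process-local mutex that callers block on, while the distributed lock is a timed try-acquire with no fairness.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class LockBusy(Exception):
    """Raised when the distributed lock is not acquired before the timeout."""


class ToyDistLockManager:
    """Hypothetical stand-in for a distributed lock: timed try-acquire, no fairness."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def lock(self, name, timeout_secs):
        with self._guard:
            lk = self._locks[name]
        if not lk.acquire(timeout=timeout_secs):
            raise LockBusy(f"timed out after {timeout_secs}s waiting for {name}")
        try:
            yield
        finally:
            lk.release()


class ToyNamespaceSerializer:
    """Hypothetical process-local, per-namespace mutex; callers block until it is free."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def lock(self, ns):
        with self._guard:
            lk = self._locks[ns]
        with lk:
            yield


def run_serialized(serializer, dist_locks, ns, timeout_secs, work):
    """Queue locally first, then take the distributed lock and run work()."""
    with serializer.lock(ns):
        with dist_locks.lock(ns, timeout_secs):
            return work()


def demo(num_ops=5, hold_secs=0.02, timeout_secs=0.05):
    """Run num_ops contending operations; return how many failed with LockBusy."""
    serializer, dist_locks = ToyNamespaceSerializer(), ToyDistLockManager()
    failures = []

    def op():
        try:
            run_serialized(serializer, dist_locks, "test.coll", timeout_secs,
                           lambda: time.sleep(hold_secs))
        except LockBusy as exc:
            failures.append(exc)

    threads = [threading.Thread(target=op) for _ in range(num_ops)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(failures)
```

In this model, operations that go through `run_serialized` never contend with each other on the distributed lock, because only the current holder of the local lock attempts it. The total hold time across all operations can therefore exceed the distributed lock timeout without any of them failing. Contenders that bypass the serializer, such as other processes, can still cause LockBusy.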

      If these changes are too invasive, we should instead modify the refineCollectionShardKey concurrency jstests ("random_moveChunk_refine_collection_*.js") to retry refines that fail with a LockBusy error.
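The fallback of retrying refines that fail with LockBusy could look roughly like the following. The actual concurrency tests are JavaScript jstests; this is a language-neutral sketch written in Python, and `retry_on_lock_busy`, `run_refine`, `max_attempts`, and `backoff_secs` are hypothetical names rather than anything from the test suite.

```python
import time


class LockBusy(Exception):
    """Hypothetical stand-in for a command failing with the LockBusy error."""


def retry_on_lock_busy(run_refine, max_attempts=5, backoff_secs=0.0):
    """Call run_refine() until it succeeds or has failed max_attempts times.

    Only LockBusy is retried; any other exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_refine()
        except LockBusy:
            if attempt == max_attempts:
                raise  # Give up and surface the last LockBusy failure.
            time.sleep(backoff_secs)
```

A caller would wrap the refine invocation in a zero-argument callable and pass it as `run_refine`. Bounding the number of attempts keeps a test from spinning indefinitely if migrations hold the lock continuously.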

            matthew.saltz@mongodb.com Matthew Saltz (Inactive)
            jack.mulrow@mongodb.com Jack Mulrow