  1. Core Server
  2. SERVER-60161

Deadlock between config server stepdown and _configsvrRenameCollectionMetadata command


Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • 5.0.0
    • 5.0.4, 5.1.0-rc0
    • Sharding
    • None
    • Fully Compatible
    • ALL
    • v5.0
    • Sharding EMEA 2021-10-04
    • 135

    Description

      The OperationContext for ShardingCatalogManager::renameShardedMetadata() has a logical session checked out while doing an uninterruptible wait on the _kChunkOpLock. If the _kChunkOpLock is currently held (e.g. by a running _configsvrSetAllowMigrations command), then _configsvrRenameCollectionMetadata will block until the _kChunkOpLock is released. In particular, the _configsvrSetAllowMigrations command acquires the _kChunkOpLock and then attempts to acquire additional LockManager locks such as the RSTL IX lock. If a stepdown occurs on the primary, then the RstlKillOpThread interrupts the OperationContext running ShardingCatalogManager::renameShardedMetadata(). But because the wait is uninterruptible, no attention is given to the kill status and the operation keeps waiting with its logical session still checked out. ReplicationCoordinatorImpl::_stepDownFinish(), which holds the RSTL X lock, then blocks in invalidateSessionsForStepdown() while attempting to check out that logical session in order to kill it. The resulting wait-for cycle is:

      • _configsvrRenameCollectionMetadata (holding "logical session" resource) -> _kChunkOpLock
      • _configsvrSetAllowMigrations (holding _kChunkOpLock) -> RSTL IX lock
      • Stepdown (holding RSTL X lock) -> acquiring "logical session" resource
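The three edges above form a cycle in the wait-for graph. As a minimal sketch (not MongoDB code), the cycle can be modeled and checked in a few lines of C++. The names `WaitForGraph`, `findCycle`, and `renameStepdownGraph` are hypothetical helpers introduced only for this illustration, and each party is assumed to wait on exactly one other party.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each entry maps a waiter to the party that currently
// holds the resource it is waiting for (one outgoing edge per waiter).
using WaitForGraph = std::map<std::string, std::string>;

// Follows wait-for edges starting at `start`. Returns the parties that form
// a cycle, or an empty vector if the chain ends at a party that is not waiting.
std::vector<std::string> findCycle(const WaitForGraph& graph, const std::string& start) {
    std::vector<std::string> path;
    std::set<std::string> seen;
    std::string current = start;
    while (seen.insert(current).second) {
        path.push_back(current);
        auto it = graph.find(current);
        if (it == graph.end()) {
            return {};  // `current` waits on nobody, so there is no deadlock
        }
        current = it->second;
    }
    // `current` was visited before: the cycle starts at its first occurrence.
    auto first = std::find(path.begin(), path.end(), current);
    return std::vector<std::string>(first, path.end());
}

// The three edges described in this ticket.
WaitForGraph renameStepdownGraph() {
    return {
        // holds the logical session, waits for _kChunkOpLock
        {"_configsvrRenameCollectionMetadata", "_configsvrSetAllowMigrations"},
        // holds _kChunkOpLock, waits for the RSTL in IX mode
        {"_configsvrSetAllowMigrations", "Stepdown"},
        // holds the RSTL in X mode, waits for the logical session
        {"Stepdown", "_configsvrRenameCollectionMetadata"},
    };
}
```

Removing the first edge, which is what an interruptible wait on the _kChunkOpLock effectively does once the operation is killed, leaves a chain with no cycle.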

      I think the solution here would be to make the _kChunkOpLock and _kZoneOpLock acquisitions interruptible by using the 3-argument constructor for Lock::ExclusiveLock.

      Lock::ExclusiveLock chunkLk(opCtx, opCtx->lockState(), _kChunkOpLock);
      Lock::ExclusiveLock zoneLk(opCtx, opCtx->lockState(), _kZoneOpLock);
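To illustrate why interruptibility breaks the cycle, here is a toy model of an interruptible exclusive-lock wait built on the standard library. It is a sketch under stated assumptions, not the server's Lock::ExclusiveLock implementation: `ToyOperation` and `ToyResourceMutex` are hypothetical types, and interruption is reported through a `bool` return value purely to keep the example small.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for an operation's kill status
// (not MongoDB's OperationContext).
struct ToyOperation {
    bool killed = false;  // guarded by ToyResourceMutex's internal mutex
};

// Hypothetical exclusive lock whose wait can be interrupted by a kill.
class ToyResourceMutex {
public:
    // Blocks until the lock is free or `op` is killed.
    // Returns true if the lock was acquired, false if the wait was interrupted.
    bool lockInterruptibly(ToyOperation& op) {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait(lk, [&] { return !_held || op.killed; });
        if (op.killed) {
            return false;  // give up so the caller can release its session
        }
        _held = true;
        return true;
    }

    void unlock() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _held = false;
        }
        _cv.notify_all();
    }

    // Marks `op` as killed and wakes any waiter so it can observe the flag.
    void kill(ToyOperation& op) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            op.killed = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _held = false;
};
```

In this model a killed waiter returns without the lock, so it can unwind and give up whatever it was holding (the logical session in the scenario above), which lets the stepdown proceed.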
      

            People

              jordi.serra-torrens@mongodb.com Jordi Serra Torrens
              max.hirschhorn@mongodb.com Max Hirschhorn