SERVER-20058 (Core Server)

mongos deadlock while replacing catalog manager

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.1.7
    • Fix Version/s: 3.1.7
    • Component/s: Sharding
    • Labels:
      None

      Description

      The relevant stack trace from the hang analyzer is below. The thing to notice is the re-entrant call into the catalog manager. While a catalog manager call (writeConfigServerDirect) is still in progress, ShardConnection refreshes the sharding metadata through the forwarding catalog manager (ShardRegistry::reload calling getAllShards). If the inner operation detects that the catalog manager needs to be replaced, it waits for the replacement, but the outer operation never drops its lock, so the thread waits forever for the catalog manager to be changed out.

        mongo::ForwardingCatalogManager::waitForCatalogManagerChange() ()
        mongo::ForwardingCatalogManager::getAllShards(std::vector<mongo::ShardType, std::allocator<mongo::ShardType> >*) ()
        mongo::ShardRegistry::reload() ()
        mongo::ShardRegistry::getShard(std::string const&)
       
        mongo::(anonymous namespace)::checkShardVersion(mongo::OperationContext*, mongo::DBClientBase*, std::string const&, std::shared_ptr<mongo::ChunkManager>, bool, int) ()
        mongo::VersionManager::checkShardVersionCB(mongo::OperationContext*, mongo::ShardConnection*, bool, int) ()
        mongo::ShardConnection::_finishInit() ()
        mongo::ShardConnection::get() ()
        mongo::DBClientMultiCommand::sendAll() ()
        mongo::ConfigCoordinator::executeBatch(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManagerLegacy::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::ForwardingCatalogManager::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManager::update(std::string const&, mongo::BSONObj const&, mongo::BSONObj const&, bool, bool, mongo::BatchedCommandResponse*) ()
        mongo::Balancer::_ping(mongo::OperationContext*, bool) ()
        mongo::Balancer::run() ()
        mongo::BackgroundJob::jobBody() ()
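
      The re-entrancy in the trace above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the MongoDB implementation: ForwardingModel, OperationGuard, waitForSwap and nestedSwapWait are hypothetical names, the in-flight counter stands in for whatever lock the real forwarding catalog manager holds, and a short timeout replaces the indefinite wait so that the example terminates.

```cpp
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>

// Hypothetical model (not MongoDB code): tracks operations that are currently
// running inside the catalog manager. A swap may proceed only when none remain.
class ForwardingModel {
public:
    // RAII marker for one in-flight operation, e.g. the outer
    // writeConfigServerDirect call in the stack trace.
    class OperationGuard {
    public:
        explicit OperationGuard(ForwardingModel& model) : _model(model) {
            std::lock_guard<std::mutex> lk(_model._mutex);
            ++_model._inFlight;
        }
        ~OperationGuard() {
            {
                std::lock_guard<std::mutex> lk(_model._mutex);
                --_model._inFlight;
            }
            _model._cv.notify_all();
        }
        OperationGuard(const OperationGuard&) = delete;
        OperationGuard& operator=(const OperationGuard&) = delete;

    private:
        ForwardingModel& _model;
    };

    // Waits until no operation is in flight or the timeout expires.
    // Returns true if the swap could proceed, false on timeout.
    // (Assumption: the real wait has no timeout, which is why it hangs.)
    bool waitForSwap(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_mutex);
        return _cv.wait_for(lk, timeout, [this] { return _inFlight == 0; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    int _inFlight = 0;
};

// Simulates the nested (inner) call waiting for the catalog manager swap on
// the same thread that started the outer operation.
bool nestedSwapWait(bool releaseOuterFirst) {
    ForwardingModel model;
    std::unique_ptr<ForwardingModel::OperationGuard> outer(
        new ForwardingModel::OperationGuard(model));
    if (releaseOuterFirst) {
        outer.reset();  // outer operation drops its hold before the inner wait
    }
    // With the outer guard still held, nothing on this thread can ever
    // release it, so the wait can only end by timing out.
    return model.waitForSwap(std::chrono::milliseconds(50));
}
```

      In this model, nestedSwapWait(false) times out because the waiting thread is the one holding the outer guard, which mirrors the hang in the trace. nestedSwapWait(true) returns immediately because the outer hold is released before the wait begins.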