  1. Core Server
  2. SERVER-20058

mongos deadlock while replacing catalog manager

Details

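    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.1.7
    • Fix Version/s: 3.1.7
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Sharding 8 08/28/15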

    Description

      The relevant stack trace from the hang analyzer is below. Note the reentrancy into the catalog manager: inside a catalog manager call, ShardConnection refreshes the sharding metadata through the forwarding catalog manager. If the inner operation detects that the catalog manager needs to be replaced, it waits for the replacement without dropping the lock held by the outer operation. The replacement therefore can never proceed, and the thread waits forever.

        mongo::ForwardingCatalogManager::waitForCatalogManagerChange() ()
        mongo::ForwardingCatalogManager::getAllShards(std::vector<mongo::ShardType, std::allocator<mongo::ShardType> >*) ()
        mongo::ShardRegistry::reload() ()
        mongo::ShardRegistry::getShard(std::string const&)
       
        mongo::(anonymous namespace)::checkShardVersion(mongo::OperationContext*, mongo::DBClientBase*, std::string const&, std::shared_ptr<mongo::ChunkManager>, bool, int) ()
        mongo::VersionManager::checkShardVersionCB(mongo::OperationContext*, mongo::ShardConnection*, bool, int) ()
        mongo::ShardConnection::_finishInit() ()
        mongo::ShardConnection::get() ()
        mongo::DBClientMultiCommand::sendAll() ()
        mongo::ConfigCoordinator::executeBatch(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManagerLegacy::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::ForwardingCatalogManager::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManager::update(std::string const&, mongo::BSONObj const&, mongo::BSONObj const&, bool, bool, mongo::BatchedCommandResponse*) ()
        mongo::Balancer::_ping(mongo::OperationContext*, bool) ()
        mongo::Balancer::run() ()
        mongo::BackgroundJob::jobBody() ()
      

          People

            kaloian.manassiev@mongodb.com Kaloian Manassiev
            schwerin@mongodb.com Andy Schwerin