Core Server / SERVER-20058

mongos deadlock while replacing catalog manager

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: 3.1.7
    • Fix Version/s: 3.1.7
    • Component/s: Sharding
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Sprint:
      Sharding 8 08/28/15

      Description

      The relevant stack trace from the hang analyzer is below. The key point is the re-entrancy into the catalog manager: inside a catalog manager call, ShardConnection refreshes the sharding metadata through the forwarding catalog manager. If the inner operation detects that the catalog manager needs to be changed, the lock held by the outer operation is never dropped, so the thread waits forever for the catalog manager to be swapped out.

        mongo::ForwardingCatalogManager::waitForCatalogManagerChange() ()
        mongo::ForwardingCatalogManager::getAllShards(std::vector<mongo::ShardType, std::allocator<mongo::ShardType> >*) ()
        mongo::ShardRegistry::reload() ()
        mongo::ShardRegistry::getShard(std::string const&)
       
        mongo::(anonymous namespace)::checkShardVersion(mongo::OperationContext*, mongo::DBClientBase*, std::string const&, std::shared_ptr<mongo::ChunkManager>, bool, int) ()
        mongo::VersionManager::checkShardVersionCB(mongo::OperationContext*, mongo::ShardConnection*, bool, int) ()
        mongo::ShardConnection::_finishInit() ()
        mongo::ShardConnection::get() ()
        mongo::DBClientMultiCommand::sendAll() ()
        mongo::ConfigCoordinator::executeBatch(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManagerLegacy::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::ForwardingCatalogManager::writeConfigServerDirect(mongo::BatchedCommandRequest const&, mongo::BatchedCommandResponse*) ()
        mongo::CatalogManager::update(std::string const&, mongo::BSONObj const&, mongo::BSONObj const&, bool, bool, mongo::BatchedCommandResponse*) ()
        mongo::Balancer::_ping(mongo::OperationContext*, bool) ()
        mongo::Balancer::run() ()                                                            
        mongo::BackgroundJob::jobBody() ()
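
      The re-entrancy in the trace above can be modeled with a small standalone sketch. It is not the MongoDB implementation: `ForwardingSketch`, `call`, `swapLoop`, `demonstrate`, the operation counter, and the 300 ms timeout are all hypothetical, chosen only for illustration. The sketch assumes the swap can proceed only once no forwarded operation is in flight, and it uses a timed wait in place of the indefinite wait so that the example terminates instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a forwarding catalog manager; not MongoDB code.
class ForwardingSketch {
public:
    ForwardingSketch() : _swapper([this] { swapLoop(); }) {}

    ~ForwardingSketch() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _shutdown = true;
        }
        _cv.notify_all();
        _swapper.join();
    }

    // Runs `op` as a tracked operation. When `needsSwap` is true, the call
    // releases its own slot and waits for the swapper thread to install a new
    // generation. Returns false if that wait times out (the modeled hang).
    bool call(bool needsSwap, const std::function<bool()>& op) {
        std::unique_lock<std::mutex> lk(_mutex);
        ++_activeOps;
        if (needsSwap) {
            --_activeOps;  // drops only this call's slot, not an outer caller's
            const int seen = _generation;
            _swapRequested = true;
            _cv.notify_all();
            const bool swapped =
                _cv.wait_for(lk, std::chrono::milliseconds(300),
                             [&] { return _generation != seen; });
            if (!swapped) {
                _swapRequested = false;  // give up; real code would wait forever
                return false;
            }
            ++_activeOps;
        }
        lk.unlock();
        const bool ok = op();  // may re-enter call() on the same thread
        lk.lock();
        --_activeOps;
        _cv.notify_all();
        return ok;
    }

    int generation() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _generation;
    }

private:
    // Swaps only when a swap was requested and no operation is in flight.
    void swapLoop() {
        std::unique_lock<std::mutex> lk(_mutex);
        for (;;) {
            _cv.wait(lk, [this] {
                return _shutdown || (_swapRequested && _activeOps == 0);
            });
            if (_shutdown) return;
            ++_generation;
            _swapRequested = false;
            _cv.notify_all();
        }
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    int _activeOps = 0;
    int _generation = 0;
    bool _swapRequested = false;
    bool _shutdown = false;
    std::thread _swapper;  // declared last so the other members exist first
};

// Contrasts a top-level call with a re-entrant one.
void demonstrate() {
    ForwardingSketch mgr;
    const auto noop = [] { return true; };

    // Top-level call: its slot is released, so the swap goes through.
    const bool standalone = mgr.call(true, noop);
    assert(standalone);
    assert(mgr.generation() == 1);

    // Re-entrant call: the outer slot stays held, so the swap never happens.
    const bool nested = mgr.call(false, [&] { return mgr.call(true, noop); });
    assert(!nested);
    assert(mgr.generation() == 1);
}
```

      In this model the inner call corresponds to the `getAllShards` frame reached through `ShardConnection`, and the outer call to `writeConfigServerDirect`. The outer operation's slot stays held while the inner one waits, so the swapper never sees an idle manager.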
      

            People

            Assignee:
            kaloian.manassiev Kaloian Manassiev
            Reporter:
            schwerin Andy Schwerin