Core Server / SERVER-47175

Possible shutdown-order deadlock between LogicalSessionsCache and ReplicationCoordinator

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • 4.2.5, 4.0.17
    • Sharding
    • None
    • Sharding
    • ALL

    Description

      There is a possible shutdown-order deadlock between the LogicalSessionsCache and the ReplicationCoordinator, which looks like this:

      The LogicalSessionsCache's thread calls into the catalog cache in order to fetch routing info for the config.system.sessions collection.

      The catalog cache has always performed network operations under a mutex, and on the config server these convert to local storage engine/disk operations. This means that if the LogicalSessionCache calls it at an inopportune moment, the LogicalSessionCache's thread could block waiting for the majority snapshot to advance. The mutex itself is not relevant here; what matters is that, because of ShardLocal, the operations convert to local reads on the config server.

      The LogicalSessionsCache is shut down and joined before the transport layer, and all of this happens before ReplicationCoordinator::shutdown. This means that the replication coordinator depends on the LogicalSessionCache to shut down before it itself shuts down, while the LogicalSessionCache's thread may be blocked waiting for the majority snapshot to advance, which in turn depends on replication. That is a circular dependency.

      The only thing that prevents this deadlock is that the shutdown command happens to step down the replication coordinator first, but this is a coincidental and lucky occurrence that could be inadvertently broken.
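
      The ordering problem can be illustrated with a small toy model. This is only a sketch, not the server's code: ToyReplCoordinator, wait_for_majority_snapshot, step_down and shutdown are hypothetical names, and it assumes that a stepdown releases readers blocked on the majority snapshot. A join timeout stands in for the real, unbounded join so that the demonstration terminates.

```python
import threading


class ToyReplCoordinator:
    """Hypothetical stand-in; not the real ReplicationCoordinator API."""

    def __init__(self):
        self._wake = threading.Event()

    def wait_for_majority_snapshot(self):
        # Blocks until the snapshot advances or blocked waiters are released.
        self._wake.wait()

    def step_down(self):
        # Assumption: stepping down releases readers blocked on the snapshot.
        self._wake.set()


def shutdown(step_down_first, join_timeout=0.2):
    """Return True if joining the session-cache thread would have hung."""
    repl = ToyReplCoordinator()

    # Session-cache refresh thread: on a config server, its routing-info
    # fetch becomes a local read that waits on the majority snapshot.
    refresher = threading.Thread(
        target=repl.wait_for_majority_snapshot, daemon=True
    )
    refresher.start()

    if step_down_first:
        repl.step_down()

    # The session cache is joined before the replication coordinator is
    # shut down. The real join has no timeout; one is used here only so
    # that the sketch can report the hang instead of hanging itself.
    refresher.join(timeout=join_timeout)
    stuck = refresher.is_alive()

    # Release the toy thread so the demonstration exits cleanly.
    repl.step_down()
    refresher.join()
    return stuck
```

      In this model, shutdown(step_down_first=False) reports that the join would hang, while shutdown(step_down_first=True) completes, which mirrors why the current behaviour of the shutdown command masks the problem.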

          People

            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            kaloian.manassiev@mongodb.com Kaloian Manassiev