Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-32448

ShardRegistry::reload() does blocking work on NetworkInterface thread

    • Type: Icon: Bug Bug
    • Resolution: Works as Designed
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 3.4.14, 3.6.4
    • Component/s: Networking
    • Sharding
    • ALL
    • Platforms 2018-01-01, Platforms 2018-01-15

      The ShardRegistry schedules itself to perform _internalReload()s periodically, using the TaskExecutor. These jobs are dispatched to NetworkInterfaceASIO, and will eventually run on its thread.

      From _internalReload(), we call reload(), which tries to getAllShards() from the ShardingCatalogClientImpl. getAllShards() makes a Fetcher instance, which launches networking work, and then waits for it to join(). However, when we run this, we are already on NetworkInterfaceASIO's thread. This breaks the contract that callbacks to NetworkInterfaceASIO may not perform blocking work. Worse, when these calls are issued through the same TaskExecutor, the thread will deadlock.

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            samantha.ritter@mongodb.com Samantha Ritter (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: