Core Server / SERVER-32448

ShardRegistry::reload() does blocking work on NetworkInterface thread

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 3.4.14, 3.6.4
    • Fix Version/s: None
    • Component/s: Networking
    • Labels:
    • Operating System:
      ALL
    • Sprint:
      Platforms 2018-01-01, Platforms 2018-01-15

      Description

      The ShardRegistry periodically schedules itself to run _internalReload() through the TaskExecutor. These jobs are dispatched to NetworkInterfaceASIO and eventually run on its thread.

      _internalReload() calls reload(), which calls getAllShards() on the ShardingCatalogClientImpl. getAllShards() creates a Fetcher instance, which launches networking work, and then blocks in join() until that work completes. At that point, however, we are already on NetworkInterfaceASIO's thread. This breaks the contract that callbacks run by NetworkInterfaceASIO must not perform blocking work. Worse, when these calls are issued through the same TaskExecutor, the thread deadlocks: it is blocked waiting for work that only it can run.
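The blocking pattern described above can be reproduced in isolation. The following is a minimal sketch, not MongoDB code: SingleThreadExecutor, waitInsideCallbackSucceeds, and waitFromCallerSucceeds are hypothetical names, and the single worker thread only stands in for the NetworkInterfaceASIO thread. It assumes that the work being waited on must run on the same thread that is doing the waiting. A timed wait is used so that the sketch terminates instead of hanging.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical executor with exactly one worker thread (a stand-in for the
// single thread that runs callbacks in the scenario described above).
class SingleThreadExecutor {
public:
    SingleThreadExecutor() : _worker([this] { run(); }) {}

    ~SingleThreadExecutor() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _done = true;
        }
        _cv.notify_one();
        _worker.join();  // the worker drains any queued tasks before exiting
    }

    void schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _queue.push_back(std::move(task));
        }
        _cv.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(_mutex);
                _cv.wait(lk, [this] { return _done || !_queue.empty(); });
                if (_queue.empty())
                    return;  // shutdown requested and nothing left to run
                task = std::move(_queue.front());
                _queue.pop_front();
            }
            task();  // run outside the lock so tasks can schedule more work
        }
    }

    std::mutex _mutex;
    std::condition_variable _cv;
    std::deque<std::function<void()>> _queue;
    bool _done = false;
    std::thread _worker;  // declared last: starts after the other members exist
};

// A callback schedules follow-up work on its own executor and then blocks on
// it. The follow-up cannot start until the callback returns, so the wait can
// only end by timing out. Returns true if the follow-up finished in time.
bool waitInsideCallbackSucceeds(std::chrono::milliseconds timeout) {
    std::promise<bool> outcome;
    std::future<bool> result = outcome.get_future();
    SingleThreadExecutor executor;  // destroyed (joined) before `outcome`

    executor.schedule([&] {
        auto inner = std::make_shared<std::promise<void>>();
        std::future<void> innerDone = inner->get_future();
        executor.schedule([inner] { inner->set_value(); });
        // Blocking wait on the executor's only thread.
        outcome.set_value(innerDone.wait_for(timeout) == std::future_status::ready);
    });
    return result.get();
}

// The same wait performed by the calling thread: the worker is free to run
// the task, so the wait completes.
bool waitFromCallerSucceeds(std::chrono::milliseconds timeout) {
    std::promise<void> done;
    std::future<void> doneFuture = done.get_future();
    SingleThreadExecutor executor;  // destroyed (joined) before `done`
    executor.schedule([&done] { done.set_value(); });
    return doneFuture.wait_for(timeout) == std::future_status::ready;
}
```

Calling waitInsideCallbackSucceeds(std::chrono::milliseconds(200)) returns false, because the follow-up task stays queued behind the callback that is waiting for it. With an untimed wait, which is the closest analogue of join() in this sketch, the thread would never make progress.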
