  1. Core Server
  2. SERVER-33660

Once getMores include lsid, sharded aggregations with $mergeCursors can hang

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.7.3
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Query 2018-03-12

      A deadlock is induced on the SessionCatalog:

      1. The operation performing the merging half of the pipeline checks out the Session for that lsid.
      2. That operation includes a $mergeCursors, which performs getMores on remote hosts, one of which is the same host performing the $mergeCursors.
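      3. Once getMores include the lsid, the getMore sent to that same host will attempt to check out the same Session and block on a mutex in the SessionCatalog. The merging operation still has the Session checked out and is itself waiting on that getMore, so neither can make progress.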
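
      The three steps above can be sketched with a toy model. This is only an illustration, not the server's code: ToySessionCatalog, local_get_more, and run_merging_half are hypothetical names, and a timeout stands in for what would be an indefinite wait so that the sketch terminates.

```python
import threading


class ToySessionCatalog:
    """Hypothetical stand-in: at most one checkout per lsid at a time."""

    def __init__(self):
        self._guard = threading.Lock()
        self._sessions = {}  # lsid -> lock representing the checked-out Session

    def check_out(self, lsid, timeout):
        with self._guard:
            session_lock = self._sessions.setdefault(lsid, threading.Lock())
        # Returns False if the Session is still checked out after `timeout` seconds.
        return session_lock.acquire(timeout=timeout)

    def check_in(self, lsid):
        self._sessions[lsid].release()


def local_get_more(catalog, lsid, timeout):
    # Step 3: a getMore that carries the lsid tries to check out the same Session.
    if not catalog.check_out(lsid, timeout):
        return "blocked"
    try:
        return "batch"
    finally:
        catalog.check_in(lsid)


def run_merging_half(catalog, lsid, timeout=0.2):
    # Step 1: the merging half checks out the Session for this lsid.
    if not catalog.check_out(lsid, timeout):
        raise RuntimeError("session unexpectedly busy")
    try:
        # Step 2: the merge issues a getMore that lands on this same host
        # and waits for its result while still holding the Session.
        results = []
        worker = threading.Thread(
            target=lambda: results.append(local_get_more(catalog, lsid, timeout))
        )
        worker.start()
        worker.join()
        return results[0]
    finally:
        catalog.check_in(lsid)
```

      In this model, run_merging_half reports "blocked" because the getMore can never obtain the Session while the merge holds it; without the timeout, both sides would wait forever.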

      As a short-term fix, we should do the following:

      1. Only check out the Session if the operation includes a transaction number.
      2. Ban aggregations with a transaction number on mongos.
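
      A minimal sketch of the two rules above, assuming commands are plain dictionaries with lsid and txnNumber fields. The helper names should_check_out_session and reject_aggregate_with_txn_number are hypothetical and do not refer to functions in the server.

```python
def should_check_out_session(cmd):
    # Rule 1: only check out the Session when the operation carries a
    # transaction number in addition to an lsid.
    return "lsid" in cmd and "txnNumber" in cmd


def reject_aggregate_with_txn_number(cmd):
    # Rule 2 (mongos side): refuse aggregations that carry a transaction
    # number, so the merging half never holds a checked-out Session.
    if "aggregate" in cmd and "txnNumber" in cmd:
        raise ValueError("aggregate with a transaction number is not allowed")
    return cmd
```

      Under these rules, a getMore that only carries an lsid skips the checkout entirely, so it can no longer block behind the merging operation.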

      As a long-term fix, we will investigate not using getMores over the network for what are really local reads. If that proves difficult, we will have to re-evaluate.

            Assignee:
            charlie.swanson@mongodb.com Charlie Swanson
            Reporter:
            charlie.swanson@mongodb.com Charlie Swanson
            Votes:
            0
            Watchers:
            6
