  1. Core Server
  2. SERVER-33660

Once getMores include lsid, sharded aggregations with $mergeCursors can hang

Details

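    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.7.3
    • Component/s: Aggregation Framework
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Query 2018-03-12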

    Description

      A deadlock is induced on the SessionCatalog:

      1. The operation performing the merging half of the pipeline checks out the Session for that lsid.
      2. That operation includes a $mergeCursors, which performs getMores on remote hosts, one of which is the same host performing the $mergeCursors.
      3. The getMore sent to that same host will attempt to check out the same Session once getMores include the lsid, blocking on a mutex in the SessionCatalog while the merging operation still holds the Session.
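
      The three steps above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToySessionCatalog`, `local_get_more`, and `merging_aggregation` are hypothetical names, not the server's actual classes or functions, and the local getMore is modeled as a synchronous call because the merging operation waits for its reply. A timeout stands in for what would be an indefinite hang.

```python
import threading


class ToySessionCatalog:
    """Hypothetical stand-in for the SessionCatalog; not the real server API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._checked_out = set()

    def check_out(self, lsid, timeout=None):
        """Wait until the session for lsid is free. Returns False on timeout."""
        with self._cond:
            free = self._cond.wait_for(
                lambda: lsid not in self._checked_out, timeout
            )
            if free:
                self._checked_out.add(lsid)
            return free

    def check_in(self, lsid):
        with self._cond:
            self._checked_out.discard(lsid)
            self._cond.notify_all()


def local_get_more(catalog, lsid, timeout):
    # Step 3: the getMore carries the lsid, so it tries to check out the
    # same session that the merging operation already holds.
    if not catalog.check_out(lsid, timeout):
        return "blocked"
    try:
        return "batch"
    finally:
        catalog.check_in(lsid)


def merging_aggregation(catalog, lsid, timeout=0.2):
    # Step 1: the merging half checks out the session for its lsid.
    if not catalog.check_out(lsid, timeout):
        return "blocked"
    try:
        # Step 2: $mergeCursors issues a getMore to the same host and
        # waits for the reply while still holding the session.
        return local_get_more(catalog, lsid, timeout)
    finally:
        catalog.check_in(lsid)


catalog = ToySessionCatalog()
result = merging_aggregation(catalog, "lsid-1")
print(result)
```

      In this sketch the inner checkout can never succeed, because the only operation able to release the session is the one waiting on the getMore.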

      As a short term fix, we should do the following:

      1. Only check out the Session if the operation includes a transaction number.
      2. Ban aggregations with a transaction number on mongos.
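
      A minimal sketch of how the two short-term rules might interact, assuming commands are plain dictionaries. `ToySessionCatalog`, `run_on_shard`, and `mongos_validate` are hypothetical helpers invented for illustration; they do not reflect the actual patch or the server's code.

```python
import threading


class ToySessionCatalog:
    """Hypothetical stand-in for the SessionCatalog; not the real server API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._checked_out = set()

    def check_out(self, lsid, timeout=None):
        with self._cond:
            free = self._cond.wait_for(
                lambda: lsid not in self._checked_out, timeout
            )
            if free:
                self._checked_out.add(lsid)
            return free

    def check_in(self, lsid):
        with self._cond:
            self._checked_out.discard(lsid)
            self._cond.notify_all()


def run_on_shard(catalog, command, body, timeout=0.2):
    """Rule 1: check out the session only if a transaction number is present."""
    needs_session = "lsid" in command and "txnNumber" in command
    if needs_session and not catalog.check_out(command["lsid"], timeout):
        return "blocked"
    try:
        return body()
    finally:
        if needs_session:
            catalog.check_in(command["lsid"])


def mongos_validate(command):
    """Rule 2: reject aggregations that carry a transaction number."""
    if "aggregate" in command and "txnNumber" in command:
        raise ValueError("aggregate with txnNumber is not allowed on mongos")


catalog = ToySessionCatalog()
agg = {"aggregate": "coll", "lsid": "lsid-1"}
get_more = {"getMore": 1, "lsid": "lsid-1"}

mongos_validate(agg)  # passes: no txnNumber on the aggregation
result = run_on_shard(
    catalog,
    agg,
    # The nested getMore has an lsid but no txnNumber, so it skips checkout.
    lambda: run_on_shard(catalog, get_more, lambda: "batch"),
)
print(result)
```

      With both rules in place, neither the merging aggregation nor its local getMore checks out the Session in this model, so the nested call cannot block on itself.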

      As a long term fix, we will investigate not using getMores over the network for what are really local reads. If that proves difficult, we will have to re-evaluate.

            People

              charlie.swanson@mongodb.com Charlie Swanson