Core Server / SERVER-53760

$unwind + $sort pipeline produces large number of file handles when spilling to disk

    Details

    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Requested:
      v4.4
    • Sprint:
      Query 2021-01-25, Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-05-17

      Description

      Suppose we have an aggregation that features a $unwind followed by a $sort that spills to disk. $unwind does not produce owned documents; rather, it updates the field being unwound in memory while preserving the backing BSON. The sorter relies on 'memUsageForSorter()' (which simply calls 'Document::getApproximateSize()') to compute its current memory usage and decide whether to spill. Crucially, 'getApproximateSize()' includes the size of the backing BSON (the document that is input to $unwind), which is far larger than each document produced by $unwind. As a result, the $sort that follows reaches its memory limit quickly and, when spilling to disk, opens an enormous number of file handles compared to 4.2.
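
      The accounting effect described above can be sketched with a toy model. This is not the server's sorter: the names 'UnwoundDoc' and 'count_spills', the byte sizes, and the round 100,000,000-byte limit are all assumptions chosen for illustration. The model only shows how charging the full backing document to every unwound document inflates the number of spills, each of which would correspond to another file on disk.

```python
from dataclasses import dataclass


@dataclass
class UnwoundDoc:
    """Hypothetical stand-in for one document emitted by $unwind."""
    own_bytes: int      # bytes unique to this output document
    backing_bytes: int  # size of the shared input document it still references


def count_spills(docs, limit_bytes, include_backing):
    """Count how often a toy sorter would spill under the given accounting.

    include_backing=True charges every document for the whole backing
    buffer (the behaviour described in this ticket); False charges only
    the bytes the document itself contributes.
    """
    spills = 0
    used = 0
    for doc in docs:
        used += doc.own_bytes
        if include_backing:
            used += doc.backing_bytes
        if used >= limit_bytes:
            spills += 1  # one more sorted run written to disk
            used = 0     # the in-memory buffer is emptied after a spill
    return spills


# Illustrative numbers only: one 1,000,000-byte input document whose array
# unwinds into 10,000 elements of 100 bytes each.
DOCS = [UnwoundDoc(own_bytes=100, backing_bytes=1_000_000) for _ in range(10_000)]
LIMIT = 100_000_000  # assumed round-number memory limit for the toy sorter

spills_with_backing = count_spills(DOCS, LIMIT, include_backing=True)
spills_without_backing = count_spills(DOCS, LIMIT, include_backing=False)
```

      With these assumed numbers, the toy sorter hits its limit after every 100 documents when the backing size is included, whereas the same input fits in memory without spilling when only each document's own bytes are counted.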

            People

            Assignee:
            mohammad.dashti Mohammad Dashti
            Reporter:
            mihai.andrei Mihai Andrei
            Votes:
            6
            Watchers:
            20
