  1. Core Server
  2. SERVER-53760

$unwind + $sort pipeline produces large number of file handles when spilling to disk

    • Fully Compatible
    • ALL
    • v4.4
    • Query 2021-01-25, Query Execution 2021-04-19, Query Execution 2021-05-03, Query Execution 2021-05-17

      Suppose we have an aggregation which features a $unwind followed by a $sort that spills to disk. $unwind does not produce owned documents; instead, it updates the field being unwound in memory while preserving the backing BSON. The sorter relies on 'memUsageForSorter()' (which simply calls 'Document::getApproximateSize()') to compute the current memory usage and determine whether it should spill. Crucially, 'getApproximateSize()' includes the size of the backing BSON (the document that is input to $unwind), which is massive compared to each document produced by $unwind. Every unwound document is therefore charged the full size of its input document, so the $sort that follows reaches its memory limit quickly and creates an enormous number of file handles when spilling to disk, compared to 4.2.
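
      The effect of the inflated size estimate can be illustrated with a toy model. This is only a sketch and not the server's sorter: the function name, the sizes, and the assumption that each spill opens one new file are all hypothetical, chosen to show how charging every unwound document for its backing BSON multiplies the number of spills.

```python
def count_spills(doc_sizes, memory_limit):
    """Count spills for a toy sorter that flushes its buffer
    (assumed here to mean one new spill file) whenever the
    accounted memory reaches memory_limit."""
    spills = 0
    used = 0
    for size in doc_sizes:
        used += size
        if used >= memory_limit:
            spills += 1
            used = 0  # buffer is emptied after a spill
    return spills


# All values below are hypothetical, picked only for illustration.
BACKING_BSON_SIZE = 1_000_000  # size of the document fed into $unwind
ARRAY_LENGTH = 1_000           # number of elements in the unwound array
ELEMENT_DOC_SIZE = 1_000       # real size of each document $unwind emits
MEMORY_LIMIT = 100_000_000     # sorter memory budget

# Accounting as described in the ticket: every unwound document is
# charged for the whole backing BSON in addition to its own size.
inflated = [BACKING_BSON_SIZE + ELEMENT_DOC_SIZE] * ARRAY_LENGTH

# Accounting if each unwound document were charged only for itself.
owned = [ELEMENT_DOC_SIZE] * ARRAY_LENGTH

inflated_spills = count_spills(inflated, MEMORY_LIMIT)
owned_spills = count_spills(owned, MEMORY_LIMIT)

print("spills with backing BSON counted:", inflated_spills)
print("spills with owned-size accounting:", owned_spills)
```

      In this model the same input spills repeatedly under the inflated accounting and not at all when only the unwound document's own size is counted, which mirrors the difference in file handle counts reported against 4.2.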

            mohammad.dashti@mongodb.com Mohammad Dashti (Inactive)
            mihai.andrei@mongodb.com Mihai Andrei