$top and $bottom ignore sortBy clause when preceeding $sort stage makes it eligible for DISTINCT_SCAN optimization

XMLWordPrintableJSON

    • Query Optimization
    • Fully Compatible
    • ALL
    • v8.2, v8.0
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Issue Status as of March 17, 2026

      SUMMARY
      On MongoDB Server v8.0+ it is possible to get incorrect results for $sort + $group with $bottom or $top accumulators queries in some cases when a distinct optimization is incorrectly applied. The error is deterministic.

      ISSUE DESCRIPTION AND IMPACT
      The issue affects $sort + $group aggregation queries with $bottom/$top accumulators eligible for distinct scan optimization in the presence of an index satisfying the sorting pattern of the $sort stage.

      For example:

      Index: { a: 1, b: 1 }
      Aggregation pipeline: 
      [
        { $sort: { a: 1, b: 1 } },
        {
          $group: {
            _id: "$a",
            max_b: {
              $top: {
                output: "$b",
                sortBy: { b: -1 }
              }
            }
          }
        }
      ]
      
      

      With input documents:

      	
      [ { a: 'a', b: 1 }, { a: 'a', b: 2 } ]
      
      

      Expected output:

      { _id: 'a', max_b: 2 }
      
      

      Actual (incorrect) output:

      { _id: 'a', max_b: 1 }
      
      

      A problem occurs when your application runs an aggregate query where ALL of the following are true:

      • A $sort stage is immediately followed by a $group stage
      • The $group stage contains $top or $bottom accumulators
      • The query uses a compound index that satisfies the pattern in the $sort stage
      • The group key ('_id' in $group) is a single field and is the prefix of the index
      • The accumulator's sortBy is referencing a field that immediately follows the leading index field, and does not match the sort direction in $sort

      The behavior is deterministic: the same query over the same data will consistently return the same incorrect result.

      DIAGNOSIS AND AFFECTED VERSIONS
      Users running queries as described above prior to MongoDB 8.0.20, 8.2.4 may have been affected. Versions 8.2.x are only affected if the featureFlagShardFilteringDistinctScan is changed from the default value of 'true' to 'false'.

      To confirm if the issue currently impacts a particular query, users can
      Validate that their query matches all the criteria described above
      Then run a slightly modified query and compare its output with the output of the original query

      For example, the query above can be rewritten by replacing $top with a $topN accumulator, and adding the "n" parameter with the limit value greater than 1, as shown below.

      Index: { a: 1, b: 1 }
      Aggregation pipeline: 
      [
        { $sort: { a: 1, b: 1 } },
        {
          $group: {
            _id: "$a",
            max_b: {
              $topN: {
                output: "$b",
                sortBy: { b: -1 }
                n: 2
              }
            }
          }
        }
      ]
      
      

      If the top document in the output doesn't match with the output of the original query, then the original query is impacted.

      REMEDIATION AND WORKAROUNDS
      In some cases when the $group’s aggregation function specifies the opposite to the $sort’s ordering it is possible to fix by replacing $bottom by $top function and vice versa, and by fixing the sort order within the aggregation function, e.g. for the query below

      {$sort: {a: 1, b: 1, c: 1}},
      {$group: { 
        _id: "$a", 
        d: { $top: {
                output: "$b", 
                sortBy: {b: -1, c: -1}}
            }
        }
      }
      
      

      The following workaround is possible:

      {$sort: {a: 1, b: 1, c: 1}},
      {$group: { 
        _id: "$a", 
        d: { $bottom: {
                output: "$b", 
                sortBy: {b: 1, c: 1}}
            }
        }
      }
      
      

      The only available workaround in other instances is to either hide the index utilized by the $sort and $group stages or to rewrite the query to prevent that index from being used. Implementing this workaround, however, may significantly impact performance.


      Original description

      I have a collection with elements of the following structure:

      { device: "M57906", date: some date }

      and one index { device: 1, date: 1 }

      The query with $bottom does not return the correct result.
      This only happens when the index exists.

      h4s> db.events.aggregate([{ $sort: { device: 1, date: 1 } }, { $group: { _id: "$device", obj: { $bottom: { output: "$date", sortBy: { date: -1 } } } } }])
      [ \{ _id: 'M57906', obj: ISODate('2025-08-28T09:46:33.017Z') } ]
      h4s> db.events.aggregate([{ $sort: { device: 1, date: 1 } }, { $group: { _id: "$device", obj: { $bottomN: { n: 1, output: "$date", sortBy: { date: -1 } } } } }])
      [ \{ _id: 'M57906', obj: [ ISODate('2025-07-06T00:00:01.305Z') ] } ]
      

      winning plan for $bottom query:

                  isCached: false,
                  stage: 'PROJECTION_COVERED',
                  transformBy: \{ date: 1, device: 1, _id: 0 },
                  inputStage: {
                    stage: 'DISTINCT_SCAN',
                    keyPattern: \{ device: 1, date: 1 },
                    indexName: 'device_1_date_1',
                    isMultiKey: false,
                    multiKeyPaths: \{ device: [], date: [] },
                    isUnique: false,
                    isSparse: false,
                    isPartial: false,
                    indexVersion: 2,
                    direction: 'backward',
                    indexBounds: {
                      device: [ '[MaxKey, MinKey]' ],
                      date: [ '[MaxKey, MinKey]' ]
                    }
                  }
      

      winning plan for $bottomN: n: 1 query

      {
        isCached: false,
        queryPlan: {
          stage: 'GROUP',
          planNodeId: 3,
          inputStage: {
            stage: 'PROJECTION_COVERED',
            planNodeId: 2,
            transformBy: \{ date: true, device: true, _id: false },
            inputStage: {
              stage: 'IXSCAN',
              planNodeId: 1,
              keyPattern: \{ device: 1, date: 1 },
              indexName: 'device_1_date_1',
              isMultiKey: false,
              multiKeyPaths: \{ device: [], date: [] },
              isUnique: false,
              isSparse: false,
              isPartial: false,
              indexVersion: 2,
              direction: 'forward',
              indexBounds: {
                device: [ '[MinKey, MaxKey]' ],
                date: [ '[MinKey, MaxKey]' ]
              }
            }
          }
        },
        slotBasedPlan: {
          slots: '$$RESULT=s9 env: \{ s4 = SortSpec({"date" : -1}) }',
          stages: '[3] project [s9 = newBsonObj("_id", s6, "obj", s8)] \n' +
            '[3] project [s8 = aggBottomNFinalize(s7, s4)] \n' +
            '[3] group [s6] [s7 = aggBottomN(s4, null, getSortKeyDesc(s2), (s2 ?: null)) init\{[[], 0ll, 1ll, 0, 104857600, true]}] spillSlots[s5] mergingExprs[aggBottomNMerge(s5, s4)] \n' +
            '[3] project [s6 = (s1 ?: null)] \n' +
            '[1] ixseek KS(0A0A0104) KS(F0F0FE04) none s3 none none lowPriority [s1 = 0, s2 = 1] @"edc48567-cea3-4782-97fe-7173105114f2" @"device_1_date_1" true '
        }
      } 

      $top is also broken in the same way.

            Assignee:
            Alexander Ignatyev
            Reporter:
            Matthias Brehmer
            Votes:
            0 Vote for this issue
            Watchers:
            27 Start watching this issue

              Created:
              Updated:
              Resolved: