-
Type:
Bug
-
Resolution: Fixed
-
Priority:
Critical - P2
-
Affects Version/s: 8.0.13
-
Component/s: None
-
Query Optimization
-
Fully Compatible
-
ALL
-
v8.2, v8.0
-
None
-
None
-
None
-
None
-
None
-
None
-
None
SUMMARY
On MongoDB Server v8.0+ it is possible to get incorrect results for $sort + $group with $bottom or $top accumulators queries in some cases when a distinct optimization is incorrectly applied. The error is deterministic.
ISSUE DESCRIPTION AND IMPACT
The issue affects $sort + $group aggregation queries with $bottom/$top accumulators eligible for distinct scan optimization in the presence of an index satisfying the sorting pattern of the $sort stage.
For example:
Index: { a: 1, b: 1 }
Aggregation pipeline:
[
{ $sort: { a: 1, b: 1 } },
{
$group: {
_id: "$a",
max_b: {
$top: {
output: "$b",
sortBy: { b: -1 }
}
}
}
}
]
With input documents:
[ { a: 'a', b: 1 }, { a: 'a', b: 2 } ]
Expected output:
{ _id: 'a', max_b: 2 }
Actual (incorrect) output:
{ _id: 'a', max_b: 1 }
A problem occurs when your application runs an aggregate query where ALL of the following are true:
- A $sort stage is immediately followed by a $group stage
- The $group stage contains $top or $bottom accumulators
- The query uses a compound index that satisfies the pattern in the $sort stage
- The group key ('_id' in $group) is a single field and is the prefix of the index
- The accumulator's sortBy is referencing a field that immediately follows the leading index field, and does not match the sort direction in $sort
The behavior is deterministic: the same query over the same data will consistently return the same incorrect result.
DIAGNOSIS AND AFFECTED VERSIONS
Users running queries as described above prior to MongoDB 8.0.20, 8.2.4 may have been affected. Versions 8.2.x are only affected if the featureFlagShardFilteringDistinctScan is changed from the default value of 'true' to 'false'.
To confirm if the issue currently impacts a particular query, users can
Validate that their query matches all the criteria described above
Then run a slightly modified query and compare its output with the output of the original query
For example, the query above can be rewritten by replacing $top with a $topN accumulator, and adding the "n" parameter with the limit value greater than 1, as shown below.
Index: { a: 1, b: 1 }
Aggregation pipeline:
[
{ $sort: { a: 1, b: 1 } },
{
$group: {
_id: "$a",
max_b: {
$topN: {
output: "$b",
sortBy: { b: -1 }
n: 2
}
}
}
}
]
If the top document in the output doesn't match with the output of the original query, then the original query is impacted.
REMEDIATION AND WORKAROUNDS
In some cases when the $group’s aggregation function specifies the opposite to the $sort’s ordering it is possible to fix by replacing $bottom by $top function and vice versa, and by fixing the sort order within the aggregation function, e.g. for the query below
{$sort: {a: 1, b: 1, c: 1}},
{$group: {
_id: "$a",
d: { $top: {
output: "$b",
sortBy: {b: -1, c: -1}}
}
}
}
The following workaround is possible:
{$sort: {a: 1, b: 1, c: 1}},
{$group: {
_id: "$a",
d: { $bottom: {
output: "$b",
sortBy: {b: 1, c: 1}}
}
}
}
The only available workaround in other instances is to either hide the index utilized by the $sort and $group stages or to rewrite the query to prevent that index from being used. Implementing this workaround, however, may significantly impact performance.
Original description
I have a collection with elements of the following structure:
{ device: "M57906", date: some date }
and one index { device: 1, date: 1 }
The query with $bottom does not return the correct result.
This only happens when the index exists.
h4s> db.events.aggregate([{ $sort: { device: 1, date: 1 } }, { $group: { _id: "$device", obj: { $bottom: { output: "$date", sortBy: { date: -1 } } } } }])
[ \{ _id: 'M57906', obj: ISODate('2025-08-28T09:46:33.017Z') } ]
h4s> db.events.aggregate([{ $sort: { device: 1, date: 1 } }, { $group: { _id: "$device", obj: { $bottomN: { n: 1, output: "$date", sortBy: { date: -1 } } } } }])
[ \{ _id: 'M57906', obj: [ ISODate('2025-07-06T00:00:01.305Z') ] } ]
winning plan for $bottom query:
isCached: false,
stage: 'PROJECTION_COVERED',
transformBy: \{ date: 1, device: 1, _id: 0 },
inputStage: {
stage: 'DISTINCT_SCAN',
keyPattern: \{ device: 1, date: 1 },
indexName: 'device_1_date_1',
isMultiKey: false,
multiKeyPaths: \{ device: [], date: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'backward',
indexBounds: {
device: [ '[MaxKey, MinKey]' ],
date: [ '[MaxKey, MinKey]' ]
}
}
winning plan for $bottomN: n: 1 query
{
isCached: false,
queryPlan: {
stage: 'GROUP',
planNodeId: 3,
inputStage: {
stage: 'PROJECTION_COVERED',
planNodeId: 2,
transformBy: \{ date: true, device: true, _id: false },
inputStage: {
stage: 'IXSCAN',
planNodeId: 1,
keyPattern: \{ device: 1, date: 1 },
indexName: 'device_1_date_1',
isMultiKey: false,
multiKeyPaths: \{ device: [], date: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
device: [ '[MinKey, MaxKey]' ],
date: [ '[MinKey, MaxKey]' ]
}
}
}
},
slotBasedPlan: {
slots: '$$RESULT=s9 env: \{ s4 = SortSpec({"date" : -1}) }',
stages: '[3] project [s9 = newBsonObj("_id", s6, "obj", s8)] \n' +
'[3] project [s8 = aggBottomNFinalize(s7, s4)] \n' +
'[3] group [s6] [s7 = aggBottomN(s4, null, getSortKeyDesc(s2), (s2 ?: null)) init\{[[], 0ll, 1ll, 0, 104857600, true]}] spillSlots[s5] mergingExprs[aggBottomNMerge(s5, s4)] \n' +
'[3] project [s6 = (s1 ?: null)] \n' +
'[1] ixseek KS(0A0A0104) KS(F0F0FE04) none s3 none none lowPriority [s1 = 0, s2 = 1] @"edc48567-cea3-4782-97fe-7173105114f2" @"device_1_date_1" true '
}
}
$top is also broken in the same way.
- is caused by
-
SERVER-84347 Use DISTINCT_SCAN with $top/$bottom accumulators
-
- Closed
-
- is related to
-
SERVER-90017 Parse firstN/lastN/minN/maxN/topN/bottomN with n = 1 into a corresponding first/last/min/max/top/bottom
-
- Closed
-
- related to
-
SERVER-85213 Rewrite $sort+$group with $first/$last to use $top/$bottom
-
- Backlog
-
-
SERVER-113145 Use distinct scan for $group+$top/$bottom and $sort when sort orders are compatible
-
- Closed
-