-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Major - P3
-
Affects Version/s: 2.4.7, 2.5.3
-
Component/s: Aggregation Framework
-
Fully Compatible
If a "$group" stage groups on an indexed field F, and none of its accumulator functions depend on the rest of the document (such as $sum:1, i.e. a count), performance can be improved substantially by adding '{$sort:{F:1}}' before the '{$group}'.
Tested on a large collection (TPCH orders denormalized with lineitems embedded) of about 1.5 million documents, aggregating by order date (2600 distinct dates), with the data warmed first:
Without sort: 18-19 seconds
With sort: 2.5-2.6 seconds
Even on really small datasets I still see an improvement of at least 25%-33% with $sort, so if the server could add the sort "automatically", that would help performance.
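To make the proposed rewrite concrete, here is a minimal sketch in Python that treats a pipeline as a plain list of stage documents and inserts the extra $sort ahead of the first $group. The helper name `add_sort_before_group` and the field name `orderdate` are hypothetical and not taken from this ticket. The sketch does not check whether F is indexed or whether the accumulators are independent of the rest of the document, so both conditions are assumed to hold.

```python
from typing import Any

Pipeline = list[dict[str, Any]]


def add_sort_before_group(pipeline: Pipeline) -> Pipeline:
    """Return a copy of `pipeline` with {"$sort": {F: 1}} inserted before the
    first $group stage, when that stage groups on a single field path "$F".

    Assumption (not verified here): F is indexed and the accumulators do not
    depend on the rest of the document.
    """
    result = list(pipeline)  # shallow copy; the caller's list is not modified
    for i, stage in enumerate(result):
        if "$group" not in stage:
            continue
        key = stage["$group"].get("_id")
        # Only handle the simple case of a single field path such as "$orderdate".
        if isinstance(key, str) and key.startswith("$") and not key.startswith("$$"):
            sort_stage = {"$sort": {key[1:]: 1}}
            # Skip the insert if the same sort already precedes the $group.
            if i == 0 or result[i - 1] != sort_stage:
                result.insert(i, sort_stage)
        break  # only the first $group stage is considered
    return result


# Hypothetical field name: count documents per order date.
plain = [{"$group": {"_id": "$orderdate", "count": {"$sum": 1}}}]
sorted_first = add_sort_before_group(plain)
```

Both `plain` and `sorted_first` could then be passed to the driver's aggregate call to compare timings on the same collection, as in the measurements above.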
- duplicates
-
SERVER-4507 aggregation: optimize $group to take advantage of sorted sequences
- Backlog
- is duplicated by
-
SERVER-14303 Allow aggregation $group operator to use an index
- Closed
-
SERVER-15291 slow '$group' performance
- Closed