[SERVER-11447] aggregation can sort using index to speed up group of an indexed field Created: 29/Oct/13 Updated: 12/Mar/17 Resolved: 08/Apr/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 2.4.7, 2.5.3 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Asya Kamsky | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 6 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Description |
|
If "$group" is grouping on an indexed field F, and none of the functions depend on the rest of the document (such as $sum:1, aka count), a huge performance improvement can be made by adding '{$sort:{F:1}}' before the '{$group}'.

Tested on a large collection (TPCH orders denormalized with lineitems inside) of about 1.5 million documents, aggregating by order date (2600 different dates), all after warming the data first:

Without sort: 18-19 seconds

On really small datasets I still see at least a 25%-33% improvement with $sort, so if we can do that "automatically" that would help performance. |
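A minimal sketch of the manual workaround described above, written as plain JavaScript so it can be pasted into the shell. The helper name `withLeadingSort` and the field name `orderdate` are assumptions for illustration and are not part of the server. It only handles the simple case where the pipeline starts with a `$group` whose `_id` is a single field path.

```javascript
// Hypothetical helper (not a MongoDB API): prepend a $sort on the grouped
// field when the pipeline starts with a $group keyed on a single field path.
function withLeadingSort(pipeline) {
  const first = pipeline[0];
  const key = first && first.$group && first.$group._id;
  // Only handle the simple case: _id is a field path such as "$orderdate".
  // Anything else (null, sub-documents, "$$" variables) is left untouched.
  if (typeof key !== "string" || !key.startsWith("$") || key.startsWith("$$")) {
    return pipeline;
  }
  const field = key.slice(1);
  // Ascending sort on the grouped field, as suggested in the description.
  return [{ $sort: { [field]: 1 } }, ...pipeline];
}

// Example: count documents per value of a field assumed to be indexed.
const original = [{ $group: { _id: "$orderdate", count: { $sum: 1 } } }];
const rewritten = withLeadingSort(original);
console.log(JSON.stringify(rewritten));
```

The rewritten array can then be passed to `aggregate()` on the collection in question. Whether it actually helps depends on an index existing on the grouped field, as the description assumes.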
| Comments |
| Comment by Dan Doyle [ 15/Jan/15 ] |
|
This is a very important issue for our use case as well. We currently do an aggregate to determine a total number of possible unique results and this issue accounts for most of the runtime of any data fetch. |
| Comment by Sylvain Zimmer [ 03/Jan/15 ] |
|
Any news on this issue? Thanks! |