[SERVER-4656] aggregation: optimize sort/limit combination to only sort top-n items Created: 10/Jan/12 Updated: 07/Apr/23 Resolved: 05/Dec/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | 2.3.2 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Daniel Pasette (Inactive) | Assignee: | Mathias Stearn |
| Resolution: | Done | Votes: | 4 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
If a limit follows a sort, then we can optimize the sort by only retaining the top-n items required to satisfy the request. This may use one of http://www.cplusplus.com/reference/algorithm/partial_sort/ or |
| Comments |
| Comment by auto [ 04/Dec/12 ] |
|
Author: {u'date': u'2012-12-04T15:13:54Z', u'email': u'mathias@10gen.com', u'name': u'Mathias Stearn'}Message: |
| Comment by Eystein Stenberg [ 01/May/12 ] |
|
This looks very promising! , ... ] it should significantly increase the speed. I did a test with 100 collection-objects with 2000 comments each. I want to get the 10 most voted for comments like this: db.coll.aggregate({$project : {comments : 1}}, {$unwind : "$comments"}, {$sort : { "comments.votes" : -1}}, {$limit : 10} ) This takes 5 seconds to complete, and significantly slows down with the input size. You suggestion should make this very fast. It should reduce the input to the final sort from 200 000 to just 1000 in my example (99.5% reduction). Thank you! |