[SERVER-234] Aggregation is very slow compared to full scans Created: 14/Aug/09 Updated: 10/Sep/09 Resolved: 14/Aug/09 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Performance |
| Affects Version/s: | 0.9.7 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | John Carrino | Assignee: | Eliot Horowitz (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Participants: |
| Description |
|
Filtering by a field that isn't indexed is extremely fast, but doing a sum over that same field with db.group is very slow. I'm sure this is because aggregation takes a function and server-side JavaScript eval is very slow, but would it be possible to have an aggregation facility with performance similar to full scans, which are very fast? The query I am using here is: , initial: { csum: 0 }}) Sum isn't the only interesting aggregation to make fast. The most useful case here would be to have a few buckets and, as we visit each item, place one of its fields in a bucket to collect a histogram of stats. An example of this would be counting Jira bug status codes in one aggregate: { blocking: 5, major: 3 ... }. It just seems to me that if a non-indexed find is fast, then the aggregate case should have similar perf. |
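To make the histogram case concrete, here is a minimal sketch of the kind of reduce function the description has in mind, written in the (obj, prev) shape that db.group passes to its reduce argument. The names statusReduce, runGroup, and the status field are hypothetical, and runGroup is only a plain-JavaScript stand-in for the server-side loop so the sketch runs without a database; it is not how the server implements grouping.

```javascript
// Hypothetical reduce function: counts documents per value of a "status" field.
// db.group calls reduce(obj, prev) once per document; prev is the running aggregate.
function statusReduce(obj, prev) {
  var key = obj.status;
  prev[key] = (prev[key] || 0) + 1;
}

// Stand-in for the server-side loop (an assumption for illustration only):
// copy the initial aggregate, then fold every document into it.
function runGroup(docs, reduce, initial) {
  var acc = JSON.parse(JSON.stringify(initial));
  docs.forEach(function (doc) {
    reduce(doc, acc);
  });
  return acc;
}

// Made-up sample documents, not data from this ticket.
var bugs = [
  { status: "blocking" },
  { status: "major" },
  { status: "blocking" },
  { status: "minor" }
];

var histogram = runGroup(bugs, statusReduce, {});
console.log(histogram);
```

The slowness reported above would come from invoking a function like statusReduce once per document through the JavaScript engine, whereas a non-indexed find evaluates its filter natively.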
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 10/Sep/09 ] |
|
closed b/c resolved more than 2 weeks ago |
| Comment by Eliot Horowitz (Inactive) [ 14/Aug/09 ] |
|
duplicate of there are some issues which we will fix to make this fast. |
| Comment by John Carrino [ 14/Aug/09 ] |
|
The DB console output from running the above command is attached. |