Details
-
Improvement
-
Resolution: Duplicate
-
Major - P3
-
None
-
0.9.7
-
None
Description
Filtering by a field that isn't indexed is extremely fast; however, doing a sum over that same field is very slow when using db.group. I suspect this is because the aggregation takes a function and server-side JavaScript evaluation is very slow. Would it be possible to have an aggregation facility with performance similar to a full scan, which is very fast?
The query I am using here is:
db.group({ns: "testCollection", key: {}, reduce: function(obj, prev), initial: { csum: 0 }})
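A minimal sketch of a summing reduce of the shape used in the query above, assuming a hypothetical numeric field named `amount` (the ticket does not name the field). `emulateGroup` is a made-up in-memory stand-in for db.group so the snippet runs outside the mongo shell; it only mirrors the key/reduce/initial call shape and is not how the server implements grouping.

```javascript
// Hypothetical in-memory stand-in for db.group, for illustration only.
// It mirrors the call shape used above: key, reduce, initial.
function emulateGroup(docs, spec) {
  const groups = new Map();
  const keyFields = Object.keys(spec.key);
  for (const doc of docs) {
    // Build the group key from the requested key fields ({} means one group).
    const keyObj = {};
    for (const f of keyFields) keyObj[f] = doc[f];
    const id = JSON.stringify(keyObj);
    if (!groups.has(id)) {
      // Each group starts from its own shallow copy of `initial`.
      groups.set(id, Object.assign({}, keyObj, spec.initial));
    }
    // reduce mutates the running aggregate `prev` in place.
    spec.reduce(doc, groups.get(id));
  }
  return Array.from(groups.values());
}

// Assumed sample documents; `amount` is a hypothetical field name.
const docs = [{ amount: 2 }, { amount: 3 }, { amount: 5 }];

const result = emulateGroup(docs, {
  key: {},
  reduce: function (obj, prev) { prev.csum += obj.amount; },
  initial: { csum: 0 },
});

console.log(result); // [ { csum: 10 } ]
```

The point of the sketch is that the reduce function runs once per document, so the per-call cost of the JavaScript evaluation is paid for every item scanned.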
Sum isn't the only interesting aggregation to make fast. The most useful case here would be to have a few buckets and, as we visit each item, place it in a bucket based on one of its fields to collect a histogram of stats. An example of this would be to count Jira bug status codes in one aggregate:
{ blocking: 5, major: 3 ... }.
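The bucket idea can be sketched with the same reduce shape. The `status` field and the sample issues below are assumptions for illustration, and the plain loop stands in for the server visiting each document once.

```javascript
// Assumed sample documents; `status` is a hypothetical field name.
const issues = [
  { status: "blocking" },
  { status: "major" },
  { status: "blocking" },
  { status: "minor" },
];

// Reduce in the db.group style: bump the bucket for this document's status.
function countStatus(obj, prev) {
  prev[obj.status] = (prev[obj.status] || 0) + 1;
}

// One pass over the documents, starting from an empty histogram.
const histogram = {};
for (const issue of issues) {
  countStatus(issue, histogram);
}

console.log(histogram); // { blocking: 2, major: 1, minor: 1 }
```

Passed to db.group, a function like `countStatus` would be the reduce argument and `{}` the initial value, so the whole histogram comes back as one aggregate.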
It just seems to me that if a non-indexed find is fast, then the aggregate case should have similar performance.