[SERVER-234] Aggregation is very slow compared to full scans Created: 14/Aug/09  Updated: 10/Sep/09  Resolved: 14/Aug/09

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 0.9.7
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: John Carrino
Assignee: Eliot Horowitz (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongo_log.txt    
Participants:

 Description   

Filtering by a field that isn't indexed is extremely fast; however, doing a sum over that same field is very slow when using db.group. I'm sure it has to do with the fact that aggregation takes a function and the server-side JavaScript eval is very slow, but would it be possible to have an aggregation facility with performance similar to full scans, which are very fast?

The query I am using here is:
db.group({
    ns: "testCollection",
    key: {},
    reduce: function(obj, prev) { prev.csum += obj.field1; },
    initial: { csum: 0 }
})

Sum isn't the only interesting aggregation to make fast. The most useful case here would be to have a few buckets and, as we visit each item, place one of its fields in a bucket to collect a histogram of stats. An example of this would be to count Jira bug status codes in one aggregate: { blocking: 5, major: 3 ... }.
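To make the histogram case concrete, here is a minimal sketch in plain JavaScript, run in memory rather than against the server. The sample documents, the status field name, and the groupAll helper are all hypothetical; only the (obj, prev) reduce shape is taken from the query above.

```javascript
// Hypothetical sample documents standing in for a collection of bug reports.
const docs = [
  { status: "blocking" },
  { status: "major" },
  { status: "blocking" },
  { status: "minor" },
  { status: "major" },
  { status: "blocking" }
];

// Reduce function in the same (obj, prev) shape as the db.group query:
// each visited document increments the bucket named by its status field.
function countByStatus(obj, prev) {
  prev[obj.status] = (prev[obj.status] || 0) + 1;
}

// Hypothetical in-memory stand-in for a grouping pass with an empty key:
// one full scan, one accumulator, the reduce function called once per item.
function groupAll(items, reduceFn, initial) {
  const acc = Object.assign({}, initial); // copy so the caller's object is untouched
  for (const item of items) {
    reduceFn(item, acc);
  }
  return acc;
}

const histogram = groupAll(docs, countByStatus, {});
console.log(histogram);
```

The point of the sketch is that the per-item work is a single property increment, so in principle the cost should be dominated by the scan itself, which is the behavior this ticket asks for.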

It just seems to me that if a non-indexed find is fast, then the aggregate case should have similar perf.



 Comments   
Comment by Eliot Horowitz (Inactive) [ 10/Sep/09 ]

closed b/c resolved more than 2 weeks ago

Comment by Eliot Horowitz (Inactive) [ 14/Aug/09 ]

duplicate of SERVER-189

there are some issues which we will fix to make this fast.

Comment by John Carrino [ 14/Aug/09 ]

The DB console output from running the above command is attached.
