[SERVER-16388] Aggregation operation not using any fields should use an appropriate index instead of collection scan Created: 02/Dec/14 Updated: 25/Jun/15 Resolved: 08/Apr/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Anil Kumar | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Participants: | |||||||||||||
| Description |
|
This is not an issue but some basic optimization for aggregation on the collection when no fields are required from the document to complete the aggregation. The following operation is performing aggregation without using any field from the document, but still results in the full collection scan:
Ideally, in non-sharded environment, this should be traversing the _id index and on sharded environment it should be traversing the shard-key index. |
| Comments |
| Comment by Asya Kamsky [ 07/Mar/15 ] |
|
Aggregation is not generally going to be faster than count so if count has a performance problem, it seems unlikely that aggregate would perform better. |
| Comment by Christopher Price [ 07/Mar/15 ] |
|
Our application allows customers to search for documents in their database and we supply pagination information regarding the results. (Displaying records 1-100 out of 2147654) We have noticed performance issues in some count operations and we were recommended to switch to aggregation. |
| Comment by Asya Kamsky [ 07/Mar/15 ] |
|
You would need to describe your use case - it's not clear to me why you are using aggregation to do a count in your case. |
| Comment by Christopher Price [ 06/Mar/15 ] |
|
I will reiterate my earlier statement: "It would be preferable to have a consistent interface for this operation." |
| Comment by Asya Kamsky [ 06/Mar/15 ] |
|
A possible work-around until this is implemented would be to add {$sort:{_id:1}} as the first stage to the aggregation - that will use the index on _id. |
| Comment by Christopher Price [ 05/Mar/15 ] |
|
As recommended by Mongo we moved all of our count operations to use aggregation instead of simple count(). After doing this we discovered this outlier. So now we have to maintain code that first checks if there is any search_criteria, and if not then we use simple count() otherwise we use aggregation for counting. It would be preferable to have a consistent interface for this operation. |
| Comment by Asya Kamsky [ 03/Dec/14 ] |
|
Is this not just db.collection.count()? Why would it be done via aggregation? |