[SERVER-34453] aggregation $count underperforms count() Created: 13/Apr/18  Updated: 27/Oct/23  Resolved: 25/May/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.7.3
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: tony kerz Assignee: William Byrne III
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

I have a collection with 10 million rows.

This returns in about 10 ms:

db.getCollection('people').count()

while this takes about 10 s:

db.getCollection('people').aggregate([{$count: 'count'}])

This doesn't seem to be related to https://jira.mongodb.org/browse/SERVER-7568, but it is similar: a marked performance difference between similar operations in aggregation and non-aggregation styles.



 Comments   
Comment by William Byrne III [ 01/May/18 ]

Hi Tony,

The reason for this performance difference is that a count() without a predicate doesn't fetch the matching documents and count them; it uses the collection's metadata to return the number of documents that an equivalent find() query would match. In contrast, the $count aggregation stage iterates through the documents passed to it from the prior stage (or from the collection/index scan if it is the first stage) and counts them.

The equivalent non-aggregation command for $count is actually itcount(), as it also iterates through the matching documents. You will find that $count and itcount() have similar performance.
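
As a rough way to see this for yourself, the sketch below times the three counting styles against the same collection. The helper name compareCounts is hypothetical (it is not part of the shell), the timings use wall-clock Date.now() and are only indicative, and it assumes a shell where count(), find().itcount() and aggregate().toArray() are all available.

```javascript
// Hypothetical helper, not a built-in: runs each counting style once and
// records the result and elapsed wall-clock time in milliseconds.
function compareCounts(coll) {
  const timed = (label, fn) => {
    const start = Date.now();
    const n = fn();
    return { label: label, n: n, ms: Date.now() - start };
  };
  return [
    // Answered from collection metadata when there is no predicate.
    timed("count()", () => coll.count()),
    // Iterates the cursor and counts the documents it returns.
    timed("find().itcount()", () => coll.find().itcount()),
    // Iterates the pipeline input; $count emits no document for empty input.
    timed("aggregate $count", () => {
      const docs = coll.aggregate([{ $count: "count" }]).toArray();
      return docs.length ? docs[0].count : 0;
    }),
  ];
}
```

In the shell this could be called as printjson(compareCounts(db.getCollection('people'))). On a large collection the last two entries would be expected to show similar timings, both well above the first.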

Note also that while just reading the collection metadata, as count() does, is faster, it can return inaccurate counts on sharded clusters if there are orphaned documents or ongoing chunk migrations. Both $count and itcount() queries pull the documents to be counted through the SHARDING_FILTER stage, which eliminates duplicated documents from failed and ongoing migrations, so their results are accurate.
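
To make the orphan effect concrete, here is a toy model in plain JavaScript. It is not MongoDB code and does not reflect the server's internals: the shard objects, the key ranges, and the function names (makeShard, owns, fastCount, filteredCount) are all invented for illustration. Each shard stores some documents and a list of key ranges it owns; a document whose key falls outside those ranges stands in for an orphan left behind by a migration.

```javascript
// Toy model only. A "shard" is its owned [lo, hi) key ranges, the documents
// physically stored on it, and a metadata count of those stored documents.
function makeShard(ownedRanges, docs) {
  return { ownedRanges: ownedRanges, docs: docs, metadataCount: docs.length };
}

// True when the shard owns the chunk range containing this key.
function owns(shard, key) {
  return shard.ownedRanges.some(([lo, hi]) => key >= lo && key < hi);
}

// Metadata-style count: sums the stored-document counts, orphans included.
function fastCount(shards) {
  return shards.reduce((sum, shard) => sum + shard.metadataCount, 0);
}

// Iterating count with an ownership filter: skips documents the shard
// stores but does not own.
function filteredCount(shards) {
  let n = 0;
  for (const shard of shards) {
    for (const doc of shard.docs) {
      if (owns(shard, doc.key)) n++;
    }
  }
  return n;
}

// Key 70 was migrated to shardB, but an orphaned copy remains on shardA.
const shardA = makeShard([[0, 50]], [{ key: 10 }, { key: 40 }, { key: 70 }]);
const shardB = makeShard([[50, 100]], [{ key: 70 }, { key: 90 }]);

console.log(fastCount([shardA, shardB]));     // 5, the orphan is counted twice
console.log(filteredCount([shardA, shardB])); // 4, the orphan is filtered out
```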

Regards,

William Byrne III
