[GODRIVER-2194] `CountDocuments` slowly Created: 20/Oct/21  Updated: 27/Oct/23  Resolved: 28/Oct/21

Status: Closed
Project: Go Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: du liu Assignee: Benji Rewis (Inactive)
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File count_example.go    
Documentation Changes: Not Needed

 Description   

Using `CountDocuments` to get a count is slower than using the mongoshell. Judging by the MongoDB log, `CountDocuments` seems to use `aggregate` to compute the count.

The mongoshell takes 1660ms, while `CountDocuments` takes 2436ms. The filter matches 349326 documents, the collection contains 19492668 documents, and I have created an index.

Is there any way to improve the performance?



 Comments   
Comment by du liu [ 29/Oct/21 ]

mongodb server: 3.2.21

Comment by Benji Rewis (Inactive) [ 28/Oct/21 ]

Happy to help (sounds like my suggestions did not do much haha). I'm curious why upgrading to 4.4 made the count operation so much faster. I'll close this ticket as "gone away" for now, but out of curiosity, which MongoDB server version did you upgrade from?

Comment by du liu [ 28/Oct/21 ]

Thanks! I tried it yesterday, but it was only a little faster. Then I gave up and upgraded the MongoDB server to 4.4.5, which is so much faster!

Thank you!

Comment by Benji Rewis (Inactive) [ 27/Oct/21 ]

An excellent point; Cursor.All will certainly take a lot of memory locally, especially with the number of documents you have. If you'd like to maintain the performance of db.collection.find({filter}).count(), your best bet may be to manually run a count command using RunCommand. I've attached a file containing example code for how you might do that.

Note that this is exactly what the driver does when running EstimatedDocumentCount against server versions <= 4.9.0.

 

count_example.go

Comment by du liu [ 27/Oct/21 ]

It seems like `Cursor.All` will be slower and will use more memory...

Comment by du liu [ 27/Oct/21 ]

Thanks

Comment by Benji Rewis (Inactive) [ 26/Oct/21 ]

Gotcha; that makes sense.

There is unfortunately no one operation in the Go driver that is a perfect analogue of db.collection.find({filter}).count(). You could run something like

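// filter is your query filter, e.g. bson.D{{Key: "status", Value: "active"}}.
cursor, err := coll.Find(context.TODO(), filter)
if err != nil {
   panic(err)
}

// Cursor.All loads every matching document into memory and closes the cursor.
var docs []bson.D
err = cursor.All(context.TODO(), &docs)
if err != nil {
   panic(err)
}

count := len(docs)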

That code would probably have similar performance to db.collection.find({filter}).count(). db.collection.countDocuments (and its analogue in the Go driver CountDocuments) will return a more accurate count (even after an unclean shutdown or in the presence of orphaned documents on sharded clusters).

Adding an operation in the Go driver equivalent to db.collection.find({filter}).count() (or modifying CountDocuments to use the code above) would require a cross-drivers change. Drivers already agreed to deprecate our version of db.collection.count(), so that change seems unlikely to me. That said, I'm happy to bring it up with the drivers department as a whole.

Comment by du liu [ 26/Oct/21 ]

Thanks for your reply. I want to count by a query filter, so I cannot use EstimatedDocumentCount.

In the mongoshell, my query looks like this: db.collection.find({my filter}).count()

Comment by Benji Rewis (Inactive) [ 25/Oct/21 ]

Hello mrliuxiansen8023@gmail.com! Thanks for your improvement suggestion.

CountDocuments does indeed run an aggregate under the hood. The mongoshell's db.collection.countDocuments actually runs the exact same aggregate, so there really shouldn't be a significant difference in performance.

Which mongoshell operation are you running? db.collection.count is a slightly different mongoshell method with better performance that only returns an approximate count. If you only need an approximate count you can always use EstimatedDocumentCount in the Go driver. EstimatedDocumentCount should have much better performance, as it does not perform an entire collection scan.
