[GODRIVER-2194] `CountDocuments` slowly Created: 20/Oct/21 Updated: 27/Oct/23 Resolved: 28/Oct/21 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | du liu | Assignee: | Benji Rewis (Inactive) |
| Resolution: | Gone away | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Documentation Changes: | Not Needed |
| Description |
|
Using `CountDocuments` to get a count is slower than using the mongo shell. According to the MongoDB log, `CountDocuments` uses `aggregate` to compute the count. The mongo shell takes 1660 ms; `CountDocuments` takes 2436 ms. The filter matches 349326 documents, the collection holds 19492668 documents, and I have created an index. Is there any way to improve the performance? |
| Comments |
| Comment by du liu [ 29/Oct/21 ] | ||||||||||||
|
MongoDB server: 3.2.21 | ||||||||||||
| Comment by Benji Rewis (Inactive) [ 28/Oct/21 ] | ||||||||||||
|
Happy to help (sounds like my suggestions did not do much haha). I'm curious why upgrading to 4.4 made the count operation so much faster. I'll close this ticket as "gone away" for now, but out of curiosity, which MongoDB server version did you upgrade from? | ||||||||||||
| Comment by du liu [ 28/Oct/21 ] | ||||||||||||
|
Thanks! I tried it yesterday. (It was only a little faster.) Thank you! | ||||||||||||
| Comment by Benji Rewis (Inactive) [ 27/Oct/21 ] | ||||||||||||
|
An excellent point; Cursor.All will certainly take a lot of memory locally, especially with the number of documents you have. If you'd like to maintain the performance of db.collection.find({filter}).count(), your best bet may be to manually run a count command using RunCommand. I've attached a file containing example code for how you might do that. Note that this is exactly what the driver does when running EstimatedDocumentCount against server versions <= 4.9.0.
| ||||||||||||
| Comment by du liu [ 27/Oct/21 ] | ||||||||||||
|
It seems like `Cursor.All` will be slower and will use more memory... | ||||||||||||
| Comment by du liu [ 27/Oct/21 ] | ||||||||||||
|
Thanks | ||||||||||||
| Comment by Benji Rewis (Inactive) [ 26/Oct/21 ] | ||||||||||||
|
Gotcha; that makes sense. There is unfortunately no one operation in the Go driver that is a perfect analogue of db.collection.find({filter}).count(). You could run something like
That code would probably have similar performance to db.collection.find({filter}).count(). db.collection.countDocuments (and its analogue in the Go driver, CountDocuments) will return a more accurate count (even after an unclean shutdown or in the presence of orphaned documents on sharded clusters). Adding an operation in the Go driver equivalent to db.collection.find({filter}).count() (or modifying CountDocuments to use the code above) would require a cross-drivers change. Drivers already agreed to deprecate our version of db.collection.count(), so that change seems unlikely to me. That said, I'm happy to bring it up with the drivers department as a whole. | ||||||||||||
| Comment by du liu [ 26/Oct/21 ] | ||||||||||||
|
Thanks for your reply. I want to count by a query filter, so I cannot use EstimatedDocumentCount. In the mongo shell, my query looks like this: db.collection.find({my filter}).count() | ||||||||||||
| Comment by Benji Rewis (Inactive) [ 25/Oct/21 ] | ||||||||||||
|
Hello mrliuxiansen8023@gmail.com! Thanks for your improvement suggestion. CountDocuments does indeed run an aggregate under the hood. The mongoshell's db.collection.countDocuments actually runs the exact same aggregate, so there really shouldn't be a significant difference in performance. Which mongoshell operation are you running? db.collection.count is a slightly different mongoshell method with better performance that only returns an approximate count. If you only need an approximate count you can always use EstimatedDocumentCount in the Go driver. EstimatedDocumentCount should have much better performance, as it does not perform an entire collection scan. |