[SERVER-57518] 16% performance loss switching from Count to CountDocuments Created: 08/Jun/21 Updated: 31/Oct/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.5, 4.0.25 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jochem Bonarius | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 4 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: | Run in shell (Robo 3T 1.2.1 doesnt know countDocuments, so we wrote it out fully)
|
||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | QO 2022-04-04, QO 2022-04-18, QO 2022-05-02, QO 2022-05-16, QO 2022-05-30, QO 2022-06-13, QO 2022-06-27, QO 2022-07-11, QO 2022-07-25, QO 2022-08-08, QO 2022-08-22, QO 2022-09-05, QO 2022-09-19, QO 2022-10-03, QE 2022-10-17 | ||||||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
Dear Server team, I'm a C# dev on a team using MongoDb and I have the task to clean up the compile warnings. One warning was "Count is obsolete. Use CountDocuments instead" But at every occurrence in the code, at all the lines have the comment "Don't use CountDocuments, it is slower", referencing Jira I did my own tests and indeed found CountDocuments to be about 34% slower then Count (~150 ms longer). That is quite significant and thus a problem for us, as we are already having performance issues. First thinking this was a C# driver issue, I opened a ticket Please read Thank you. |
| Comments |
| Comment by Ewan Higgs [ 19/Jan/23 ] |
|
It's been a big year and MongoDB 6.0 was released which had planned to restore the count command. As this ticket moves from being groomed into sprints to placed in the backlog, may I ask what the current state of count is, countDocuments, estimatedDocumentCount in MongoDB 6.0? Thanks! |
| Comment by Maddie Zechar [ 27/Sep/22 ] |
|
I've continued working on this ticket. When the agg tree is empty (the plan only exists in plan stages like suggested solution), an aggregation request will still invoke runAggregate() which will invoke plan_executor_impl:getNext() (which then calls getNextDocument).The find version of count will also invoke plan_executor_impl:getNextDocument but from executeCount(). They function roughly the same, getNextDocument() will be called in a while loop while the plan executor returns ADVANCED state. After it exits the while loop, the executeCount() retrieves the count from root->getSpecificStats()->nCounted. We can access the same root stats for agg count (via nCounted) and get the correct amount. However, it seems it would make more sense to alter exec/count so that it returns a functional WorkingSetMember and thus not having to alter the logic in getNext (as opposed to adding logic to determine if an executor is running an agg count from inside getNextImpl) |
| Comment by James Wahlin [ 06/Apr/22 ] |
|
ian.boros - thanks for pointing me to this. I tested and we are using this optimization, setting `CursorType::kEmptyDocuments` on DocumentSourceCursor. |
| Comment by Ian Boros [ 05/Apr/22 ] |
|
One other thing to note is that we do already have a special code path for "count like" operations in the DS layer, where we avoid the conversion from BSON -> Document. See here. It might be worth investigating whether this path is used for the queries that regressed. |
| Comment by James Wahlin [ 31/Mar/22 ] |
|
I spent some time investigating this and have an idea on where the aggregate command count falls short vs the count command. The answer is slightly different for an indexed count vs collection scan, so I will present both here. These findings are based on a similar reproduction to the on posted above. It is worth noting that on current master the difference I observed was roughly 16% rather than the 34% reported in this ticket. Indexed Count When running a count command with filter with a backing index, we create a PlanStage tree that consists of a COUNT_SCAN stage with a COUNT stage as parent. This has proven to be the fastest way to generate a count and the baseline we are working against. When running an aggregate command with filter and count with a backing index, we also create a COUNT_SCAN stage to access the data. Where we differ is that we do not push down the accumulator to the PlanStage level and instead maintain a $group/$project in aggregation. This results in transferring data between the PlanStage and aggregation layer (so BSONObj => Document) and at least 1 additional level of recursion, incurring additional cost. Collection Scan Count When running a count command with filter using a collection scan, we build a PlanStage tree with a COLLSCAN stage and a COUNT stage as parent. This is the fastest way to generate a filtered count using a collection scan. When running an aggregate command with filter and count using a collection scan, we build a PlanStage tree with only a COLLSCAN stage, and then handle count generation via $group/$project in the aggregation layer. Similar to the indexed case, the main cost here is the penalty we may for moving from the PlanStage to aggregation layer as well as heavier weight accumulator/project mechanism. Fix Proposal To address the discrepancy in cost using our classic engine I would propose adding optimizations that would allow an aggregation pipeline prefix such as we see here to be pushed down to the PlanStage layer and execute in an identical manner to the count command. |
| Comment by Ewan Higgs [ 10/Dec/21 ] |
|
@james.wahlin, thanks for restating the issues for clarity and pointing me to |
| Comment by James Wahlin [ 06/Dec/21 ] |
|
ewan.higgs@deliverect.com there are a few things worth noting here: First, is that MongoDB can provide an O(1) fast estimated collection count. This count is a statistic maintained as part of collection metadata. Running count({}) (note the empty query filter) or estimatedCount() will return this statistic and will be a fast operation regardless of collection size. This is only available when asking for the count of all documents in the collection. The countDocuments() helper will perform a query to determine the count, regardless of the query filter provided. This is an O( n ) operation and returns an exact count rather than an estimate. What count() provides that the combination of estimatedCount() and countDocuments() doesn't replace is a mechanism to switch between the fast/estimated count and the slower/exact count depending on query filter. Second, when comparing apples to apples for the slower/exact count, the count() helper is roughly 30% faster than the countDocuments() helper. This is the issue we are looking to address under this ticket. Third, the count() helper when run against a non-materialized view will fallback to the slower/exact count against the underlying pipeline and collection. The estimatedCount() helper has no such mechanism and fails when run against a view.
This ticket looks to address the second issue, which is to bring the countDocuments() helper performance in line with the count() helper when run with a non-empty query predicate. For the first and third issues we are looking to address under |
| Comment by Ewan Higgs [ 05/Dec/21 ] |
|
From what we are seeing, the discrepency between count and countDocuments on large collections isn't 34% but thousands of %. I have 2.7m documents in a collection. count({}) takes 2ms. estimatedDocumentCount({}) takes 2ms. countDocuments({}, {hint: "id"}) takes 1475ms. @Jochen.Bonarius' aggregation and/or countDocuments({}) (without a hint) takes between 5185ms and 16000ms. countDocuments is not really usable; and while estimatedDocumentCount is quick, it doesn't work with queries, making it not useful. With the release of pymongo 4.0 removing count, fixing this should be a high priority. |
| Comment by Kyle Suarez [ 25/Jun/21 ] |
|
I'm sorry for the troubles you've been having switching over from count to countDocuments. As I'm sure has been previously mentioned, the drivers have deprecated count due to various issues with the command, including accuracy in certain situations, but I acknowledge that the performance decrease when switching to the accurate count is not a great user experience. I'm adding this ticket to the Query Execution queue and we will schedule this performance improvement work along with our other coming projects. Please continue to watch this ticket for further updates. In terms of immediately addressing the deprecation warning issues, I would recommend you try the workarounds suggested in Regards, |
| Comment by Eric Sedor [ 17/Jun/21 ] |
|
jochem@netwinst.nl does the syntax firstname.lastname work for you? |
| Comment by Jochem Bonarius [ 17/Jun/21 ] |
|
@Eric Sedor Thank you, please let me know what I/we can do to assist. p.s. can someone tell me how to tag a user on Jira? The @ trick doesn't seem to work. |
| Comment by Eric Sedor [ 16/Jun/21 ] |
|
Thanks jochem@netwinst.nl, mechanically, it does seem like count() with a query predicate should be similar to countDocuments, but I'll pass this to an appropriate team to consider in detail. |