[SERVER-50857] Improve count() performance in sharded clusters Created: 10/Sep/20 Updated: 02/Nov/20 Resolved: 02/Nov/20
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying, Sharding |
| Affects Version/s: | 4.4.0, 4.4.1 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Henri-Maxime Ducoulombier | Assignee: | Ian Boros |
| Resolution: | Won't Fix | Votes: | 2 |
| Labels: | count, performance, qexec-team, sharded |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | MongoDB community 4.4 on AWS EC2. |
| Attachments: | |
| Issue Links: | |
| Sprint: | Query 2020-10-05, Query 2020-10-19, Query 2020-11-02 |
| Participants: | |
| Description |
|
There is a huge performance regression when doing a simple count on an array of strings in 4.4 (also in 4.4.1). I don't know if this is a 4.4 issue or if it appeared earlier, because we are in the process of migrating from 3.4 to 4.4. Still, here is the complete description of the issue. We have a "test" collection of 2M documents, almost all of which have a field "labels" like this (among a lot of other fields; these are quite large documents):
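A minimal sketch of such a document (only the "labels" field and the "contact_id" shard key names come from this ticket; the values and the other field names are illustrative):

```
{
  _id: ObjectId("5f59e1c2a4b0c12345678901"),
  contact_id: "c-0042",                          // shard key, per the comments below
  labels: ["newsletter", "customer", "opt-in"],  // array of strings, indexed
  // ... many other, larger fields omitted
}
```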
Running a "distinct" is very fast and seems to be using the index, whichever the test environment is (3.4, 4.4, sharded or not).
But when it comes to counting.... Mongod 3.4, unsharded collection:
Mongod 3.4, sharded collection:
Mongod 4.4.1, sharded collection:
Mongod 4.4.1, unsharded collection:
I am attaching the complete explain outputs so you can see the behavior in the different environments, but as you can see, in the 4.4 sharded collection each shard does an IXSCAN (60ms), then a FETCH (10s), then a SHARDING_FILTER, etc., and what takes time is the fetching and the later stages. It does not take advantage of the index the way the single-shard / unsharded version does. This has a HUGE impact on performance, and doing that with an aggregate is slow as well (because of the $unwind then $group pattern). In our application, because the aggregate is slow for that operation (36s), it is WAY faster in 3.4 to use distinct and then count each value (it takes less than a second even in the sharded environment with the 2M documents and 120 different values). This also happens with non-multikey fields / indexes (as seen in the attached files). |
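A sketch of the two application-side approaches mentioned above, assuming the same "test" collection and "labels" field (the exact pipeline and queries are illustrative, not copied from the application):

```
// slow on 4.4 sharded: count every label value in one aggregation
db.test.aggregate([
  { $unwind: "$labels" },
  { $group: { _id: "$labels", n: { $sum: 1 } } }
])

// fast on 3.4: list the distinct values, then count each one
db.test.distinct("labels").forEach(function (label) {
  print(label, db.test.count({ labels: label }))
})
```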
| Comments |
| Comment by Ian Boros [ 02/Nov/20 ] |
|
Looks like I missed your response. I'm glad this worked! Closing as "Won't Fix." |
| Comment by Henri-Maxime Ducoulombier [ 22/Oct/20 ] |
|
Hi Ian Boros, Thank you for your answer. I had a good rtfm session on the readConcern setting (which I wasn't aware of yet), tested the workaround immediately, and it worked! I reproduced the application behavior, listing and then counting the distinct values on a 5-node sharded cluster, and the run time goes down from 35 to about 6 seconds. Regarding the orphans, this is not an issue in our use case because of our balancing window, presplitting, and hashed shard index strategies, but we have to keep it in mind when using the "available" read concern. Here is a sample test script, just in case.
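A sketch along these lines (an assumed reconstruction, not the original script): list the distinct label values, then count each one, both with readConcern "available" so the shard filter is skipped.

```
// Assumed reconstruction. readConcern "available" skips the shard filter,
// so orphaned documents may be counted.
var labels = db.runCommand({
  distinct: "test",
  key: "labels",
  readConcern: { level: "available" }
}).values

labels.forEach(function (label) {
  var res = db.runCommand({
    count: "test",
    query: { labels: label },
    readConcern: { level: "available" }
  })
  print(label, res.n)
})
```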
I believe this ticket can be closed; the workaround is a fine solution. |
| Comment by Ian Boros [ 21/Oct/20 ] |
|
Hi hmducoulombier@marketing1by1.com, I believe you can work around this issue by using readConcern "available". Under this readConcern, the shard filtering stage will not be added, so the query should use the COUNT_SCAN plan. I must emphasize that without shard filtering, the results you get from the server may be inaccurate. At the end of this comment I am providing some example code showing how to do this. Unfortunately, after some discussion with other members of the team, we see no way to get the same performance as 3.4 while also providing accurate results. With that in mind, I intend to close this ticket as "Won't Fix." I'll leave it open for now in case you have follow-up questions. Here's the example code. The explain command does not support readConcern "available", so to get the complete plan information I had to turn on profiling and read it out of system.profile. I'm looking into this more.
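A sketch of that idea (an assumed reconstruction, not the original snippet; the query value is illustrative):

```
// Issue the count command directly so a readConcern can be attached
// (run through mongos):
db.runCommand({
  count: "test",
  query: { labels: "newsletter" },
  readConcern: { level: "available" }
})

// explain does not accept readConcern "available", so enable profiling on the
// shard's mongod, rerun the count, and read the plan back from system.profile
// (the planSummary field shows whether COUNT_SCAN was used):
db.setProfilingLevel(2)
db.system.profile.find({ "command.count": "test" }).sort({ ts: -1 }).limit(1).pretty()
```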
And the plan used:
Also, to answer your question:
Yes, indexes do include orphan documents. |
| Comment by Henri-Maxime Ducoulombier [ 16/Oct/20 ] |
|
In my case, the predicate provided does not include the shard key. In the examples, the "labels" field is indexed but the shard key is "contact_id", and the index on labels is only on labels, not a compound index of labels + contact_id, meaning the fix mentioned in the linked issue would not apply here. Following your comment, this raises a question in my mind: do indexes include orphan documents or don't they? |
| Comment by Ian Boros [ 29/Sep/20 ] |
|
This is also related to the regression reported in the linked ticket. |
| Comment by Henri-Maxime Ducoulombier [ 25/Sep/20 ] |
|
Thank you for your answer and for pointing out the other issue, which is absolutely related. The solution of making a compound index seems viable, at the cost of a much larger index, and we will test it out soon enough, even though having the query planner use the simple index would be way nicer. |
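For illustration, the compound index being discussed would look something like the following, assuming "contact_id" is the shard key as stated elsewhere in this ticket:

```
// Compound index containing both the queried field and the shard key,
// so the shard filter can be applied to index keys instead of fetched documents:
db.test.createIndex({ labels: 1, contact_id: 1 })
```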
| Comment by Eric Sedor [ 24/Sep/20 ] |
|
Thanks for your patience, hmducoulombier@marketing1by1.com, and for your thorough report. I believe this is because of the issue linked to this ticket. I am going to pass this ticket on for confirmation and consideration as an improvement request. |
| Comment by Henri-Maxime Ducoulombier [ 11/Sep/20 ] |
|
I just built a 4.2 sharded cluster and it has the same behavior. Here is a sample of the explain with executionStats. |
| Comment by Henri-Maxime Ducoulombier [ 10/Sep/20 ] |
|
As far as I understand the docs, the count() method wraps the two methods countDocuments() and estimatedDocumentCount(). Still, the aggregation pipeline should use the index on the sharded collection rather than fetch + merge (imho). |
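For reference, a quick sketch of the three shell helpers being compared (the query value is illustrative):

```
db.test.count({ labels: "newsletter" })            // sends the count command
db.test.countDocuments({ labels: "newsletter" })   // runs an aggregation ($match + $group with $sum)
db.test.estimatedDocumentCount()                   // uses collection metadata, takes no predicate
```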
| Comment by Henri-Maxime Ducoulombier [ 10/Sep/20 ] |
|
Edit: this is not limited to array fields. |