[SERVER-39191] Performance regression for counts post-sharding Created: 25/Jan/19 Updated: 29/Oct/23 Resolved: 15/Oct/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 4.0.5 |
| Fix Version/s: | 4.9.0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Julien Moutte | Assignee: | Ian Boros |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | qexec-team |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Query 2020-10-05, Query 2020-10-19 |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
| Description |
|
To filter orphans during counts, sharded clusters avoid COUNT_SCAN in favor of an IXSCAN and examine documents. This may be difficult to avoid for queries that do not include the shard key, but the impact can be noticeable for some workloads. Original report follows:

I have a simple collection with 40M documents distributed over 2 shards. The documents contain a field named COUNTRY with a sparse index on it. I am trying to count documents with a specific value for COUNTRY. When counting on the shards directly, the plan looks correct: it uses COUNT_SCAN with the index and returns 6M docs on each shard without loading them.

Running the exact same query on MONGOS, the count is distributed to the shards, but I can see in the logs of each shard that the plan changed: it used IXSCAN instead and loaded all the documents, which was much slower.

Can someone explain why the plan changed, or why the docs have to be scanned? The performance impact is huge... Attached is the execution plan from MONGOS. This issue was also referenced on StackExchange. |
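A minimal reproduction sketch of the behaviour described above, run from the mongo shell. The collection name (`events`) and the country value are assumptions for illustration; only the COUNTRY field, its sparse index, and the reported plan stages come from the report:

```javascript
// Hypothetical collection name; only COUNTRY and its sparse index are from the report.
db.events.createIndex({ COUNTRY: 1 }, { sparse: true });

// The count in question: a single equality predicate on the indexed field.
db.events.count({ COUNTRY: "FR" });

// Compare the winning plans. Per the report, running this directly against a
// shard yields a covered COUNT_SCAN, while the same count issued through
// mongos uses IXSCAN + SHARDING_FILTER and fetches every matching document
// so that orphans can be filtered out.
db.events.explain("executionStats").count({ COUNTRY: "FR" });
```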
| Comments |
| Comment by Githook User [ 15/Oct/20 ] |
|
Author: Sebastian Mayr <smayr@atlassian.com>

Message: When counts were made orphan-aware, the addition of the SHARD_FILTER stage caused counts to examine documents instead of using a covered plan. However, there can not be any orphans if the shard key is fully specified by an equality match, since only the shard owning the matching chunk is targeted. This patch changes the query system to omit the SHARDING_FILTER stage in cases where the query contains an equality on the full shard key.

Closes #1369.

Signed-off-by: Ian Boros <ian.boros@mongodb.com> |
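A hedged sketch of what the change means in practice (collection name and shard key are assumptions; the behaviour applies from the fix version, 4.9.0):

```javascript
// Hypothetical namespace sharded on { userId: 1 }.
sh.enableSharding("test");
sh.shardCollection("test.events", { userId: 1 });

// The predicate pins the entire shard key with an equality match, so mongos
// targets only the shard that owns the matching chunk and no orphan can match.
const plan = db.getSiblingDB("test").events.explain().count({ userId: 42, COUNTRY: "FR" });

// After this patch the winning plan for such queries should no longer
// contain a SHARDING_FILTER stage.
printjson(plan);
```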
| Comment by Sebastian Mayr [ 27/Aug/20 ] |
|
I opened a PR with what I think should be the correct changes on GH, but it probably needs a bit of attention from a Mongo team member: https://github.com/mongodb/mongo/pull/1369 |
| Comment by Garaudy Etienne [ 27/Aug/20 ] |
|
Based on sebmaster's suggestion, the query team seems best equipped to make the right optimizations here. cc kateryna.kamenieva |
| Comment by Sebastian Mayr [ 05/Jun/20 ] |
|
We've encountered this issue quite recently as well. We had the following plan to lessen the impact of this considerably in one special case: Counting with a fully specified $eq shard key. The idea is that in those cases, mongos would only send the count request to a single shard that is responsible for the respective chunk. This should guarantee that the shard cannot have orphans matching the query. Could someone check our thinking on that? If this is correct, I have some code + tests already written up that strips the sharding filter in this case that we'd like to contribute. |
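A small sketch of the reasoning behind that suggestion (the namespace and shard key are hypothetical): the chunk covering a given shard key value lives on exactly one shard, so a count whose predicate is an $eq on the full shard key only needs to visit that shard, where the matching documents are owned rather than orphaned.

```javascript
// Hypothetical: "test.events" sharded on { userId: 1 }.
// Look up the chunk that covers userId 42 and the shard that owns it.
// Note: 4.0-era clusters key config.chunks by "ns"; newer versions key
// chunks by collection UUID instead.
const configDB = db.getSiblingDB("config");
configDB.chunks.find(
  { ns: "test.events", "min.userId": { $lte: 42 }, "max.userId": { $gt: 42 } },
  { _id: 0, shard: 1, min: 1, max: 1 }
);
```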
| Comment by Asya Kamsky [ 11/Mar/20 ] |
|
Note that the sharding filter still has to be applied, but it can be done from an index if the index covers the query field PLUS the shard key field(s). |
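A hedged sketch of that approach (the userId shard key is an assumption; only COUNTRY comes from the original report): if the index contains both the queried field and the shard key, the SHARDING_FILTER stage can read the shard key from index entries and the count stays covered instead of fetching documents.

```javascript
// Compound index covering both the query field and the (assumed) shard key.
db.events.createIndex({ COUNTRY: 1, userId: 1 });

// Through mongos, explain should now show SHARDING_FILTER applied on top of
// an IXSCAN with no FETCH stage, i.e. orphan filtering happens from index
// keys rather than from loaded documents.
db.events.explain("executionStats").count({ COUNTRY: "FR" });
```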
| Comment by Asya Kamsky [ 11/Mar/20 ] |
|
I believe this is (somewhat) a duplicate of |
| Comment by Julien Moutte [ 01/Feb/19 ] |
|
Sure, let's try. Our use case is basically a graphical query builder where users can add filtering criteria progressively to segment a big data collection. They start with the whole collection and see the total count, or a count of distinct values on a specific indexed field; they can then incrementally add filtering criteria and see the size of each segment as they build that segmentation model. The count helps build the model and it should come back fast. Accuracy does not have to be 100%; counting distinct entries with HyperLogLog is a possibility. For example, you could be querying a massive events collection from IoT devices belonging to users and want to incrementally segment that user base based on criteria applied to that events collection. Hope this is clear enough. |
| Comment by Alyson Cabral (Inactive) [ 31/Jan/19 ] |
|
Thanks, Julien, for the info! Without going into the specific intricacies of your data (if possible), can you describe what value counts bring to your user-facing application? Having a well-rounded understanding of use cases for count usage in applications will really help us prioritize this work appropriately.
|
| Comment by Julien Moutte [ 31/Jan/19 ] |
|
Thanks Eric for the feedback. This is a POC to evaluate MONGO for a Big Data segmentation use case: given a massive collection of documents with proper indices, we want to count matching documents based on a set of criteria. Sharding is used to distribute both storage and workload in a scale-out scenario. Initial tests were done on a non-sharded database and the query execution time was okayish (2.7s) for a simple count on an indexed field. We were expecting that sharding would add 2 extra qualities:

1) faster counts, thanks to the workload being distributed across shards
2) the ability to scale storage as the collection grows

The opposite happens. When querying the shards through mongos, the same query runs in 34s, as documents are examined to filter orphaned documents during the count. One could argue that quality 2) is somehow still there, but the execution time is way too long for our user interface. I understand the reason why this happens, so we should change the issue type to improvement instead of bug. |
| Comment by Eric Sedor [ 29/Jan/19 ] |
|
Hi Julien, thanks for your patience. To accurately count documents in a sharded environment, shards must examine them, so this is working as designed as of the change that made counts orphan-aware. While it may one day be possible to do this without examining the documents directly, it is unlikely this could change soon.

That said, we'd like to ask for more information to help us consider this problem. What purpose are the counts serving in your application, are there strict SLAs around the counts in particular, and exactly how much slower did they get for you? In the event overall performance has been impacted by the memory requirement of these counts, can you quantify the total impact? Thank you in advance! |