[CSHARP-4744] Improve optimization of Count with predicate in Group Created: 03/Aug/23 Updated: 09/Aug/23 Resolved: 09/Aug/23 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | LINQ3 |
| Affects Version/s: | 2.20.0 |
| Fix Version/s: | 2.21.0 |
| Type: | Improvement | Priority: | Unknown |
| Reporter: | Alistair Steele | Assignee: | Robert Stam |
| Resolution: | Done | Votes: | 0 |
| Labels: | LINQ3 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Documentation Changes: | Not Needed |
| Documentation Changes Summary: | 1. What would you like to communicate to the user about this feature? |
| Description |
|
A conditional count within a GroupBy aggregation is not currently optimised by the AstOptimizer. The result is a $group stage that pushes every document, followed by a $project stage. This has a significant performance impact, particularly when the grouping could otherwise be satisfied from an index. With the unoptimised post-$group $project stage, the entire document set has to be collated, whereas with the $project optimised into the $group (and with an appropriate index on the collection) the $group becomes non-blocking and runs at essentially the same speed regardless of collection size. The following demonstrates the difference in queries:
As you can see, once the Count is filtered the optimisation no longer applies and the pipeline falls back to a full $push: "$$ROOT". I think this can be optimised by converting the expression from a $size to a $sum:
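A minimal sketch of the conversion described above, not the reporter's original sample. It assumes a hypothetical `Region` group key and an `"Open"` value for `State`; the `_elements` and `__agg0` field names and the exact stage layout are approximations of what the driver might emit, not verbatim output. The two plain-JavaScript functions emulate each strategy in memory to show that they yield the same counts.

```javascript
// Hypothetical field names: Region (group key) and State (filtered field).
// Both pipeline shapes are illustrative approximations, not driver output.

// Unoptimised shape: push every document, then count the matches in $project.
const unoptimizedPipeline = [
  { $group: { _id: "$Region", _elements: { $push: "$$ROOT" } } },
  { $project: { _id: "$_id", Count: { $size: { $filter: {
      input: "$_elements", as: "x", cond: { $eq: ["$$x.State", "Open"] } } } } } },
];

// Proposed shape: count the matches with a conditional $sum inside $group.
const optimizedPipeline = [
  { $group: { _id: "$Region", __agg0: { $sum: { $cond: {
      if: { $eq: ["$State", "Open"] }, then: 1, else: 0 } } } } },
  { $project: { _id: "$_id", Count: "$__agg0" } },
];

// In-memory emulation of the unoptimised strategy:
// collect whole documents per group, then filter and take the size.
function countByPushAndFilter(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.Region)) groups.set(doc.Region, []);
    groups.get(doc.Region).push(doc); // corresponds to $push: "$$ROOT"
  }
  const result = {};
  for (const [region, elements] of groups) {
    result[region] = elements.filter((x) => x.State === "Open").length;
  }
  return result;
}

// In-memory emulation of the proposed strategy:
// keep one running counter per group, adding 1 or 0 per document.
function countByConditionalSum(docs) {
  const result = {};
  for (const doc of docs) {
    result[doc.Region] = (result[doc.Region] ?? 0) + (doc.State === "Open" ? 1 : 0);
  }
  return result;
}

const sampleDocs = [
  { Region: "EU", State: "Open" },
  { Region: "EU", State: "Closed" },
  { Region: "US", State: "Open" },
  { Region: "US", State: "Open" },
  { Region: "APAC", State: "Closed" },
];

console.log(countByPushAndFilter(sampleDocs));
console.log(countByConditionalSum(sampleDocs));
```

The second strategy only needs one counter per group rather than a copy of every document, which is why moving the count into the $group avoids materialising the pushed array.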
I had a scan of the source and I think this would be done in AstGroupingPipelineOptimizer.AccumulatorMover.VisitUnaryExpression.TryOptimizeSizeOfElements. It looks like this currently only optimizes counts for the entire _element set. Demo source code:
|
| Comments |
| Comment by Githook User [ 09/Aug/23 ] |
|
Author: rstam <robert@robertstam.org> (username: rstam) Message: |
| Comment by Robert Stam [ 08/Aug/23 ] |
|
Thanks for reporting this. I have a fix in code review. |
| Comment by Alistair Steele [ 03/Aug/23 ] |
|
Apologies for the formatting mess; I can't seem to find the edit button either. Edit: the field in the $sum sample is meant to be $State, not $Change. [robert@mongodb.com]: I edited the sample, replacing $Change with $State. |
| Comment by PM Bot [ 03/Aug/23 ] |
|
Hi alistair.steele@trapdoorlabs.uk, thank you for reporting this issue! The team will look into it and get back to you soon. |