- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Fully Compatible
- QE 2021-10-04
Initial performance investigations have shown that pushing $group down into the find layer with SBE enabled has a performance benefit over the classic engine when the $group stage has no accumulators. Pushing $group with $sum down into the find layer performs about the same as, or worse than, the classic engine.
We have identified some improvements that can be made. One of them is to eliminate the projection above the collection scan in a plan such as the following:
group_sbe_sum.svg
> coll.explain().aggregate([{$group: {_id: '$_idMod10', s: {$sum: '$price'}}}]);
{
    "explainVersion" : "2",
    "queryPlanner" : {
        "namespace" : "test.group_pushdown",
        "indexFilterSet" : false,
        "parsedQuery" : { },
        "queryHash" : "32904A1C",
        "planCacheKey" : "7DEED33E",
        "optimizedPipeline" : true,
        "maxIndexedOrSolutionsReached" : false,
        "maxIndexedAndSolutionsReached" : false,
        "maxScansToExplodeReached" : false,
        "winningPlan" : {
            "queryPlan" : {
                "stage" : "GROUP",
                "planNodeId" : 3,
                "inputStage" : {
                    "stage" : "PROJECTION_SIMPLE",    <--- eliminate this stage
                    "planNodeId" : 2,
                    "transformBy" : {
                        "_idMod10" : true,
                        "price" : true,
                        "_id" : false
                    },
                    "inputStage" : {
                        "stage" : "COLLSCAN",
                        "planNodeId" : 1,
                        "filter" : { },
                        "direction" : "forward"
                    }
                }
            },
The SBE plan for the above query plan is:
[3] mkbson s11 [_id = s9, s = s10] true false
[3] project [s9 = fillEmpty (s6, null), s10 = if (! exists (s8) || typeMatch (s8, 0x00000440), 0, doubleDoubleSumFinalize (s8))]
[3] group [s6] [s8 = aggDoubleDoubleSum (s7)]
[3] project [s7 = getField (s5, "price")]
[3] project [s6 = getField (s5, "_idMod10")]
[2] mkbson s5 s3 [_idMod10, price] keep [] true false    <--- From PROJECTION_SIMPLE
[1] scan s3 s4 none none none none [] @"75ca0299-5f72-4af8-87f2-fb63ac59a4fb" true false
The PROJECTION_SIMPLE stage, which produces the [2] mkbson stage in the SBE plan, should be removed.
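One way to check for this plan shape is to walk the `queryPlan` tree from the explain output and look for a PROJECTION_SIMPLE sitting directly between GROUP and COLLSCAN. The following is a minimal sketch in plain JavaScript. The helper names `stageChain` and `hasRedundantProjection` are hypothetical and are not part of the server or the shell. The `before` object is trimmed from the explain output in this ticket, while the `after` object is only an assumption about what the plan might look like once the projection is eliminated.

```javascript
// Hypothetical helper: list the stage names of a linear explain plan,
// following "inputStage" from the root stage down to the leaf.
function stageChain(plan) {
  const chain = [];
  for (let node = plan; node; node = node.inputStage) {
    chain.push(node.stage);
  }
  return chain;
}

// Hypothetical helper: true when a PROJECTION_SIMPLE stage sits directly
// between a GROUP stage and a COLLSCAN stage.
function hasRedundantProjection(plan) {
  const chain = stageChain(plan);
  for (let i = 0; i + 2 < chain.length; i++) {
    if (chain[i] === "GROUP" &&
        chain[i + 1] === "PROJECTION_SIMPLE" &&
        chain[i + 2] === "COLLSCAN") {
      return true;
    }
  }
  return false;
}

// Plan shape taken from the explain output in this ticket (fields trimmed).
const before = {
  stage: "GROUP",
  planNodeId: 3,
  inputStage: {
    stage: "PROJECTION_SIMPLE",
    planNodeId: 2,
    transformBy: { _idMod10: true, price: true, _id: false },
    inputStage: { stage: "COLLSCAN", planNodeId: 1, filter: {}, direction: "forward" }
  }
};

// ASSUMPTION: an illustrative plan with GROUP directly above COLLSCAN.
// The real post-fix explain output may differ.
const after = {
  stage: "GROUP",
  inputStage: { stage: "COLLSCAN", filter: {}, direction: "forward" }
};

console.log(hasRedundantProjection(before));
console.log(hasRedundantProjection(after));
```

In the shell, the same check could be applied to `coll.explain().aggregate([...]).queryPlanner.winningPlan.queryPlan`, which is the path shown in the explain output above.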