-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Execution
-
Fully Compatible
-
ALL
-
QE 2024-03-04, QE 2024-03-18
In a query such as
[{ $match: {state: {$in: [1, 3, 5]}} }, { $group: {_id: "$state", "avgHR": {$avg: "$heartrate"}} }]
where the group key can be processed in block mode but the accumulator cannot (currently true of $avg; once $avg gains block support, substitute another unsupported accumulator such as $stdDevPop), the generated plan is:
[3] group [s22] [s26 = aggDoubleDoubleSum(s20), s27 = sum(
if ((typeMatch(s20, 1088) ?: true) || !(isNumber(s20)))
then 0ll
else 1ll
)] spillSlots[s23, s24] mergingExprs[aggMergeDoubleDoubleSums(s23), sum(s24)]
[3] project [s25 = cellFoldValues_P(cellBlockGetFlatValuesBlock(s10), s10)]
[3] block_to_row blocks[s10, s11, s19] row[s20, s21, s22] s14
[3] project [s19 = valueBlockFillEmpty(cellFoldValues_P(cellBlockGetFlatValuesBlock(s11), s11), null)]
...
[2] ts_bucket_to_cellblock s2 pathReqs[s10 = ProjectPath(Get(heartrate)/Id), s11 = ProjectPath(Get(state)/Id), s12 = FilterPath(Get(state)/Traverse/Id)]
The plan computes the block version of $heartrate into s25 even though block_to_row has already been inserted and the $group reads the scalar version of $heartrate from s20.
Running cellFoldValues_P for every measurement is wasted work; at the very least it should run once per block.
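The cost difference can be illustrated with a toy model. This is only a sketch in Python, not SBE code: `fold_values`, `block_to_row`, and the two `project_*` functions are hypothetical stand-ins that mimic where the projection sits relative to block_to_row, and the heartrate values are made up.

```python
# Toy model only: none of these names are real SBE APIs.
fold_calls = 0

def fold_values(block):
    """Stand-in for a per-block computation such as cellFoldValues_P."""
    global fold_calls
    fold_calls += 1
    return list(block)

def block_to_row(blocks):
    """Yield one (block, value) pair per measurement."""
    for block in blocks:
        for value in block:
            yield block, value

def project_above_block_to_row(blocks):
    """Projection placed above block_to_row: the fold runs once per row."""
    for block, value in block_to_row(blocks):
        fold_values(block)  # result is never consumed, like s25 in the plan
        yield value

def project_below_block_to_row(blocks):
    """Projection placed below block_to_row: the fold runs once per block."""
    for block in blocks:
        folded = fold_values(block)
        for value in folded:
            yield value

def count_fold_calls(pipeline, blocks):
    """Drain the pipeline and report (rows produced, fold invocations)."""
    global fold_calls
    fold_calls = 0
    rows = list(pipeline(blocks))
    return len(rows), fold_calls

# Three hypothetical blocks holding 3, 2 and 4 measurements.
blocks = [[60, 62, 61], [70, 71], [80, 82, 81, 83]]
print(count_fold_calls(project_above_block_to_row, blocks))  # (9, 9)
print(count_fold_calls(project_below_block_to_row, blocks))  # (9, 3)
```

Both variants emit the same nine rows, but the first invokes the fold nine times and the second only three times. In the plan above the result is not read at all, so dropping the projection entirely would avoid even the per-block cost.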