Type: Improvement
Resolution: Won't Do
Priority: Major - P3
Affects Version/s: None
Component/s: None
Query Optimization
After SERVER-83441, we use BinaryOp<And> with a pushed-down projection as input for every top-level field, instead of using PathComposeM + PathGets to retrieve each field from a single root projection. There is one case where this can generate a more costly physical plan. Consider the following example query:
{$and: [ { a: 1 }, { a: 2 }, { a: { $gt: 1 }} ]}
Note that unfortunately we don't do interval simplification for this case yet. The plan we generate now post-pushdown is:
Filter []
| BinaryOp [And]
| | EvalFilter []
| | | Variable [p2]
| | PathTraverse [1] PathCompare [Eq] Const [1]
| BinaryOp [And]
| | EvalFilter []
| | | Variable [p2]
| | PathTraverse [1] PathCompare [Eq] Const [1]
| EvalFilter []
| | Variable [p2]
| PathTraverse [1] PathComposeM []
| | PathCompare [Lt] Const [""]
| PathCompare [Gt] Const [1]
PhysicalScan [{'<root>': p0, 'a': p2}, a_e4934641-532e-4772-9816-7cc580c80b28]
It's good that we can reuse the projection for a. However, we traverse the output of p2 three times (n times for n predicates on the same field). We should avoid the extraneous traversals in this case by generating a plan closer to:
Filter []
| EvalFilter []
| | Variable [p2]
| PathTraverse [1]
| PathComposeM
| | PathCompare [Eq] Const [1]
| PathComposeM
| | PathCompare [Eq] Const [1]
| PathComposeM []
| | PathCompare [Lt] Const [""]
| PathCompare [Gt] Const [1]
...
We would expect a performance gain, particularly for large collections where field "a" contains large arrays or the traversal step is otherwise expensive.
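To make the "n traversals for n predicates" cost concrete, here is a toy model in Python. It is only a sketch and not the optimizer's implementation: `CountingArray`, `match_per_predicate`, and `match_single_pass` are hypothetical names, and the model assumes each predicate matches when any array element satisfies it. It ignores the type-bracketing bound (`PathCompare [Lt] Const [""]`) shown in the plans.

```python
from typing import Any, Callable, Iterable, List

Predicate = Callable[[Any], bool]


class CountingArray:
    """Hypothetical wrapper that counts how often the array is traversed."""

    def __init__(self, items: Iterable[Any]) -> None:
        self.items = list(items)
        self.traversals = 0

    def __iter__(self):
        # Every new iteration over the array counts as one traversal.
        self.traversals += 1
        return iter(self.items)


def match_per_predicate(arr: CountingArray, preds: List[Predicate]) -> bool:
    # Models one EvalFilter + PathTraverse per predicate:
    # the array is walked again for each predicate that gets evaluated.
    return all(any(p(x) for x in arr) for p in preds)


def match_single_pass(arr: CountingArray, preds: List[Predicate]) -> bool:
    # Models a single traversal that evaluates all predicates per element,
    # remembering which predicates have already been satisfied.
    satisfied = [False] * len(preds)
    for x in arr:
        for i, p in enumerate(preds):
            if not satisfied[i] and p(x):
                satisfied[i] = True
    return all(satisfied)


if __name__ == "__main__":
    # Predicates mirroring {a: 1}, {a: 2}, {a: {$gt: 1}} from the example query.
    preds = [lambda v: v == 1, lambda v: v == 2, lambda v: v > 1]

    separate = CountingArray([1, 2, 3])
    fused = CountingArray([1, 2, 3])
    print(match_per_predicate(separate, preds), separate.traversals)
    print(match_single_pass(fused, preds), fused.traversals)
```

In this model the per-predicate version walks the array once per evaluated predicate, while the single-pass version walks it once regardless of how many predicates target the field. The difference grows with both the array size and the number of predicates on the same field.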