[SERVER-60849] Remove nested loop joins from $Filter.limit SBE traverse stage Created: 20/Oct/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Maddie Zechar Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Execution
Participants:

 Description   

Solution proposed by Anton:

Data flow for this expression can be presented as follows:

<main-tree>
This is the eval frame sitting on top of the evalStack when we start processing the $filter expression. This <main-tree> is used as an input to the entire expression (the input argument). So, the input argument can be evaluated on the same eval frame.

<cond-tree>
Implements a filter predicate. It needs to be evaluated in its own frame since the evaluation should diverge from the main data flow - that is, for each element in an input array we will have to execute this <cond-tree>, so the array element will become the input to this tree.

So, putting it together we have:

traverse inSlot outSlot innerSlot
from
project inSlot = <something>
<main-tree>
in
<cond-tree>
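As a rough illustration of the data flow above, here is a minimal Python sketch of what the traverse stage computes for $filter without a limit. The function name `traverse_filter` and the `cond` callback are hypothetical stand-ins for the stage and the <cond-tree>; this is a toy model, not the SBE API.

```python
from typing import Any, Callable, List


def traverse_filter(in_value: List[Any], cond: Callable[[Any], bool]) -> List[Any]:
    """Toy model of `traverse` for $filter (hypothetical, not SBE code)."""
    out: List[Any] = []        # plays the role of outSlot
    for element in in_value:   # in_value plays the role of inSlot
        # Each array element becomes the input to <cond-tree>, which is
        # evaluated separately from the main data flow.
        if cond(element):
            out.append(element)
    return out
```

For example, `traverse_filter([1, 2, 3, 4], lambda x: x % 2 == 0)` keeps only the even elements.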
Now, let's add the limit argument. It can be a complex expression on its own, but its evaluation doesn't change the data flow. That is, we don't have to short-circuit while evaluating the limit argument, and we don't have to create a new evaluation context like we do for the cond argument. We can simply evaluate it sequentially right after the input argument. In other words, it can be evaluated on the same eval frame as the input argument itself.

traverse inSlot outSlot innerSlot {} {getArraySize(outSlot) >= limitSlot}

from
project limitSlot = <something>
<limit-tree>
project inSlot = <something>
<main-tree>
in
<cond-tree>
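The effect of the `getArraySize(outSlot) >= limitSlot` expression can be sketched in Python as an early exit from the per-element loop. This is a toy model under stated assumptions: the name `traverse_filter_limit` is hypothetical, the check is assumed to run after each element is processed, and `limit` is assumed to be at least 1. The `visited` counter is added only to show that traversal stops early.

```python
from typing import Any, Callable, List, Tuple


def traverse_filter_limit(
    in_value: List[Any], cond: Callable[[Any], bool], limit: int
) -> Tuple[List[Any], int]:
    """Toy model of `traverse` with an early-exit check (hypothetical)."""
    out: List[Any] = []   # plays the role of outSlot
    visited = 0           # how many input elements were examined
    for element in in_value:
        visited += 1
        if cond(element):             # <cond-tree> on the current element
            out.append(element)
        # Assumed placement of the check: getArraySize(outSlot) >= limitSlot
        if len(out) >= limit:
            break
    return out, visited
```

With a six-element input and a limit of 2, the loop stops as soon as the second match is appended, so the trailing elements are never examined.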
Now, there is one caveat here.

The limit argument is processed last, after the cond argument. The latter adds its own eval frame on the eval stack, so limit expressions would be evaluated against the <cond-tree> frame rather than <main-tree>. To deal with it we will have to pop the <cond-tree> frame off the stack (in the in-visitor) and store it in FilterExprFrame, process limit against the main-tree frame, and then in the $filter post-visitor grab the cond frame from the FilterExprFrame, rather than from the eval stack.
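The frame juggling described above can be sketched as follows. This Python model is only an illustration: `EvalFrame`, the `exprs` list, and `translate_filter_with_limit` are hypothetical names, and the fields of `FilterExprFrame` are assumed for the example rather than taken from the server code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class EvalFrame:
    name: str
    exprs: List[str] = field(default_factory=list)  # expressions built on this frame


@dataclass
class FilterExprFrame:
    cond_frame: Optional[EvalFrame] = None  # assumed field: stashed <cond-tree> frame


def translate_filter_with_limit() -> Tuple[EvalFrame, EvalFrame, int]:
    eval_stack = [EvalFrame("main-tree")]
    stash = FilterExprFrame()

    # input argument: evaluated on the main frame.
    eval_stack[-1].exprs.append("input")

    # cond argument: gets its own frame pushed on top of the stack.
    eval_stack.append(EvalFrame("cond-tree"))
    eval_stack[-1].exprs.append("cond")

    # In-visitor (between cond and limit): pop the cond frame and stash it,
    # so the main frame is on top of the stack again.
    stash.cond_frame = eval_stack.pop()

    # limit argument: now evaluated against the main frame.
    eval_stack[-1].exprs.append("limit")

    # Post-visitor: take the cond frame from the stash, not from the stack.
    return eval_stack[-1], stash.cond_frame, len(eval_stack)
```

After the call, the main frame holds both `input` and `limit`, the stashed frame holds only `cond`, and the stack is back to a single frame.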

Of course, all of this could be avoided if we evaluated the cond argument last, but I absolutely don't want to change the order of the $filter arguments depending on the presence of this optional limit. Optionals should always go last.



 Comments   
Comment by Kyle Suarez [ 02/Nov/21 ]

We don't quite have enough context on this to triage this properly at the meeting. Tagging eric.cox to please speak to anton.korshunov and maddie.zechar.

Comment by Maddie Zechar [ 20/Oct/21 ]

tagging anton.korshunov

Generated at Thu Feb 08 05:50:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.