[SERVER-86061] In case of EOF optimization and agg pipeline, avoid re-running find query with virtual scan Created: 01/Feb/24 Updated: 06/Feb/24 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Ivan Fefer | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Query Execution
|
||||||||
| Participants: | |||||||||
| Description |
|
In classic_runtime_planner_for_sbe, if the query has an aggregation pipeline, we multi-plan only the find part of the query and later extend the winning solution with the agg pipeline before executing everything in SBE. However, in cases where the winning plan reaches EOF during multi-planning, we can avoid re-running the find part of the query by replacing it with a VirtualScan of the documents already returned. This means the full algorithm will look like this:
One performance consideration for this ticket is that we would run the SBE stage builders twice: once for the plan cache and once for the actual execution, which might be a fairly expensive process. However, given that we are avoiding re-reading 101 documents from storage and the S2 solution is smaller, this should still be beneficial. The results of this optimization should be verified via performance tests. |
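The idea described above can be illustrated with a small, self-contained Python sketch. It is not server code: `run_trial`, `execute`, `TrialResult`, and the callable-based "plans" are hypothetical names invented for illustration, and the trial limit of 101 is taken only from the ticket's mention of re-reading 101 documents. The sketch assumes the trial period buffers the winning plan's documents; if the plan is exhausted during the trial, the buffered documents are replayed (standing in for a VirtualScan) instead of running the find part again.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List

Doc = dict


@dataclass
class TrialResult:
    docs: List[Doc]    # documents buffered while trialling the winning find plan
    reached_eof: bool  # True only if the plan was exhausted before the limit


def run_trial(find_plan: Callable[[], Iterator[Doc]], limit: int = 101) -> TrialResult:
    """Run the find part for up to `limit` documents (hypothetical trial period)."""
    docs: List[Doc] = []
    for doc in find_plan():
        docs.append(doc)
        if len(docs) >= limit:
            # Limit hit: conservatively treat the plan as NOT having reached EOF.
            return TrialResult(docs, reached_eof=False)
    return TrialResult(docs, reached_eof=True)


def execute(
    find_plan: Callable[[], Iterator[Doc]],
    pipeline: Callable[[Iterable[Doc]], Iterable[Doc]],
    limit: int = 101,
) -> List[Doc]:
    """Feed the agg pipeline from buffered docs on EOF, else re-run the find part."""
    trial = run_trial(find_plan, limit)
    if trial.reached_eof:
        source: Iterable[Doc] = iter(trial.docs)  # stand-in for a virtual scan
    else:
        source = find_plan()  # find part is executed again from the start
    return list(pipeline(source))


if __name__ == "__main__":
    scans = 0

    def small_find() -> Iterator[Doc]:
        global scans
        scans += 1  # count how often "storage" is read
        return iter([{"a": 1}, {"a": 2}, {"a": 3}])

    def add_field(src: Iterable[Doc]) -> Iterable[Doc]:
        return ({**d, "b": d["a"] * 10} for d in src)

    print(execute(small_find, add_field))
    print("storage scans:", scans)
```

In this sketch the find part is read once when the trial reaches EOF and twice otherwise, which is the saving the ticket is after. It does not model the cost of building SBE stages twice, so it says nothing about whether the optimization is a net win.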