[SERVER-38152] Prune unnecessary branches of the dependency graph in an agg pipeline Created: 15/Nov/18  Updated: 09/Dec/22

Status: Open
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Patrick Meredith Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-69361 [CQF] Extend path fusion to better ha... Closed
is related to SERVER-69361 [CQF] Extend path fusion to better ha... Closed
Assigned Teams:
Query Optimization
Sprint: QO 2022-09-05, QO 2022-10-03, QE 2022-10-17
Participants:

 Description   

When generating pipelines to handle polymorphic data, the BI-Connector creates many fields that are not needed for the final output. A simplified example:

{$addFields: {x: ..., y: ..., z: ...}}
{$project: {OUT: "$x"}}

The code for computing y and z, which can be quite expensive, is unnecessary. This generalizes to any number of stages: any field that a $project removes can also be removed from the computations preceding that $project, transitively, e.g.:

{$addFields: {x: ..., y: ..., z: ...}}
{$addFields: {a: {$add: ["$x", "$y"]}, b: {$add: ["$y", "$z"]}}}
{$project: {OUT: "$b"}}

Here, we can remove the computation for a and then, transitively, the computation for x, because x is used only by a. This could drastically improve many types of queries, but it requires a field-level dependency tracker. Any generated pipeline would benefit, not just those from the BI-Connector, so it makes more sense to do this within the server.
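
To make the idea concrete, here is a minimal sketch of a backward pass over a pipeline that tracks which top-level fields are still needed and drops $addFields entries nobody reads. It is an illustration only, not the server's implementation: the function names (field_refs, needed_by_project, prune) and the source fields p, q and r standing in for the elided "..." expressions are all hypothetical. It assumes pipelines are plain Python dicts, $project specs are flat, system variables such as $$ROOT do not appear, and any stage other than $addFields and $project needs every field.

```python
def field_refs(expr):
    """Return the top-level field names referenced as "$name" inside an expression."""
    if isinstance(expr, str):
        if expr.startswith("$$"):
            # Assumption: variables ($$ROOT, $$CURRENT, ...) are out of scope for this sketch.
            raise NotImplementedError("variables are not handled in this sketch")
        return {expr[1:].split(".")[0]} if expr.startswith("$") else set()
    if isinstance(expr, dict):
        children = expr.values()  # operator names are keys, so they are skipped
    elif isinstance(expr, (list, tuple)):
        children = expr
    else:
        return set()
    refs = set()
    for child in children:
        refs |= field_refs(child)
    return refs


def needed_by_project(spec):
    """Fields a flat $project spec reads from its input, or None if it may read all."""
    needed = {"_id"}  # conservatively treat _id as always needed
    for key, value in spec.items():
        if isinstance(value, (bool, int, float)):
            if value:
                needed.add(key.split(".")[0])  # inclusion: {field: 1}
            elif key != "_id":
                return None  # exclusion projection keeps every other field
        else:
            needed |= field_refs(value)  # computed field: {OUT: "$b"}
    return needed


def prune(pipeline):
    """Walk the pipeline backwards, dropping $addFields entries that are never read."""
    needed = None  # None means "every field is needed"
    kept_stages = []
    for stage in reversed(pipeline):
        (op, spec), = stage.items()
        if op == "$project":
            needed = needed_by_project(spec)
        elif op == "$addFields":
            if needed is not None:
                live = {n: e for n, e in spec.items() if n.split(".")[0] in needed}
                # Fields fully defined here are satisfied; their inputs become needed upstream.
                defined = {n for n in spec if "." not in n}
                needed = (needed - defined) | field_refs(live)
                if not live:
                    continue  # the whole stage is dead
                stage = {op: live}
        else:
            needed = None  # unknown stage: assume it needs everything
        kept_stages.append(stage)
    return kept_stages[::-1]


pipeline = [
    {"$addFields": {"x": "$p", "y": "$q", "z": "$r"}},
    {"$addFields": {"a": {"$add": ["$x", "$y"]}, "b": {"$add": ["$y", "$z"]}}},
    {"$project": {"OUT": "$b"}},
]
pruned = prune(pipeline)
```

On the second example above, the pass drops a from the second stage and then x from the first, leaving only y, z and b. Treating unrecognized stages as needing every field keeps the sketch conservative: it may miss pruning opportunities, but it should not remove a field that a later stage reads.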



 Comments   
Comment by Svilen Mihaylov (Inactive) [ 20/Sep/22 ]

Solved as part of linked ticket for the new optimizer.

Generated at Thu Feb 08 04:48:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.