  1. Core Server
  2. SERVER-38152

Prune unnecessary branches of the dependency graph in an agg pipeline

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Query Optimization
    • QO 2022-09-05, QO 2022-10-03, QE 2022-10-17

      When generating pipelines to handle polymorphic data, the BI-Connector creates many fields that are not needed for the final output. A simplified example:

      {$addFields: {x: ..., y:..., z:...}}
      {$project: {OUT: "$x"}}
      

      The code for computing y and z, which can be quite expensive, is entirely unneeded. This generalizes to any number of stages: any field dropped by a $project can be removed from the computations preceding that $project, transitively. For example:

      {$addFields: {x: ..., y:..., z:...}}
      {$addFields: {a: {$add: ["$x", "$y"]}, b: {$add: ["$y", "$z"]}}}
      {$project: {OUT: "$b"}}
      

      Here, we can remove the computation for a and then, transitively, the computation for x, since x is referenced only by a (y must stay because b still uses it). This could drastically improve many types of queries, but it requires a field-level dependency tracker. Any generated pipeline would benefit, not just those from the BI-Connector, so it makes more sense to do this within the server.
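
      The transitive pruning described above can be sketched as a single backward pass over the pipeline that tracks which top-level fields are still needed. This is only an illustration in Python over dict-shaped stages, not the server's implementation: the names field_refs and prune_pipeline and the raw_x/raw_y/raw_z source fields are hypothetical, only $addFields and $project are understood, and any other stage conservatively resets the tracker to "everything is needed".

```python
from typing import Any, Optional


def field_refs(expr: Any) -> set:
    """Collect top-level field names referenced as "$name" in an expression.

    Over-approximates on purpose: a string inside $literal is still counted.
    """
    if isinstance(expr, str):
        if expr.startswith("$$"):
            # Assumption of this sketch: variables such as $$ROOT are unsupported.
            raise NotImplementedError("variable references are not handled")
        if expr.startswith("$"):
            return {expr[1:].split(".")[0]}
        return set()
    if isinstance(expr, dict):
        expr = list(expr.values())
    if isinstance(expr, list):
        refs = set()
        for item in expr:
            refs |= field_refs(item)
        return refs
    return set()


def prune_pipeline(pipeline: list) -> list:
    """Drop $addFields computations that no later stage can observe."""
    needed: Optional[set] = None  # None means "every field may be needed"
    pruned = []
    for stage in reversed(pipeline):
        ((op, spec),) = stage.items()
        if op == "$project":
            # Downstream sees only the projected fields, so the needed set
            # restarts from what this stage itself reads.
            needed = set() if spec.get("_id", 1) in (0, False) else {"_id"}
            for name, value in spec.items():
                if name == "_id" and value in (0, False):
                    continue
                if value in (0, False):
                    raise NotImplementedError("exclusion projections")
                if value in (1, True):
                    needed.add(name.split(".")[0])  # plain inclusion
                else:
                    needed |= field_refs(value)  # computed field
            pruned.append(stage)
        elif op == "$addFields" and needed is not None:
            kept = {k: v for k, v in spec.items() if k.split(".")[0] in needed}
            # A kept top-level field is overwritten here, so its upstream
            # value is needed only if one of the kept expressions reads it.
            needed = {n for n in needed if n not in kept} | field_refs(kept)
            if kept:
                pruned.append({op: kept})
        else:
            needed = None  # unknown stage: assume it may read anything
            pruned.append(stage)
    pruned.reverse()
    return pruned


pipeline = [
    {"$addFields": {"x": {"$abs": "$raw_x"}, "y": {"$abs": "$raw_y"}, "z": {"$abs": "$raw_z"}}},
    {"$addFields": {"a": {"$add": ["$x", "$y"]}, "b": {"$add": ["$y", "$z"]}}},
    {"$project": {"OUT": "$b"}},
]
result = prune_pipeline(pipeline)
```

      On the second example from the description, the pass drops a from the second $addFields and x from the first, while y and z survive because b reads them. A real implementation would also have to account for dotted paths, exclusion projections, variables, and every other stage type.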

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            patrick.meredith@mongodb.com Patrick Meredith
            Votes:
            0
            Watchers:
            8
