Core Server / SERVER-49306

Optimization for mid-pipeline $project stages

    • Type: Improvement
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • Labels: None

      In general, the server automatically optimizes aggregation pipelines so that they include only the fields that are used by the pipeline stages or returned in the final output. Adding a $project in an early stage therefore does not further limit the data passed to the stages that follow, because the pipeline dependency analysis already determines which fields the pipeline needs. A mid-pipeline $project is thus usually redundant, and it can even prevent the dependency analysis from determining which fields are needed.

      $project is typically intended only to rename fields or to reshape the documents being output. In most cases it should therefore appear only at the end of an aggregation pipeline, and in many cases it can be omitted entirely.
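
As an illustration of the point above, here is a minimal sketch in Python, with pipelines written as plain lists of dicts (the shape a driver such as PyMongo accepts). The "orders"-style field names (status, customerId, amount) and the helper mid_pipeline_projects are hypothetical and exist only for this example. Both pipelines are expected to return the same groups, since $group reads only customerId and amount in either case.

```python
# Hypothetical documents with fields such as: status, customerId, amount, notes.

# Variant A: a mid-pipeline $project added in the hope of trimming documents early.
with_mid_project = [
    {"$match": {"status": "shipped"}},
    {"$project": {"customerId": 1, "amount": 1}},
    {"$group": {"_id": "$customerId", "total": {"$sum": "$amount"}}},
]

# Variant B: the same pipeline without the extra stage. The $group stage only
# references customerId and amount, so the $project above adds nothing.
without_mid_project = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customerId", "total": {"$sum": "$amount"}}},
]


def mid_pipeline_projects(pipeline):
    """Return the indexes of $project stages that are not the last stage.

    Hypothetical lint helper for this example only; not a server or driver API.
    """
    last = len(pipeline) - 1
    return [
        i for i, stage in enumerate(pipeline)
        if "$project" in stage and i != last
    ]


print(mid_pipeline_projects(with_mid_project))     # index of the mid-pipeline $project
print(mid_pipeline_projects(without_mid_project))  # nothing to flag
```

A check like this only flags candidates for review; whether a given $project can actually be removed still depends on what the later stages and the final output need.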

      This is an enhancement request to optimize mid-pipeline $project stages, possibly by converting them to $addFields, so that they do not interfere with the pipeline dependency analysis.
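
To make the requested conversion concrete, the following is a rough client-side sketch in Python of what rewriting a $project into $addFields could look like. It is an assumption-laden toy, not the server's implementation: the function name project_to_add_fields is hypothetical, it handles only the simplest inclusion-style specifications, and it leaves anything it does not recognize untouched. Note also that $addFields keeps fields that $project would have dropped, so the rewrite preserves results only when no later stage or final output observes those extra fields.

```python
def project_to_add_fields(stage):
    """Toy rewrite of one inclusion-style $project stage into $addFields.

    Returns the replacement stage, or None when the stage should be left
    alone (not a $project, an exclusion, nothing computed, or a value this
    sketch does not understand).
    """
    spec = stage.get("$project")
    if spec is None:
        return None

    computed = {}
    for field, value in spec.items():
        if isinstance(value, (bool, int, float)):
            if not value:
                return None  # exclusion (0 / False): out of scope here
            continue  # inclusion flag (1 / True): nothing to compute
        is_field_path = isinstance(value, str) and value.startswith("$")
        is_operator_expr = (
            isinstance(value, dict)
            and len(value) > 0
            and all(key.startswith("$") for key in value)
        )
        if is_field_path or is_operator_expr:
            computed[field] = value  # renamed or computed field
        else:
            return None  # nested spec or literal: be conservative

    return {"$addFields": computed} if computed else None


# Hypothetical stage: keeps customerId and computes total from price and tax.
stage = {"$project": {"customerId": 1, "total": {"$add": ["$price", "$tax"]}}}
print(project_to_add_fields(stage))
```

The sketch deliberately bails out on exclusions, nested specifications, and literals, since those cases change which fields survive the stage and cannot be expressed as a plain $addFields.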

            Assignee: Asya Kamsky (asya.kamsky@mongodb.com)
            Reporter: Harshad Dhavale (harshad.dhavale@mongodb.com)
            Votes: 0
            Watchers: 3

              Created:
              Updated:
              Resolved: