Core Server / SERVER-50597

Explain against a pipeline containing $unionWith does not accurately count executionStats from sub-pipeline

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
    • Query Optimization
    • ALL

      When explaining a pipeline containing a $unionWith stage, we first exhaust the pipeline to gather the execution stats. For $unionWith, this may or may not involve building and targeting its sub-pipeline, depending on the subsequent stages (e.g. a $limit may be satisfied by streaming results directly from the base collection, without ever using the sub-pipeline). Next, we serialize the $unionWith stage, at which point we "re-run" the sub-pipeline and gather the explain output for it.
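For reference, a pipeline of the shape described above can be explained with a command along these lines. This is a minimal sketch, not taken from the ticket: the collection names `base` and `other`, the trivial `$match` sub-pipeline, and the helper `build_explain_command` are all hypothetical. It only builds the command document and does not contact a server.

```python
def build_explain_command(base_coll, union_coll, limit):
    """Build an explain command for a $unionWith pipeline followed by $limit.

    Hypothetical helper for illustration; the collection names are assumptions.
    """
    pipeline = [
        # Union in every document from the second collection.
        {"$unionWith": {"coll": union_coll, "pipeline": [{"$match": {}}]}},
        # A small limit may be satisfied from the base collection alone.
        {"$limit": limit},
    ]
    return {
        "explain": {"aggregate": base_coll, "pipeline": pipeline, "cursor": {}},
        "verbosity": "executionStats",
    }


cmd = build_explain_command("base", "other", 5)
# Against a live deployment with PyMongo, this could be run as db.command(cmd).
```

Comparing the executionStats reported inside the $unionWith stage with the top-level stats for different values of `limit` is one way to observe the discrepancy.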

      There are separate issues with this depending on whether we're running in a sharded cluster:

      1. If not sharded: the executionStats reported for the inner pipeline do not take into account subsequent stages such as $limit, which could affect the nReturned stat.
      2. If sharded: the initial pass to exhaust the pipeline does not actually pull from the sub-pipeline, because the helper we delegate to sees the explain bit and runs a scatter-gather to get the results from each shard instead of establishing cursors to iterate. This means that the $unionWith stage essentially gets a no-op pipeline back, and the executionStats reported by the $unionWith stage itself will not include any documents the sub-pipeline would have contributed.
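The unsharded case (item 1) can be illustrated with a toy model. This is not server code: `CountingSource` and `explain_union_with_limit` are hypothetical names, and plain Python iterators stand in for the base collection, the sub-pipeline, and the $limit stage. The sketch assumes the two-pass behavior described above: one pass that exhausts the full pipeline, then a separate re-run of the sub-pipeline on its own.

```python
from itertools import chain, islice


class CountingSource:
    """Toy stand-in for a pipeline that counts the documents it hands out."""

    def __init__(self, docs):
        self.docs = docs
        self.n_returned = 0

    def __iter__(self):
        for doc in self.docs:
            self.n_returned += 1
            yield doc


def explain_union_with_limit(base_docs, sub_docs, limit):
    # Pass 1: exhaust base -> union -> limit to gather execution stats.
    base = CountingSource(base_docs)
    sub_first_pass = CountingSource(sub_docs)
    n_total = sum(1 for _ in islice(chain(base, sub_first_pass), limit))

    # Pass 2: re-run the sub-pipeline on its own, with no knowledge of the
    # limit that followed the union stage in pass 1.
    sub_rerun = CountingSource(sub_docs)
    for _ in sub_rerun:
        pass

    return {
        "pipeline_nReturned": n_total,
        "sub_pulled_in_pass_1": sub_first_pass.n_returned,
        "sub_nReturned_reported": sub_rerun.n_returned,
    }


stats = explain_union_with_limit(base_docs=range(10), sub_docs=range(100), limit=5)
# stats == {"pipeline_nReturned": 5, "sub_pulled_in_pass_1": 0,
#           "sub_nReturned_reported": 100}
```

In this model the limit is satisfied entirely from the base source, so the sub-pipeline contributes nothing in the first pass, yet the re-run reports all 100 documents as returned.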

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            nicholas.zolnierz@mongodb.com Nicholas Zolnierz