Loading...

XML

Word

Printable

JSON

Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 4.4.0
Component/s: Aggregation Framework, Performance
Labels:
None

Assigned Teams:

Query Execution
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

During the initial implementation this was left out of scope, in part due to concerns of resource use for complex nested union pipelines. We are worried about a case where there is a tree of union pipelines like the following.

db.outer.aggregate([
{$unionWith: {
   coll: "x",
   pipeline: [
     {$unionWith: {
        coll: "y",
        pipeline: [...]
      }}
    ]
}}])

Now suppose each of the collections involved were sharded and distributed amongst all shards. In this example, you could imagine a naive parallel processing implementation that starts reading from "outer" at the same time it starts computing the sub-pipeline on "x". Then without further analysis it sends the whole sub-pipeline on "x" to all shards that have data for "x". Once that pipeline arrives on all shards, each shard starts processing the sub-pipeline on "y". Now imagine that innermost sub-pipeline contains more $unionWiths. If there are N shards, then in this naive implementation we have N cursors established (to read "x"), then for each of those cursors N more cursors established to read "y", and so forth. This can lead to an exponential fanout of the number of open cursors (or threads) based on the query itself.

To be clear, today's implementation avoids this problem by entirely executing one branch of the $unionWith before beginning the other. This is a poor workaround since it leaves a lot of performance on the floor.

Assignee:: [DO NOT USE] Backlog - Query Execution
Reporter:: Charlie Swanson
Participants:: [DO NOT USE] Backlog - Query Execution, Charlie Swanson
Votes:: 0 Vote for this issue
Watchers:: 5 Start watching this issue

Created:: May 12 2020 01:17:57 AM UTC
Updated:: May 08 2025 09:10:30 PM UTC

Details

Description

Attachments

Forms

Activity

People

Dates