-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: 4.4.0
-
Component/s: Aggregation Framework, Performance
-
Labels:None
-
Query Execution
During the initial implementation this was left out of scope, in part due to concerns of resource use for complex nested union pipelines. We are worried about a case where there is a tree of union pipelines like the following.
db.outer.aggregate([ {$unionWith: { coll: "x", pipeline: [ {$unionWith: { coll: "y", pipeline: [...] }} ] }}])
Now suppose each of the collections involved were sharded and distributed amongst all shards. In this example, you could imagine a naive parallel processing implementation that starts reading from "outer" at the same time it starts computing the sub-pipeline on "x". Then without further analysis it sends the whole sub-pipeline on "x" to all shards that have data for "x". Once that pipeline arrives on all shards, each shard starts processing the sub-pipeline on "y". Now imagine that innermost sub-pipeline contains more $unionWiths. If there are N shards, then in this naive implementation we have N cursors established (to read "x"), then for each of those cursors N more cursors established to read "y", and so forth. This can lead to an exponential fanout of the number of open cursors (or threads) based on the query itself.
To be clear, today's implementation avoids this problem by entirely executing one branch of the $unionWith before beginning the other. This is a poor workaround since it leaves a lot of performance on the floor.