- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Labels: None
- Query Execution
Currently this is an antipattern:
{$unwind: "$a"}
{$match ...}
{$group: {_id: "$_id", ...}}
because $group is a blocking stage and can spill to disk if the data is large enough. We recommend something like this instead:
{$set: {a: {$filter ...}}}
This performs better because it operates on one document at a time.
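To make the comparison concrete, here is a minimal sketch of the two pipeline shapes written as plain JavaScript values, the way they might be passed to `aggregate()`. The ticket elides the predicate and the accumulators, so the condition (elements greater than 5) and the `$push` accumulator below are illustrative assumptions only.

```javascript
// Hypothetical predicate for illustration: keep elements of `a` greater than 5.

// Antipattern shape: unwind the array, filter the elements, regroup by _id.
const unwindMatchGroup = [
  { $unwind: "$a" },
  { $match: { a: { $gt: 5 } } },
  { $group: { _id: "$_id", a: { $push: "$a" } } },
];

// Recommended shape: filter the array in place, one document at a time.
const setFilter = [
  {
    $set: {
      a: { $filter: { input: "$a", as: "x", cond: { $gt: ["$$x", 5] } } },
    },
  },
];
```

Note that the two shapes are not exactly interchangeable as written: the first drops documents with no matching elements and keeps only the fields named in `$group`, while the second keeps every document (possibly with an empty `a`) and all of its other fields.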
But the first version is nicer in some ways:
- You can easily view intermediate results:
  - by commenting out stages,
  - or in Compass.
- You might not need to learn two versions of every operator ($match/$filter, $addFields/$map, $group/$reduce).
We could make it perform better by doing a streaming group (in this narrow case).
- Streaming $group is valid when documents are clustered by the group key.
- Documents in a collection are clustered by _id (because we have a unique, non-multikey index on _id).
- $unwind preserves this (if it unwinds one document at a time).
- $match preserves this.
- $project/$set can preserve this, depending on which paths they write.
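The streaming idea can be sketched outside the server as a plain JavaScript generator. This is only an illustration of the technique under the clustering assumption above, not the server's implementation; `streamingGroup`, `keyOf`, `init`, and `accumulate` are hypothetical names, and keys are compared with `!==`, which is adequate only for primitive keys.

```javascript
// Streaming group: assumes the input is clustered by key, i.e. all
// documents with the same key are adjacent. Only one group is held in
// memory at a time, so nothing blocks or spills.
function* streamingGroup(docs, keyOf, init, accumulate) {
  let hasGroup = false;
  let currentKey;
  let state;
  for (const doc of docs) {
    const key = keyOf(doc);
    if (hasGroup && key !== currentKey) {
      // Key changed: the previous group is complete, so emit it now.
      yield { _id: currentKey, ...state };
      hasGroup = false;
    }
    if (!hasGroup) {
      currentKey = key;
      state = init();
      hasGroup = true;
    }
    accumulate(state, doc);
  }
  // Emit the final group, if the input was non-empty.
  if (hasGroup) yield { _id: currentKey, ...state };
}

// Example input: unwound and matched documents, still clustered by _id.
const unwound = [
  { _id: 1, a: 7 },
  { _id: 1, a: 9 },
  { _id: 2, a: 6 },
];

const groups = [
  ...streamingGroup(
    unwound,
    (doc) => doc._id,
    () => ({ a: [] }),
    (state, doc) => state.a.push(doc.a)
  ),
];
// groups is [{ _id: 1, a: [7, 9] }, { _id: 2, a: [6] }]
```

If the input were not clustered, the same key would be emitted more than once, which is why the preservation arguments in the list above matter.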