Details
-
Bug
-
Resolution: Done
-
Minor - P4
-
v1.0
-
None
Description
The current documentation does not accurately describe the aggregation framework's behavior. It currently states:
When operating on a sharded collection, the aggregation pipeline is split into two parts. The aggregation framework pushes all of the operators up to and including the first $group or $sort to each shard. Then, a second pipeline on the mongos runs. This pipeline consists of the first $group or $sort and any remaining pipeline operators, and runs on the results received from the shards.
The mongos pipeline merges $sort operations from the shards. The $group operator brings in any “sub-totals” from the shards and combines them: in some cases these may be structures. For example, the $avg expression maintains a total and count for each shard; mongos combines these values and then divides.
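To illustrate the $avg merge described in the quoted text, here is a minimal Python sketch. It is not MongoDB code: the function names (`shard_partial_avg`, `merge_avg`) and the dictionary layout are hypothetical, chosen only to show why each shard has to report a total and a count rather than a finished average.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not MongoDB internals.

def shard_partial_avg(values):
    """Sub-total a single shard would report: running total and count."""
    return {"total": sum(values), "count": len(values)}


def merge_avg(partials):
    """Combine per-shard sub-totals, then divide once at the end."""
    total = sum(p["total"] for p in partials)
    count = sum(p["count"] for p in partials)
    return total / count if count else None


# Two hypothetical shards holding different numbers of documents.
shard_a = shard_partial_avg([1, 2, 3])
shard_b = shard_partial_avg([10])

merged = merge_avg([shard_a, shard_b])
```

With these inputs the merged average is 16 / 4 = 4.0, whereas averaging the two per-shard averages (2 and 10) would give 6, which is why the sub-totals must stay structured until the final divide.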
In reality (from redbeard0531):
$sort only happens on the mongos, not on the shards. This will change in 2.4 for a $sort followed by a $limit.
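The corrected split can be sketched roughly as follows. This is a simplified Python model based only on the description in this ticket, not on the server source: `split_pipeline` is a hypothetical helper, stages are modeled as single-key dicts, and the handling of a pipeline with no $group or $sort is an assumption.

```python
# Simplified model of the shard/mongos split as described in this ticket.
# split_pipeline is a hypothetical helper, not a MongoDB API.

def split_pipeline(pipeline):
    """Return (shard_part, mongos_part) for a list of single-key stage dicts."""
    for i, stage in enumerate(pipeline):
        op = next(iter(stage))
        if op == "$group":
            # Shards run the $group to produce sub-totals;
            # mongos runs it again to combine them.
            return pipeline[:i + 1], pipeline[i:]
        if op == "$sort":
            # Per the correction above: $sort runs only on mongos.
            return pipeline[:i], pipeline[i:]
    # Assumption: with no $group or $sort, everything goes to the shards.
    return list(pipeline), []


sort_pipeline = [{"$match": {"x": 1}}, {"$sort": {"x": 1}}, {"$limit": 5}]
group_pipeline = [{"$match": {"x": 1}}, {"$group": {"_id": "$y"}}, {"$sort": {"_id": 1}}]

sort_split = split_pipeline(sort_pipeline)
group_split = split_pipeline(group_pipeline)
```

In this model the $sort pipeline sends only the $match to the shards, while the $group pipeline sends both $match and $group to the shards and repeats the $group on mongos.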