Suppose you have a pipeline consisting of a single $match stage which is delivered to mongos by the client. Also suppose that mongos will target multiple shards in order to find the matching data. Currently, the aggregation system will establish a shard as the merging node. Instead, mongos could perform the merging (it already knows how to do this for .find() operations).
This would improve the latency of the aggregation operation, since it would eliminate a network hop. Data currently has to travel from a targeted shard, to the merging shard, to mongos, and back to the client. If mongos were responsible for merging, data would only have to travel from a targeted shard, to mongos, to the client.
Mongos can also merge sorted streams coming from the shards for .find() operations, so this could also be extended to $match+$sort pipelines.