As part of the work to improve the $out stage, we would like to work towards a model where multiple shards can perform the "merger" work for the pipeline in parallel. Specifically, when we are outputting to a sharded collection from a sharded collection, we would like each shard to partition the output data to each other shard based on the output shard key. This is in contrast to our current sharded execution strategy, which would gather all the results from each shard to one "merger" process, which would then have to scatter the writes across the cluster.
We believe the "exchange" operator/model would be useful here, and that the AsyncResultsMerger fits pretty well as the consumer part of the exchange. We also believe that the TeeBuffer class is pretty close to acting as a producer, but needs to be remodeled and adapted to serve requests from multiple threads.