- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- Sharding 2020-09-21, Sharding 2020-10-05, Sharding 2020-10-19, Sharding 2020-11-16
The collection bulk loader for resharding should take the aggregation pipeline from SERVER-49785, run it against a particular donor, and insert the returned documents into the new sharded collection on itself (a recipient).
- Serialize the aggregation pipeline from createCloningPipelineForResharding() using Pipeline::serializeToBson() and use the serialized form to create an AggregationRequest.
- Add the necessary read concern and hint to the AggregationRequest.
- Run the aggregate command on the remote donor. It is unclear whether this should use ClusterAggregate::runAggregate() or some other mechanism. sharded_agg_helpers::attachCursorToPipeline() has come up as a way to have the merging happen on the local node (the recipient in the case of this ticket) for computing the new initial split; see also the changes from 262e5a9 as part of SERVER-49525.
- Use Collection::insertDocuments() to insert the documents into the new sharded collection.
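The overall flow described above (run a pipeline against a donor, pull the results in batches, insert each batch locally on the recipient) can be sketched roughly as follows. This is only an illustration and does not use the server's actual classes: `Doc`, `FetchBatchFn`, `RecipientCollection`, `makeDonorCursor()` and `cloneFromDonor()` are all hypothetical stand-ins, the "pipeline" is reduced to a simple predicate, and read concern, hint, and the remote cursor mechanism are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BSON document: an _id plus an opaque payload.
struct Doc {
    int id;
    std::string payload;
};

// Hypothetical stand-in for a cursor over the donor's aggregation results.
// Each call returns up to maxDocs documents; an empty batch means exhausted.
using FetchBatchFn = std::function<std::vector<Doc>(std::size_t maxDocs)>;

// Hypothetical stand-in for the new sharded collection on the recipient.
class RecipientCollection {
public:
    void insertDocuments(const std::vector<Doc>& batch) {
        for (const auto& d : batch) {
            docs_.emplace(d.id, d.payload);
        }
    }
    std::size_t size() const { return docs_.size(); }
    bool contains(int id) const { return docs_.count(id) > 0; }

private:
    std::map<int, std::string> docs_;
};

// Simulates "running the pipeline on the donor": the predicate plays the role
// of the cloning pipeline, and the returned function plays the role of the
// cursor that hands results back in batches.
FetchBatchFn makeDonorCursor(const std::vector<Doc>& donorDocs,
                             const std::function<bool(const Doc&)>& pipelineFilter) {
    std::vector<Doc> results;
    for (const auto& d : donorDocs) {
        if (pipelineFilter(d)) {
            results.push_back(d);
        }
    }
    return [results = std::move(results), pos = std::size_t{0}](std::size_t maxDocs) mutable {
        std::vector<Doc> batch;
        while (pos < results.size() && batch.size() < maxDocs) {
            batch.push_back(results[pos++]);
        }
        return batch;
    };
}

// Drains the donor cursor batch by batch and inserts every batch into the
// recipient's collection. Returns the number of documents fetched.
std::size_t cloneFromDonor(const FetchBatchFn& fetchBatch,
                           RecipientCollection& recipient,
                           std::size_t batchSize) {
    assert(batchSize > 0);
    std::size_t total = 0;
    for (;;) {
        std::vector<Doc> batch = fetchBatch(batchSize);
        if (batch.empty()) {
            break;
        }
        recipient.insertDocuments(batch);
        total += batch.size();
    }
    return total;
}
```

In this sketch the caller would build a cursor with `makeDonorCursor()`, then pass it to `cloneFromDonor()` together with the recipient collection and a batch size. Inserting per batch rather than per document mirrors the bulk-loading intent of the ticket, though the real batching policy is not specified here.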
- has to be done after
  - SERVER-49785 Write and test aggregation pipeline for collection bulk loader for resharding (Closed)
- is depended on by
  - SERVER-49293 Test collection bulk loader for resharding resuming by largest _id inserted (Closed)
- related to
  - SERVER-52690 Switch ReshardingCollectionCloner to run on a separate TaskExecutor (Closed)