[SERVER-49785] Write and test aggregation pipeline for collection bulk loader for resharding Created: 21/Jul/20  Updated: 29/Oct/23  Resolved: 17/Sep/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.8.0

Type: Task Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: PM-234-M2, PM-234-T-data-clone
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Gantt Dependency
has to be done before SERVER-49787 Create collection bulk loader for res... Closed
has to be done before SERVER-51005 Add support for hashed shard keys to ... Closed
has to be done after SERVER-49289 Support specifying a collection by it... Closed
has to be done after SERVER-49290 Support running $lookup locally on sh... Closed
has to be done after SERVER-49214 Add $toHashedIndexKey expression Closed
Problem/Incident
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2020-08-24, Sharding 2020-09-21
Participants:
Linked BF Score: 0

 Description   

The goal of this ticket is to create a function that builds the aggregation pipeline so that it can easily be sent to a remote (donor) shard and unit-tested with DocumentSourceMock. Prefer building the stages with the DocumentSourceXX::create() functions (or DocumentSourceXX::parseFromBSON() when the former isn't available or is too tedious) rather than assembling the pipeline with string concatenation. DocumentSources can be added to the Pipeline::SourceContainer conditionally, for example so that a stage is present only when resuming on a new cursor.

std::unique_ptr<Pipeline, PipelineDeleter> createCloningPipelineForResharding(
    ShardKeyPattern newShardKeyPattern,
    NamespaceString sourceNss,  /* nss of the collection being resharded */
    BSONObj startAfter,  /* expected to be an object of the form {_id: <any>} or isEmpty() */
    ShardId recipientShard
);

Some of these parameters are probably better taken by const-ref, because their contents can only be copied into the Pipeline anyway.
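As a rough illustration of the conditional stage construction described above, here is a minimal sketch that does not use the real server types. FakeBson, FakeStage, FakeSourceContainer and buildCloningStagesSketch() are hypothetical stand-ins for BSONObj, DocumentSource, Pipeline::SourceContainer and createCloningPipelineForResharding(), and the stages shown are assumptions chosen for illustration, not necessarily the stages used by the committed pipeline. Parameters are taken by const-ref, as suggested.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-ins, NOT the real MongoDB types.
using FakeBson = std::map<std::string, std::string>;  // stand-in for BSONObj

struct FakeStage {     // stand-in for a DocumentSource
    std::string name;  // stage name, e.g. "$match"
    std::string spec;  // human-readable description of the stage argument
};

using FakeSourceContainer = std::list<FakeStage>;  // stand-in for Pipeline::SourceContainer

// Builds the list of stages for the cloning pipeline (illustrative only).
// All parameters are taken by const-ref because their contents are only copied.
FakeSourceContainer buildCloningStagesSketch(const std::string& newShardKeyField,
                                             const std::string& sourceNss,
                                             const FakeBson& startAfter,
                                             const std::string& recipientShard) {
    // startAfter is expected to be of the form {_id: <any>} or empty.
    assert(startAfter.empty() ||
           (startAfter.size() == 1 && startAfter.count("_id") == 1));

    FakeSourceContainer stages;

    // Added only when resuming: skip documents that were already cloned.
    if (!startAfter.empty()) {
        stages.push_back({"$match", "_id > " + startAfter.at("_id")});
    }

    // Assumed stages for illustration: a deterministic order by _id, then a
    // filter keeping documents that the recipient owns under the new shard key.
    stages.push_back({"$sort", "_id ascending"});
    stages.push_back({"$match",
                      "owner of " + newShardKeyField + " in " + sourceNss +
                          " is " + recipientShard});
    return stages;
}
```

Calling buildCloningStagesSketch("x", "db.coll", {}, "shard0") yields two stages, while passing {{"_id", "42"}} as startAfter yields three, with the resume $match first. Keeping the builder a pure function of its arguments is what makes this kind of stage-by-stage check straightforward in a unit test.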



 Comments   
Comment by Githook User [ 17/Sep/20 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha.tyulenev@mongodb.com'}

Message: SERVER-49785 aggragation pipeline for collection bulk loader for resharding
Branch: master
https://github.com/mongodb/mongo/commit/7c83537305f9d0c626fefa82c6acbd9e5fd95222

Comment by Githook User [ 16/Sep/20 ]

Author:

{'name': 'Max Hirschhorn', 'email': 'max.hirschhorn@mongodb.com', 'username': 'visemet'}

Message: Revert "SERVER-49785 aggragation pipeline for collection bulk loader for resharding"

This reverts commit e7b7855c41c33ecc8383e307d28b6ee17d92a906.
Branch: master
https://github.com/mongodb/mongo/commit/547823d400ce7ada24e3dde37f07e59845260ffa

Comment by Githook User [ 16/Sep/20 ]

Author:

{'name': 'Misha Tyulenev', 'email': 'misha.tyulenev@mongodb.com'}

Message: SERVER-49785 aggragation pipeline for collection bulk loader for resharding
Branch: master
https://github.com/mongodb/mongo/commit/e7b7855c41c33ecc8383e307d28b6ee17d92a906
