[SERVER-33323] Refactor $mergeCursors stage to allow it to be used to merge cursors on mongos Created: 14/Feb/18  Updated: 29/Oct/23  Resolved: 20/Aug/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 4.1.3

Type: Improvement Priority: Major - P3
Reporter: Charlie Swanson Assignee: Charlie Swanson
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-24978 Second batches in aggregation framewo... Closed
is depended on by SERVER-35581 Don't mandate the use of "distanceFie... Backlog
Problem/Incident
Related
related to SERVER-36781 Coverity analysis defect 104978: Unin... Closed
related to SERVER-34009 Remove support for 3.6 $mergeCursors ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 2018-03-12, Query 2018-03-26, Query 2018-07-02, Query 2018-07-16, Query 2018-07-30, Query 2018-08-13, Query 2018-08-27
Participants:
Linked BF Score: 75

 Description   

This can be done as follow-on work after SERVER-24978, and will remove a layer of indirection between the pipeline executing on mongos and the AsyncResultsMerger. Today, the pipeline is attached via a DocumentSourceRouterAdapter, which draws results from a RouterStageMerge, which in turn draws results from an AsyncResultsMerger.

Once the $mergeCursors stage is using the AsyncResultsMerger, we can collapse that picture and eliminate the DocumentSourceRouterAdapter and RouterStageMerge.
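
The layering described above can be pictured with a toy model. This is only a sketch under stated assumptions: `ResultSource`, `ToyMerger`, `ForwardingLayer`, `ToyMergeCursors`, `buildBefore`, `buildAfter`, and `drain` are hypothetical names invented for illustration and do not mirror the server's real class interfaces. It shows only that removing the pass-through layers shortens the chain without changing the results.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pull-based interface; not the server's actual class hierarchy.
struct ResultSource {
    virtual ~ResultSource() = default;
    // Writes the next result into *out and returns true, or returns false when exhausted.
    virtual bool next(std::string* out) = 0;
    // Number of objects a result passes through, counting this one.
    virtual int layers() const = 0;
};

// Toy stand-in for the AsyncResultsMerger: hands out already-merged results.
class ToyMerger : public ResultSource {
public:
    explicit ToyMerger(std::vector<std::string> docs) : _docs(std::move(docs)) {}
    bool next(std::string* out) override {
        if (_pos == _docs.size()) return false;
        *out = _docs[_pos++];
        return true;
    }
    int layers() const override { return 1; }

private:
    std::vector<std::string> _docs;
    std::size_t _pos = 0;
};

// Toy stand-in for a pass-through layer (the roles played by RouterStageMerge
// and DocumentSourceRouterAdapter in the description).
class ForwardingLayer : public ResultSource {
public:
    explicit ForwardingLayer(std::unique_ptr<ResultSource> child) : _child(std::move(child)) {
        assert(_child);
    }
    bool next(std::string* out) override { return _child->next(out); }
    int layers() const override { return 1 + _child->layers(); }

private:
    std::unique_ptr<ResultSource> _child;
};

// Toy stand-in for a $mergeCursors stage that owns the merger directly.
class ToyMergeCursors : public ResultSource {
public:
    explicit ToyMergeCursors(std::vector<std::string> docs) : _merger(std::move(docs)) {}
    bool next(std::string* out) override { return _merger.next(out); }
    int layers() const override { return 1 + _merger.layers(); }

private:
    ToyMerger _merger;
};

// Before: adapter -> router stage -> merger.
std::unique_ptr<ResultSource> buildBefore(std::vector<std::string> docs) {
    std::unique_ptr<ResultSource> merger = std::make_unique<ToyMerger>(std::move(docs));
    std::unique_ptr<ResultSource> routerStage = std::make_unique<ForwardingLayer>(std::move(merger));
    return std::make_unique<ForwardingLayer>(std::move(routerStage));
}

// After: merge stage -> merger.
std::unique_ptr<ResultSource> buildAfter(std::vector<std::string> docs) {
    return std::make_unique<ToyMergeCursors>(std::move(docs));
}

// Pulls every remaining result out of a source.
std::vector<std::string> drain(ResultSource& source) {
    std::vector<std::string> out;
    std::string doc;
    while (source.next(&doc)) out.push_back(doc);
    return out;
}
```

In this model, `buildBefore(...)` yields a chain of three objects and `buildAfter(...)` a chain of two, while `drain` returns the same results from either.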



 Comments   
Comment by Githook User [ 20/Aug/18 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-33323 Refactor cluster_aggregate logic

Attempts to make it more obvious how commands for the shards are
generated while also removing some methods from the Pipeline API.
Branch: master
https://github.com/mongodb/mongo/commit/0ef6e36f8c1d79801bca13b2adcf4a908a2bb720

Comment by Githook User [ 20/Aug/18 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-33323 New cluster_aggregate library

This new library contains both cluster_aggregate.cpp and
cluster_aggregation_planner.cpp. Both of these files are moved to the
src/mongo/s/query directory where the new library lives.
Branch: master
https://github.com/mongodb/mongo/commit/bb9f6662e1f98b633df4d22082b5810d786fb620

Comment by Githook User [ 15/Aug/18 ]

Author:

{'name': 'Siyuan Zhou', 'email': 'siyuan.zhou@mongodb.com', 'username': 'visualzhou'}

Message: SERVER-33323 Fix lint.
Branch: master
https://github.com/mongodb/mongo/commit/398ad66cb2646a0ccdd793f2efb4c576749a208a

Comment by Githook User [ 15/Aug/18 ]

Author:

{'username': 'cswanson310', 'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson'}

Message: SERVER-33323 Refactor agg cursor merging on mongos

This commit makes it so that aggregations will always use a
$mergeCursors as a wrapper around an AsyncResultsMerger, which is new
behavior for mongos. As part of this refactor, we can delete the concept
of a 'merging presorted' $sort stage (which is now handled by the
AsyncResultsMerger) and delete the DocumentSourceRouterAdapter stage
which talked to a RouterStageMerge, instead directly using a
$mergeCursors stage.
Branch: master
https://github.com/mongodb/mongo/commit/ee06e6cbe5a75775f76836449558be2f6a98ddfd
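
For readers unfamiliar with what merging presorted streams involves, the general technique is a k-way merge over inputs that are each already sorted. The sketch below is an illustration of that technique only, assuming integer sort keys and fully buffered per-shard results. `mergePresorted` is a hypothetical name, and this is not the AsyncResultsMerger's actual implementation, which works asynchronously over remote cursors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Merges per-shard result vectors that are each already sorted ascending.
// Hypothetical helper for illustration; keys are plain ints for simplicity.
std::vector<int> mergePresorted(const std::vector<std::vector<int>>& shardResults) {
    // Heap entries are (value, shard index, offset within that shard's results).
    using Entry = std::tuple<int, std::size_t, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

    // Seed the heap with the first result from every non-empty shard.
    for (std::size_t shard = 0; shard < shardResults.size(); ++shard) {
        const std::vector<int>& results = shardResults[shard];
        // Precondition: each input must already be sorted.
        assert(std::is_sorted(results.begin(), results.end()));
        if (!results.empty()) {
            heap.emplace(results[0], shard, std::size_t{0});
        }
    }

    std::vector<int> merged;
    while (!heap.empty()) {
        const Entry smallest = heap.top();
        heap.pop();
        merged.push_back(std::get<0>(smallest));

        // Refill from the shard that just produced the smallest value.
        const std::size_t shard = std::get<1>(smallest);
        const std::size_t nextOffset = std::get<2>(smallest) + 1;
        if (nextOffset < shardResults[shard].size()) {
            heap.emplace(shardResults[shard][nextOffset], shard, nextOffset);
        }
    }
    return merged;
}
```

Because each input is already ordered, only the current head of each stream needs to be compared, so no separate sort over the combined results is required.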

Comment by Charlie Swanson [ 06/Apr/18 ]

Recent commits should have fixed the failures that led to the revert. Putting this back in Needs Triage.

Comment by Githook User [ 06/Apr/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33323 Fix pushBack, remove const from size_t, and fix s390x

Check if the pipeline is empty before setting the new stage to point to
the last one in Pipeline::pushBack().

Remove unnecessary const qualifier from std::size_t return types.

Work around a compiler bug on s390x by allowing a CursorResponse to be
copied.
Branch: master
https://github.com/mongodb/mongo/commit/74aa83de852e3f3740eccb065a7d43d7708e138a
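
The pushBack fix in the commit message above follows a common pattern: check that a container is non-empty before reading its last element. The sketch below illustrates that pattern with hypothetical `ToyStage` and `ToyPipeline` types. It is not the actual `Pipeline::pushBack()` code, and the `source` pointer is an assumed stand-in for however a stage refers to its predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stage type: each stage may point at the stage it reads from.
struct ToyStage {
    explicit ToyStage(std::string stageName) : name(std::move(stageName)) {}
    std::string name;
    ToyStage* source = nullptr;  // Null for the first stage in the pipeline.
};

// Hypothetical pipeline type used only to illustrate the empty check.
class ToyPipeline {
public:
    void pushBack(std::shared_ptr<ToyStage> stage) {
        // Calling back() on an empty container is undefined behavior,
        // so only link to the previous stage when one exists.
        if (!_stages.empty()) {
            stage->source = _stages.back().get();
        }
        _stages.push_back(std::move(stage));
    }

    // Returns by value, so a top-level const on the return type would add nothing.
    std::size_t size() const { return _stages.size(); }

    const ToyStage& back() const {
        assert(!_stages.empty());
        return *_stages.back();
    }

private:
    std::list<std::shared_ptr<ToyStage>> _stages;
};
```

Pushing a first stage leaves its `source` null, and pushing a second stage links it to the first.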

Comment by Githook User [ 06/Apr/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33323 Use the IDL to serialize the ARM
Branch: master
https://github.com/mongodb/mongo/commit/41f13212be110fc2360804fc04982273e43910f4

Comment by William Schultz (Inactive) [ 05/Apr/18 ]

Author:

{'email': 'william.schultz@mongodb.com', 'name': 'William Schultz', 'username': 'will62794'}

Message: Fix merge conflict errors

Branch: master
https://github.com/mongodb/mongo/commit/a5dacf7092f51055dd774a1911a48815bb9a1e0e

Comment by Githook User [ 05/Apr/18 ]

Author:

{'email': 'william.schultz@mongodb.com', 'name': 'William Schultz', 'username': 'will62794'}

Message: Revert "SERVER-33323 Use the IDL to serialize the ARM"

This reverts commit 7d09f278a2acf9791b36927d6af1d30347d60391.
Branch: master
https://github.com/mongodb/mongo/commit/e88c6d85036607ddf86105234917b4adfffbd612

Comment by Charlie Swanson [ 04/Apr/18 ]

We've committed part one of this refactor, which has the benefit of allowing us to delete some old code as soon as we stop supporting 3.6 mongos. The work outlined in this ticket is not done though. I'm throwing this back into Needs Triage and shifting focus elsewhere for now.

Comment by Githook User [ 04/Apr/18 ]

Author:

{'email': 'charlie.swanson@mongodb.com', 'name': 'Charlie Swanson', 'username': 'cswanson310'}

Message: SERVER-33323 Use the IDL to serialize the ARM
Branch: master
https://github.com/mongodb/mongo/commit/7d09f278a2acf9791b36927d6af1d30347d60391

Comment by Githook User [ 04/Apr/18 ]

Author:

{'email': 'mark.benvenuto@mongodb.com', 'name': 'Mark Benvenuto', 'username': 'markbenvenuto'}

Message: SERVER-33323 Add basic.h to CPP files generated by IDL

Signed-off-by: Charlie Swanson <charlie.swanson@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/09253ad8f4187f4e7e4c453cc157362d751e0918

Comment by Charlie Swanson [ 06/Mar/18 ]

Code review for refactor: https://mongodbcr.appspot.com/193390001/

This patch just uses the IDL to serialize/deserialize the AsyncResultsMerger, and does not include all the changes described in this ticket. More to come later.

Comment by Charlie Swanson [ 14/Feb/18 ]

david.storch I tentatively assigned this to myself for next sprint. We can re-evaluate at the next sprint planning to see if that still makes sense, but hopefully it will. The alternative is leaving it on the backlog user and in the epic until SERVER-24978 is done.

Also, it's not technically required for the epic, so feel free to take it out if you'd like.

Generated at Thu Feb 08 04:33:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.