[SERVER-28627] replace MapReduceFinishCommand's use of ParallelSortClusteredCursor with establishCursors()/ARM
Created: 04/Apr/17  Updated: 06/Dec/22  Resolved: 23/Sep/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.5.5
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Esha Maharishi (Inactive)
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Participants:

 Description   

MapReduceFinishCommand both establishes cursors and merges them using ParallelSortClusteredCursor.

We should be able to replace the establishing cursors part with the new establishCursors() function, but I'm not sure what the best way to replace the merging functionality is.

My first thought is naturally to use the ARM, but currently the ARM is embedded in ClusterClientCursor, and ClusterClientCursor doesn't exist on shards.

david.storch, is there another way we might be able to merge the cursor streams on a shard, such as DocumentSourceMergeCursors? Does the Query team still have plans to make the ARM available on shards?



 Comments   
Comment by David Storch [ 05/Apr/17 ]

Sounds good, feel free to put something on my calendar for next week.

Comment by Esha Maharishi (Inactive) [ 05/Apr/17 ]

Hmm, okay.

The intent of getting rid of ParallelSortClusteredCursor is to remove all uses of old shard versioning logic (things that use ShardConnection/setShardVersion).

However, I don't think mapReduce actually needs the shard versioning logic, since the chunks shouldn't be moved.

Given that there isn't an obvious alternative, I'll think about whether we can rip the shard versioning logic out and convert ParallelSortClusteredCursor into a simple cursor-merging component for shards. I don't love this solution, though, since it means find, agg, and mapReduce will each have their own cursor-merging utilities.
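For illustration only, a cursor-merging component with no shard versioning logic could be as small as a k-way merge over per-shard result streams. The sketch below is not MongoDB code: `Doc`, `mergeSortedStreams`, and the in-memory vectors standing in for cursors are all hypothetical, and it assumes each per-shard stream is already sorted by key.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a document: (sort key, payload).
using Doc = std::pair<int, std::string>;

// Merges per-shard streams, each assumed to be sorted by key, into a single
// sorted stream. Ties on the key are broken by stream index, so the output
// order is deterministic.
std::vector<Doc> mergeSortedStreams(const std::vector<std::vector<Doc>>& streams) {
    // The current front element of one stream.
    struct Head {
        int key;
        std::size_t stream;
        std::size_t pos;
    };
    // std::priority_queue is a max-heap, so invert the comparison to pop the
    // smallest key first.
    auto greater = [](const Head& a, const Head& b) {
        if (a.key != b.key) return a.key > b.key;
        return a.stream > b.stream;
    };
    std::priority_queue<Head, std::vector<Head>, decltype(greater)> heap(greater);

    // Seed the heap with the first element of every non-empty stream.
    for (std::size_t s = 0; s < streams.size(); ++s) {
        if (!streams[s].empty()) heap.push({streams[s][0].first, s, 0});
    }

    std::vector<Doc> merged;
    while (!heap.empty()) {
        Head h = heap.top();
        heap.pop();
        merged.push_back(streams[h.stream][h.pos]);
        // Advance the stream the popped element came from.
        std::size_t next = h.pos + 1;
        if (next < streams[h.stream].size()) {
            heap.push({streams[h.stream][next].first, h.stream, next});
        }
    }
    return merged;
}
```

A real component would pull batches from remote cursors rather than from vectors, and would need to handle errors and cursor cleanup, none of which this sketch attempts.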

david.storch, let's chat early next week about a final decision?

Comment by David Storch [ 05/Apr/17 ]

esha.maharishi, as far as I know there are three cursor merging implementations:

  • ParallelSortClusteredCursor
  • AsyncResultsMerger
  • DocumentSourceMergeCursors

The query team does not currently have plans to make the ARM available on shards or to abstract the ARM in such a way that makes it useful outside the mongos find/aggregate paths. DocumentSourceMergeCursors is used on mongod, but it probably isn't what you want either, since it's pretty specific to the aggregation subsystem.

What's the goal of getting rid of ParallelSortClusteredCursor in the mapReduce sharded finish logic? Is it mainly just so you can delete ParallelSortClusteredCursor, or was there something related to killing off the old shard versioning logic as well?
