[SERVER-28870] make ClusterClientCursorParams::RemoteCursor store the ShardId
Created: 19/Apr/17  Updated: 30/Oct/23  Resolved: 24/Apr/17

Status: Closed
Project: Core Server
Component/s: Querying, Sharding
Affects Version/s: 3.5.6
Fix Version/s: 3.5.7

Type: Task
Priority: Major - P3
Reporter: Esha Maharishi (Inactive)
Assignee: Esha Maharishi (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Sprint: Sharding 2017-05-08
Participants:

 Description   

This is prep work for replacing ClusterAggregate's use of Strategy::commandOp() with establishCursors().

The ShardId is needed because aggregation may need to know which shard a cursor resides on, in order to select a merging shard from the shards involved in the aggregation.
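
As a rough illustration of why the ShardId matters, the sketch below pairs each remote cursor with the shard it lives on and then picks a merging shard from those cursors. Everything here is an assumption for illustration: ShardId and CursorId are simplified stand-ins, the RemoteCursor fields are hypothetical rather than the actual ClusterClientCursorParams::RemoteCursor definition, and pickMergingShard() with its "first cursor wins" policy is a made-up helper, not the server's selection logic.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the server's types (assumptions, not the real definitions).
using ShardId = std::string;
using CursorId = long long;

// Hypothetical shape of a remote cursor once it records its shard.
struct RemoteCursor {
    ShardId shardId;    // shard on which the cursor was established
    CursorId cursorId;  // id of the cursor on that shard
};

// Hypothetical helper: choose a merging shard from the shards that already
// hold a cursor. Returns an empty ShardId when there are no cursors.
// The "first cursor" policy is illustrative only.
ShardId pickMergingShard(const std::vector<RemoteCursor>& remotes) {
    if (remotes.empty()) {
        return ShardId();
    }
    return remotes.front().shardId;
}
```

Without the shardId field, a caller holding only cursor ids would have no way to restrict the choice to shards already involved in the aggregation.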

Note for david.storch and charlie.swanson: it seems plausible to choose any shard to do the merging, not just one that had a cursor established on it. Is it actually necessary to choose one that had a cursor established on it?



 Comments   
Comment by Githook User [ 24/Apr/17 ]

Author: Esha Maharishi (EshaMaharishi) <esha.maharishi@mongodb.com>

Message: SERVER-28870 make ClusterClientCursorParams::RemoteCursor store the ShardId
Branch: master
https://github.com/mongodb/mongo/commit/fb19be647818a3625589d51cf471ab4c04e3362c

Comment by David Storch [ 20/Apr/17 ]

Oh, right. I still think we should not change the behavior to potentially select a merging shard which isn't otherwise involved in the execution of the query.

Comment by Tess Avitabile (Inactive) [ 20/Apr/17 ]

I do not believe it's necessary for collation to select a shard amongst those involved in merging. We were concerned that if the merging shard did not have the collection metadata, then it would not have the collection default collation. However, we attach the collection default collation to the merge pipeline before sending it to the merging shard.

Comment by Esha Maharishi (Inactive) [ 19/Apr/17 ]

david.storch, sounds good. I'll go ahead and put this patch in.

Comment by David Storch [ 19/Apr/17 ]

Yes, it is necessary to select a shard amongst those involved in merging. There is a specific reason related to collation which I can't recall right now but tess.avitabile might remember. But, more broadly, I think we need to do this for performance reasons. If two shards are targeted, but the data is merged on a third shard, this will involve more bytes over the network. If instead one of the shards contributing data also acted as the merger, that shard's data would not have to travel over the network prior to merging. Furthermore, I think it would go against user expectations if an aggregation were to put load on a shard that was not targeted.
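
The network argument above can be made concrete with a small sketch. It assumes each targeted shard produces a known number of result bytes and counts only shard-to-merger traffic. The function name bytesSentToMerger() and the byte counts in the usage note are hypothetical, chosen only to illustrate the comparison.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: total bytes that must cross the network to reach the
// merging shard. A shard's own results stay local if it is the merger.
std::uint64_t bytesSentToMerger(
    const std::map<std::string, std::uint64_t>& resultBytesByShard,
    const std::string& mergingShard) {
    std::uint64_t total = 0;
    for (std::map<std::string, std::uint64_t>::const_iterator it =
             resultBytesByShard.begin();
         it != resultBytesByShard.end(); ++it) {
        if (it->first != mergingShard) {
            total += it->second;  // this shard's results travel to the merger
        }
    }
    return total;
}
```

With made-up sizes of 100 bytes on shardA and 200 bytes on shardB, merging on an untargeted shardC moves all 300 bytes, whereas merging on shardB moves only the 100 bytes from shardA.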

Generated at Thu Feb 08 04:19:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.