[SERVER-30480] Update aggregation explain format to provide details of merge location
Created: 02/Aug/17  Updated: 30/Oct/23  Resolved: 21/Aug/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.5.12

Type: Improvement
Priority: Major - P3
Reporter: Bernard Gorman
Assignee: Bernard Gorman
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-22760 Sharded aggregation pipelines which i... Closed
Backwards Compatibility: Fully Compatible
Sprint: Query 2017-08-21, Query 2017-09-11
Participants:

 Description   

Currently, when a sharded pipeline is explained, we expose a boolean field needsPrimaryShardMerger to indicate whether the merge part of the pipeline must be run on the database's primary shard. Given that SERVER-22760 will make it possible for mongoS to merge directly from the shards, this explain format should be updated to provide more detailed information.

Some possible approaches include:

  • A string field mergerType which simply takes the value of mongos or mongod
  • A more detailed merger field identifying the shard or mongos which is to perform the merge. Note that in the case of a shard merge, this information may not be available at explain time.


 Comments   
Comment by Githook User [ 21/Aug/17 ]

Author: Bernard Gorman (gormanb) <bernard.gorman@gmail.com>

Message: SERVER-30480 Update aggregation explain format to provide details of merge location
Branch: master
https://github.com/mongodb/mongo/commit/6ce6b59a2b0031b212c2bc8ba47ae61b2f81ac46

Comment by David Storch [ 15/Aug/17 ]

bernard.gorman asya I'm on board with the suggestion of reporting only mergerType, which takes exactly one value from the set {"primaryShard", "anyShard", "mongos"}.
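
For illustration only, a minimal sketch of how a client-side script might check the "exactly one value from the set" rule proposed above. The field name mergerType and the three values come from this comment; the helper name read_merger_type and the idea that the field sits at the top level of the explain output are assumptions, not the server's actual format.

```python
# The three values proposed in this comment.
ALLOWED_MERGER_TYPES = frozenset({"primaryShard", "anyShard", "mongos"})


def read_merger_type(explain_output: dict) -> str:
    """Return the mergerType value, or raise if it is missing or unexpected.

    Hypothetical helper: assumes mergerType is a top-level string field
    of the explain document, which this ticket does not confirm.
    """
    value = explain_output.get("mergerType")
    # Check the type first so an unhashable value cannot break the set lookup.
    if not isinstance(value, str) or value not in ALLOWED_MERGER_TYPES:
        raise ValueError(f"unexpected mergerType: {value!r}")
    return value
```

Rejecting unknown strings outright is a design choice for the sketch; a real consumer may prefer to tolerate new values if the set is extended later.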

Comment by Bernard Gorman [ 09/Aug/17 ]

Thanks asya! I'll wait to see what Dave thinks of this approach before opening a review.

Comment by Asya Kamsky [ 09/Aug/17 ]

I like that: primaryShard, anyShard or mongos. I think that will leave it open to adding "designatedShard" or "allowedShard" when/if we add configuration options to allow restricting "any" shard to a subset of shards...

Comment by Bernard Gorman [ 09/Aug/17 ]

"So three values are possible (so far): primary mongod, any mongod, or mongos, correct?"

asya: that's correct. At the moment, an explain following SERVER-22760 always produces two boolean fields, needsPrimaryShardMerger and mergeOnMongoS. Dave suggested replacing these with either mongod/mongos or the name of the merging host, but as noted above if it's not the primary shard then the only thing we can do is output a random shard ID, which isn't really useful.

I think the clearest approach would be a single mergeType or mergeLocation field which reflects the values of the new HostTypeRequirement enum:
primaryShard, anyShard, or mongos.
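
As a rough sketch of the idea, the two booleans could collapse into one value along these lines. This is Python for illustration, not the server's C++ code; the enum here only mirrors the three names listed above, and merge_location is a hypothetical helper. It also assumes the two flags are never both true.

```python
from enum import Enum


class HostTypeRequirement(Enum):
    # Mirrors the three values named in this comment; the real enum is
    # defined in the server source and may differ.
    PRIMARY_SHARD = "primaryShard"
    ANY_SHARD = "anyShard"
    MONGOS = "mongos"


def merge_location(needs_primary_shard_merger: bool, merge_on_mongos: bool) -> str:
    """Collapse the two explain booleans into a single string (hypothetical)."""
    if needs_primary_shard_merger and merge_on_mongos:
        # Assumption: the flags are mutually exclusive.
        raise ValueError("merge cannot require both the primary shard and mongoS")
    if needs_primary_shard_merger:
        return HostTypeRequirement.PRIMARY_SHARD.value
    if merge_on_mongos:
        return HostTypeRequirement.MONGOS.value
    # Neither flag set: any shard may perform the merge.
    return HostTypeRequirement.ANY_SHARD.value
```

For example, merge_location(False, False) yields "anyShard", which is the case where no specific merging host can be named at explain time.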

Comment by Asya Kamsky [ 08/Aug/17 ]

The options are

  • keep needsPrimaryShardMerger and, in cases where its value is false, add another field, mergerType?
  • replace the needsPrimaryShardMerger boolean with a single field indicating mergeLocation: primary, mongos, or mongod?

I don't think we can ever know which mongod it will be if it's not the primary. So three values are possible (so far): primary mongod, any mongod, or mongos, correct?
