[SERVER-4521] ability to choose initial split points for sharded collection Created: 19/Dec/11  Updated: 28/Oct/15  Resolved: 19/Dec/11

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: None
Fix Version/s: 2.1.0

Type: New Feature Priority: Major - P3
Reporter: Antoine Girbal Assignee: Antoine Girbal
Resolution: Done Votes: 0
Labels: rn
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

This is a proposal for a new "splitPoints" option for the "shardCollection" command.
The option is primarily needed for internal core database work, but it can also be convenient for users.
We sometimes tell people to pre-split and pre-distribute chunks with manual JS scripts.

Syntax:
{ shardCollection: "ns", key: { a: 1 }, splitPoints: [ "a", "b", "c" ] }

Use case:

  • for aggregation (MapReduce / aggregation framework), there is a need to create sharded output with known split points
  • for users, to avoid migrations when importing a lot of data

Implementation:
Implementation is trivial, since this command already handles splitting the initial chunks, which is a more complex operation.
The difference is that here the split points are provided and the chunks are assigned across all shards.
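
A minimal sketch of the idea, not the server's actual code: given the split points, derive the chunk ranges and spread them over the shards. The helper name planInitialChunks, the round-robin assignment, and the "$minKey" / "$maxKey" placeholder strings (standing in for the BSON MinKey and MaxKey values) are assumptions made for illustration.

```javascript
// Hypothetical helper, for illustration only.
// Assumes splitPoints is already sorted in shard-key order.
function planInitialChunks(keyField, splitPoints, shards) {
  if (shards.length === 0) {
    throw new Error("need at least one shard");
  }
  // Chunk boundaries: MinKey, every split point, MaxKey.
  // The strings below are placeholders for the BSON MinKey/MaxKey values.
  const bounds = ["$minKey", ...splitPoints, "$maxKey"];
  const chunks = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    chunks.push({
      min: { [keyField]: bounds[i] },      // inclusive lower bound
      max: { [keyField]: bounds[i + 1] },  // exclusive upper bound
      shard: shards[i % shards.length],    // round-robin (an assumption)
    });
  }
  return chunks;
}

const plannedChunks = planInitialChunks("a", ["a", "b", "c"], ["shard0000", "shard0001"]);
console.log(plannedChunks.length); // 4 chunks from 3 split points
```

With N split points this yields N + 1 chunks, so all of them can be created and placed in one pass rather than one command at a time.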

Alternative:
Issue a split and a moveChunk command for each chunk.
This requires a distributed lock on the namespace for each command and can cause major delays.
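
For comparison, the manual alternative can be sketched as a list of admin commands built from the same split points. The helper name manualPresplitCommands is hypothetical, and the sketch assumes that shards[0] is the primary shard, which initially holds every chunk; it only builds the command documents and does not run them.

```javascript
// Hypothetical helper, for illustration only: builds the "split" and
// "moveChunk" command documents a manual pre-split script would issue.
// Assumes shards[0] is the primary shard that initially owns all chunks.
function manualPresplitCommands(ns, keyField, splitPoints, shards) {
  const commands = [];
  splitPoints.forEach((point, i) => {
    const keyDoc = { [keyField]: point };
    // Split the chunk containing this key at exactly this key.
    commands.push({ split: ns, middle: keyDoc });
    // The new chunk starts at `point`; move it unless it should stay
    // on the primary shard, where it already lives.
    const target = shards[(i + 1) % shards.length];
    if (target !== shards[0]) {
      commands.push({ moveChunk: ns, find: keyDoc, to: target });
    }
  });
  return commands;
}

const presplitCommands = manualPresplitCommands(
  "test.coll", "a", ["a", "b", "c"], ["shard0000", "shard0001"]);
console.log(presplitCommands.length); // 3 splits + 2 moves = 5 commands
// In a mongo shell connected to mongos, each document could then be run
// with db.adminCommand(cmd).
```

Every entry in the list is a separate command, which is where the per-command locking cost described above comes from.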

Questions:

  • When picking shards to distribute chunks to, should it make sure there is at least one chunk on the primary shard?
  • Should this be for internal use only, or user facing?


 Comments   
Comment by Antoine Girbal [ 19/Dec/11 ]

This feature is not made available via the shardCollection command.
It is used by MR and tested in mrShardedOutput.js.

Comment by auto [ 19/Dec/11 ]

Author: agirbal (antoine@10gen.com)

Message: SERVER-4521: ability to choose initial split points for sharded collection
Branch: master
https://github.com/mongodb/mongo/commit/a60c9e3979c8ba9c0ab15e6a147bb56b18fdcf73

Generated at Thu Feb 08 03:06:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.