  1. Core Server
  2. SERVER-4521

ability to choose initial split points for sharded collection

    • Type: New Feature
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.1.0
    • Affects Version/s: None
    • Component/s: MapReduce, Sharding
    • Labels:

      Here is a proposal for a new "splitPoints" option for the "shardCollection" command.
      This option is needed primarily for internal core database use, but it can also be convenient for users.
      Today we sometimes tell people to presplit and predistribute chunks manually with JS scripts.

      Syntax:
      { shardCollection: "ns", key: { a: 1 }, splitPoints: [ "a", "b", "c" ] }
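
      As a rough illustration of the proposed syntax, the sketch below builds the command document and lists the chunk ranges that N split points would imply (N + 1 chunks). The helper names buildShardCollectionCmd and chunkRanges are hypothetical, the "MinKey" / "MaxKey" strings are placeholders for the real key-space bounds, and the splitPoints option is taken from this proposal rather than from any confirmed server behavior.

```javascript
// Hypothetical helper: builds the command document in the proposed syntax.
function buildShardCollectionCmd(ns, key, splitPoints) {
  return { shardCollection: ns, key: key, splitPoints: splitPoints };
}

// Hypothetical helper: N sorted split points imply N + 1 chunks.
// "MinKey" / "MaxKey" are placeholder strings standing in for the
// real lower and upper bounds of the key space.
function chunkRanges(splitPoints) {
  const bounds = ["MinKey", ...splitPoints, "MaxKey"];
  const ranges = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    ranges.push({ min: bounds[i], max: bounds[i + 1] });
  }
  return ranges;
}

const cmd = buildShardCollectionCmd("ns", { a: 1 }, ["a", "b", "c"]);
const ranges = chunkRanges(cmd.splitPoints);
```

      With the three split points from the syntax example, this yields four ranges, the first starting at the lower bound and the last ending at the upper bound.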

      Use case:

      • For aggregation (MR / aggregation), there is a need to create sharded output with known split points.
      • For users, to avoid migrations when importing a lot of data.

      Implementation:
      Implementation is trivial, since this command already deals with splitting the initial chunks, which is a more complex operation.
      The difference is that here the split points are provided by the caller, and the chunks are assigned across all shards.
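
      One way the chunk assignment could work is sketched below. Round-robin placement is an assumption made for illustration; the proposal only says that chunks are assigned on all shards. The function name assignChunks and the shard names are hypothetical.

```javascript
// Hypothetical sketch: chunk i goes to shard (i mod number of shards).
// Round-robin is an assumed policy, not something the proposal specifies.
function assignChunks(splitPoints, shards) {
  const numChunks = splitPoints.length + 1; // N split points -> N + 1 chunks
  const assignment = [];
  for (let i = 0; i < numChunks; i++) {
    assignment.push({ chunk: i, shard: shards[i % shards.length] });
  }
  return assignment;
}

const assignment = assignChunks(["a", "b", "c"], ["shard0000", "shard0001"]);
```

      Under this policy the first shard in the list always receives chunk 0, so a caller that lists the primary shard first would leave at least one chunk there.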

      Alternative:
      Issue a split command and a move chunk command for each chunk.
      This requires taking the distributed lock on the ns for each command, which can create major delays.
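
      To give a sense of the cost of this alternative, the sketch below only generates the list of split and moveChunk command documents a manual script would have to issue; it does not run them. The helper name manualPresplitCommands, the round-robin targets, and the assumption that the collection starts out entirely on the first listed shard are all illustrative.

```javascript
// Hypothetical sketch: list the commands a manual presplit script would issue.
// Assumes all data starts on shards[0], so moves targeting it are skipped.
function manualPresplitCommands(ns, field, splitPoints, shards) {
  const cmds = [];
  splitPoints.forEach((point, i) => {
    // Split at this point, creating a chunk whose lower bound is `point`.
    cmds.push({ split: ns, middle: { [field]: point } });
    const target = shards[(i + 1) % shards.length];
    if (target !== shards[0]) {
      // Move the chunk that contains `point` to its target shard.
      cmds.push({ moveChunk: ns, find: { [field]: point }, to: target });
    }
  });
  return cmds;
}

const cmds = manualPresplitCommands("ns", "a", ["a", "b", "c"], ["shard0000", "shard0001"]);
// Per the note above, each entry in cmds would need the distributed lock.
const lockAcquisitions = cmds.length;
```

      The command count grows linearly with the number of split points, whereas the proposed option would handle all of them in a single shardCollection call.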

      Questions:

      • When picking shards to distribute chunks, should it make sure there is at least 1 chunk on the primary shard?
      • Should this be for internal use only, or user facing?

            Assignee:
            Antoine Girbal
            Reporter:
            Antoine Girbal