  1. Core Server
  2. SERVER-95773

Resharding does not need to sample documents if the key is hashed

    • Cluster Scalability
    • Fully Compatible
    • v8.1, v8.0, v7.0
    • ClusterScalability Apr14-Apr28

      Resharding uses SamplingBasedSplitPolicy, which derives from the InitialSplitPolicy base class.

      The function calculateHashedSplitPoints, defined on the parent class, is used only by other child classes
      (SplitPointsBasedSplitPolicy::SplitPointsBasedSplitPolicy and AbstractTagsBasedSplitPolicy). Based on code inspection, SamplingBasedSplitPolicy does not rely on this method.

      If the shard key consists of only a hashed field, we do not need to sample and can split the hashed key space deterministically among the recipients. This mitigates known issues with the $sample implementation and lets the final distribution of chunks mirror the distribution of the customer's data without the downsides of sampling.
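
As a rough illustration of what "split the space deterministically" could mean, the sketch below divides the signed 64-bit hashed key space into near-equal ranges without reading any documents. It is not the server's calculateHashedSplitPoints; the function name evenHashedSplitPoints and its numChunks parameter are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration only; NOT the server implementation.
// Returns numChunks - 1 ascending split points that divide the signed 64-bit
// hashed key space [INT64_MIN, INT64_MAX] into numChunks near-equal ranges.
std::vector<std::int64_t> evenHashedSplitPoints(std::uint64_t numChunks) {
    if (numChunks == 0) {
        throw std::invalid_argument("numChunks must be positive");
    }

    std::vector<std::int64_t> points;

    // Width of one range, computed in unsigned arithmetic so the full
    // 2^64 - 1 span of the key space never overflows.
    const std::uint64_t width =
        std::numeric_limits<std::uint64_t>::max() / numChunks;
    const std::uint64_t half = std::uint64_t{1} << 63;

    for (std::uint64_t i = 1; i < numChunks; ++i) {
        // Offset of the i-th boundary from INT64_MIN; width * i cannot wrap
        // because i < numChunks and width = floor((2^64 - 1) / numChunks).
        const std::uint64_t offset = width * i;

        // Map the unsigned offset back to a signed value without relying on
        // out-of-range unsigned-to-signed conversion.
        const std::int64_t point =
            offset >= half
                ? static_cast<std::int64_t>(offset - half)
                : std::numeric_limits<std::int64_t>::min() +
                      static_cast<std::int64_t>(offset);
        points.push_back(point);
    }

    assert(points.size() == numChunks - 1);
    return points;
}
```

Passing the number of recipient shards (or a multiple of it) as numChunks would give every recipient the same share of the hashed space. Because the result depends only on that count, it is the same on every run, which sampling does not guarantee.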

            Assignee:
            kruti.shah@mongodb.com Kruti Shah
            Reporter:
            lamont.nelson@mongodb.com Lamont Nelson
            Votes:
            0
            Watchers:
            10
