Set higher default value for the number of samples per chunk in SamplingBasedInitialSplitPolicy and define a minimum total number of samples

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL

      SERVER-78841 made the number of samples per chunk in SamplingBasedInitialSplitPolicy configurable, but it did not raise the default value (currently 10). This ticket should raise that default. It should also define a minimum total number of samples that applies irrespective of the configured number of chunks and number of samples per chunk, so that the split points give a good data distribution even when the user configures a low number of chunks or a low number of samples per chunk.
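
      A minimal sketch of the proposed floor, in Python for illustration only. It assumes the total sample count is the number of chunks times the samples per chunk; the function name, parameter names, and the example floor of 1000 are hypothetical and not taken from the server code.

```python
def total_samples(num_chunks: int, samples_per_chunk: int, min_total_samples: int) -> int:
    """Return how many shard key samples to draw in total.

    Assumption: without a floor, the total is num_chunks * samples_per_chunk.
    The floor (min_total_samples) is the hypothetical minimum proposed here.
    """
    if num_chunks < 1 or samples_per_chunk < 1 or min_total_samples < 0:
        raise ValueError("arguments must be positive (the floor may be zero)")
    # The floor only takes effect when the configured product is too small.
    return max(num_chunks * samples_per_chunk, min_total_samples)


# A low chunk count is lifted to the floor; a high one is left unchanged.
low = total_samples(num_chunks=1, samples_per_chunk=10, min_total_samples=1000)
high = total_samples(num_chunks=200, samples_per_chunk=10, min_total_samples=1000)
```

      With these illustrative numbers, a request for a single chunk still draws 1000 samples, while a request for 200 chunks draws 2000 as before.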

      As part of this ticket, we should add an integration test that uses a collection with a large number of documents but exactly N unique shard key values, and verify that resharding creates N chunks with the default settings.
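
      The data set for such a test could be generated along these lines. This is only a sketch of the fixture in Python: the field name `shardKey`, the helper name, and the sizes are hypothetical, and the resharding call itself is not shown.

```python
def make_documents(num_docs: int, num_unique_keys: int) -> list[dict]:
    """Build num_docs documents whose shard key takes exactly num_unique_keys values.

    The field name "shardKey" is a placeholder, not a name from the server code.
    """
    if num_unique_keys < 1 or num_docs < num_unique_keys:
        raise ValueError("need at least one document per unique shard key value")
    # Cycling through 0..num_unique_keys-1 guarantees every value appears.
    return [{"_id": i, "shardKey": i % num_unique_keys} for i in range(num_docs)]


docs = make_documents(num_docs=10_000, num_unique_keys=8)
unique_keys = {doc["shardKey"] for doc in docs}
# The test would then reshard with default settings and expect
# len(unique_keys) chunks, per the ticket description.
expected_chunks = len(unique_keys)
```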

            Assignee:
            Unassigned
            Reporter:
            Cheahuychou Mao
            Votes:
            0
            Watchers:
            4
