Make analyzeShardKey command support sampling documents


    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.1.0-rc0, 7.0.0-rc7
    • Affects Version/s: None
    • Component/s: None
    • None
    • Fully Compatible
    • v7.0
    • Sharding NYC 2023-06-26, Sharding NYC 2023-07-10
    • None
    • 3

      According to the experiments in SERVER-68763, the aggregate command that the analyzeShardKey command runs to calculate metrics on the characteristics of the shard key can take hours to complete if the collection contains hundreds of millions of documents and the cardinality of the shard key is also very large. Given this, we should make the command support calculating the metrics from a sample of documents instead of from all documents in the collection.
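
The idea of computing key characteristics from a sample rather than a full scan can be sketched in plain Python. This is only an illustration under stated assumptions, not the server's implementation: the helper names `sample_documents` and `shard_key_metrics`, the `customer_id` field, and the reported metrics are all hypothetical, and the "collection" is an in-memory list rather than a real MongoDB collection.

```python
import random
from collections import Counter


def sample_documents(docs, sample_size, seed=None):
    """Return up to `sample_size` documents chosen uniformly at random.

    Hypothetical helper: stands in for whatever sampling the server performs.
    """
    rng = random.Random(seed)
    if sample_size >= len(docs):
        # Sample would cover everything, so just use the full collection.
        return list(docs)
    return rng.sample(docs, sample_size)


def shard_key_metrics(docs, key_fields):
    """Compute simple characteristics of a candidate shard key over `docs`.

    Hypothetical metrics: document count, number of distinct key values,
    frequency of the most common value, and whether every value is unique.
    """
    counts = Counter(tuple(doc.get(f) for f in key_fields) for doc in docs)
    total = sum(counts.values())
    max_frequency = counts.most_common(1)[0][1] if counts else 0
    return {
        "num_docs": total,
        "num_distinct_values": len(counts),
        "max_frequency": max_frequency,
        "is_unique": total > 0 and len(counts) == total,
    }


if __name__ == "__main__":
    # Synthetic data: 100,000 documents sharing 1,000 distinct customer ids.
    collection = [{"_id": i, "customer_id": i % 1000} for i in range(100_000)]
    sample = sample_documents(collection, sample_size=5_000, seed=42)
    print(shard_key_metrics(sample, ["customer_id"]))
```

The cost of the metrics step now scales with the sample size instead of the collection size. The trade-off is accuracy: the number of distinct values seen in a sample can only be less than or equal to the true cardinality, so sampled results are estimates rather than exact figures.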

              Assignee:
              Cheahuychou Mao
              Reporter:
              Cheahuychou Mao
              Votes:
              0
              Watchers:
              2
