  1. Core Server
  2. SERVER-91030

analyzeShardKey against sharded collections may fail to find sampled docs if there has been data movement

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • v8.0
    • Cluster Scalability 2024-07-08, Cluster Scalability 2024-07-22, Cluster Scalability 2024-08-19, Cluster Scalability 2024-09-02, Cluster Scalability 2024-10-14, Cluster Scalability 2024-10-28
    • 0

      To calculate the cardinality and frequency metrics for a shard key that is not unique, the analyzeShardKey command internally runs a cluster aggregate command. To avoid the COLLSCAN or IXSCAN + FETCH stage that shard filtering may incur, the command runs with readConcern "available", and we explicitly document that, for performance reasons, orphan documents are not excluded from the metrics calculation. Using readConcern "available" causes commands to skip both the shard version check and shard filtering. Without the shard version check, if the cluster aggregate command is routed to the old owning shard(s), it is possible that no documents would be found, as shown in BF-30328.
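
      The failure mode described above can be illustrated with a toy model. This is only a sketch, not MongoDB's implementation: the Shard, Router, migrate, and run_scenario names are hypothetical, a single integer stands in for the shard version, and the model assumes that the old owning shard has already deleted the migrated range by the time the stale router sends its request.

```python
from dataclasses import dataclass, field


class StaleConfig(Exception):
    """Raised by a shard when the router's cached version is out of date."""


@dataclass
class Shard:
    name: str
    docs: list = field(default_factory=list)
    owns_range: bool = False
    version: int = 1  # placement version this shard knows about (toy model)

    def find(self, router_version, read_concern):
        # "available" skips both the shard version check and shard filtering.
        if read_concern == "available":
            return list(self.docs)
        if router_version != self.version:
            raise StaleConfig(self.name)
        # Shard filtering: only return documents this shard currently owns.
        return list(self.docs) if self.owns_range else []


@dataclass
class Router:
    target: str   # cached owner of the range
    version: int  # cached placement version

    def aggregate(self, shards, read_concern, latest):
        try:
            return shards[self.target].find(self.version, read_concern)
        except StaleConfig:
            # Refresh the routing cache and retry against the current owner.
            self.target, self.version = latest
            return shards[self.target].find(self.version, read_concern)


def migrate(shards, src, dst, new_version):
    """Move the whole range from src to dst, then delete it from src."""
    shards[dst].docs = list(shards[src].docs)
    shards[dst].owns_range = True
    shards[src].owns_range = False
    shards[src].docs = []  # assumption: range deletion on src has finished
    for shard in shards.values():
        shard.version = new_version


def run_scenario(read_concern):
    shards = {
        "shard0": Shard("shard0", docs=[{"x": 1}, {"x": 2}], owns_range=True),
        "shard1": Shard("shard1"),
    }
    router = Router(target="shard0", version=1)  # cache filled before the move
    migrate(shards, "shard0", "shard1", new_version=2)
    return router.aggregate(shards, read_concern, latest=("shard1", 2))


if __name__ == "__main__":
    print(len(run_scenario("available")))  # stale route, no version check
    print(len(run_scenario("local")))      # version check forces a refresh
```

      In this model the "available" read is answered by the old owning shard and finds nothing, while the "local" read hits the version check, refreshes its routing information, and finds both documents on the new owner.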

            Assignee:
            cheahuychou.mao@mongodb.com Cheahuychou Mao
            Reporter:
            cheahuychou.mao@mongodb.com Cheahuychou Mao
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: