Core Server / SERVER-75594

Make analyzeShardKey command only use one config collection to store split points


Details

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.0.0-rc0
    • None
    • None
    • None
    • Backwards Compatibility: Fully Compatible
    • Sprint: Sharding NYC 2023-04-17
    • 25

    Description

      Currently, the calculation of the read and write distribution metrics in the analyzeShardKey command works as follows:

      1. The shard running the analyzeShardKey command generates the split points for the shard key being analyzed, and then persists them in a temporary config collection named config.analyzeShardKey.splitPoints.<collUuid>.<tempCollUuid>.
      2. The shard sends an aggregate command with the $_analyzeShardKeyReadWriteDistribution stage to all shards that own chunks for the collection (see the sketch after this list). The stage has the following spec:

        {
             key: <object>,
             splitPointsNss: <namespace>,
             splitPointsAfterClusterTime: <timestamp>,
             splitPointsShardId: <shardId> // Only set when running on a sharded cluster
        }
        

      3. The shards running the $_analyzeShardKeyReadWriteDistribution stage fetch all split point documents from the collection 'splitPointsNss', build a synthetic routing table from them, and calculate the metrics based on that table.
      4. The shard from step 1 combines the metrics and drops the split point collection.
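
      For illustration, the aggregate command from step 2 could look roughly like the sketch below. This is only a sketch in mongosh syntax: the database, collection, shard key, and timestamp values are made up, and since the stage is internal it is not meant to be sent from a regular client.

        // Hypothetical sketch of the aggregate command sent to each shard that owns
        // chunks for the collection being analyzed (all values are made up).
        db.getSiblingDB("test").runCommand({
            aggregate: "testColl",
            pipeline: [{
                $_analyzeShardKeyReadWriteDistribution: {
                    key: { customerId: 1 },
                    splitPointsNss: "config.analyzeShardKey.splitPoints.<collUuid>.<tempCollUuid>",
                    splitPointsAfterClusterTime: Timestamp(1680000000, 1),
                    splitPointsShardId: "shard0" // only set when running on a sharded cluster
                }
            }],
            cursor: {}
        })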

      Giving each command its own split point collection is clean in a way, but it has some notable downsides:

      • If an interrupt (e.g., due to shutdown) occurs in step 4, the temporary collection never gets dropped. This can be seen in BFG-1859782, BFG-1859158, BFG-1858992, and more.
      • It is expensive to keep creating and dropping collections. The command currently also needs to drop the collection before it can return.

      One could set up a periodic job to drop collections with the "config.analyzeShardKey.splitPoints.*" prefix. However, it is hard to differentiate between dangling collections and collections that are actually being used by an in-progress analyzeShardKey command.

      Given this, the analyzeShardKey command should instead use a single config collection for storing split points, and rely on a TTL index to automatically clean up documents for commands that have already returned. That is, the stage should have the following spec, where 'splitPointsFilter' is a filter that matches only the split point documents generated by a particular command.

      {
           key: <object>,
           splitPointsFilter: <object>,
           splitPointsAfterClusterTime: <timestamp>,
           splitPointsShardId: <shardId>
      }
      

      Users are unlikely to run many analyzeShardKey commands back to back, so there shouldn't be many documents to filter out during the read.
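
      As a rough illustration of the proposed scheme, the sketch below (again in mongosh syntax) assumes a hypothetical 'analyzeShardKeyId' field for tagging each command's split point documents and a hypothetical 'expireAt' field backing the TTL index; the actual collection and field names are not specified by this ticket.

        // Minimal sketch of a single split points collection shared by all
        // analyzeShardKey commands. Names and values here are illustrative.
        const coll = db.getSiblingDB("config").getCollection("analyzeShardKey.splitPoints");

        // TTL index: documents expire once their 'expireAt' time passes, so split
        // points left behind by an interrupted command get cleaned up automatically.
        coll.createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 });

        // Each command tags the split point documents it generates with its own id.
        const commandId = UUID();
        coll.insertOne({
            analyzeShardKeyId: commandId,
            splitPoint: { customerId: 42 },
            expireAt: new Date(Date.now() + 1000 * 60 * 60) // e.g. an hour from now
        });

        // The 'splitPointsFilter' passed to $_analyzeShardKeyReadWriteDistribution
        // then selects only this command's documents:
        const splitPointsFilter = { analyzeShardKeyId: commandId };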


          People

            Assignee: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
            Reporter: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
