Investigate clone DDL scalability with large numbers of tracked collections

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing

      The clone DDL runs during an upgrade and clones all database and collection metadata into the authoritative collections on the shards. To clone a collection, it sends the shardsvrFetchCollectionMetadata command. Because the clone DDL issues one command per shard per collection, and each command fetches information from the CSRS, a cluster with a large number of collections generates a correspondingly large number of commands and CSRS reads. This applies only to tracked collections.

      This ticket is to investigate whether this is an issue:

      1. Investigate the number of tracked collections customers have in Atlas in order to quantify whether there is something to optimize.
      2. If the numbers are concerning, send a single command per shard that fetches the metadata for all collections in a given database, instead of one command per collection.

      The plan is to investigate point 1 and, if the findings are acceptable, conclude that point 2 is not needed.
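
      To make the two options concrete, the following is a rough sketch of how the command count scales under each approach. It assumes that point 2 means one command per shard per database that has tracked collections, which is one reading of the proposal. The function names, the shard count, and the collection counts are all hypothetical and are not taken from Atlas data.

```python
def per_collection_commands(num_shards: int, tracked_per_db: dict[str, int]) -> int:
    """Current behavior as described in this ticket:
    one command per shard per tracked collection."""
    return num_shards * sum(tracked_per_db.values())


def per_database_commands(num_shards: int, tracked_per_db: dict[str, int]) -> int:
    """Assumed reading of point 2: one command per shard per database,
    counting only databases that have at least one tracked collection."""
    databases_with_tracked = sum(1 for count in tracked_per_db.values() if count > 0)
    return num_shards * databases_with_tracked


# Hypothetical cluster: database name -> number of tracked collections.
cluster = {"app": 5000, "analytics": 200, "logs": 0}
shards = 10

print(per_collection_commands(shards, cluster))  # 52000
print(per_database_commands(shards, cluster))    # 20
```

      Under these made-up numbers the batched variant reduces the command count from 52000 to 20, though each batched command would carry more metadata. Whether that trade-off matters depends on the tracked collection counts found in point 1.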

            Assignee:
            Unassigned
            Reporter:
            Pol Pinol