Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-30060

Make the balancer gather storage statistics only for shards which have `maxSize` set

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • 3.4.6, 3.5.9
    • 3.4.7, 3.5.11
    • Sharding
    • None
    • Fully Compatible
    • v3.4
    • Sharding 2017-07-31

    Description

      The sharding balancer currently issues listDatabases against every single shard in order to access the totalSize value. This value is used for ensuring that a shard's storage maxSize is not exceeded for customers which have that value set.

      The listDatabases call is quite heavy, especially for nodes with large number of databases/collections since it will fstat every single file under the instance.

      There are a number of optimizations we can make in order to make this statistics gathering less expensive (listed in order of preference):

      • Only gather storage statistics for shards which have maxSize set (implemented by this ticket)
      • Issue the listDatabases call in parallel against all shards so it doesn't take so different shards' execution overlaps
      • Cache the per-shard statistics so that they are not collected on every single round/moveChunk invocation
      • Collect the per-shard statistics asynchronously so that multiple concurrent moveChunk requests can benefit
      • Add a parameter to listDatabases to allow it to return cached data size instead of every time {{fstat}}ing all the files

      Attachments

        Issue Links

          Activity

            People

              kaloian.manassiev@mongodb.com Kaloian Manassiev
              kaloian.manassiev@mongodb.com Kaloian Manassiev
              Votes:
              3 Vote for this issue
              Watchers:
              17 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: