-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Storage Execution
When creating a new collection/database on a sharded cluster, we try to place it on the least loaded shard. To do so, we need to retrieve and compare the total data size across all shards.
Currently, we use the listDatabases command on the primary node of each shard to retrieve the total data size. This command is inefficient: to calculate the total size, it loops through every collection of every database and fetches the storage size and index size for each one.
For a cluster with a lot of databases/collections, this command can take up to 1 second (SERVER-31020).
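To illustrate why the cost grows with the number of collections, here is a minimal sketch of the per-collection aggregation described above. It is not the server implementation: the `Catalog` type, the `total_size_by_walking_catalog` function, and the example sizes are all hypothetical stand-ins for a shard's catalog.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for a shard's catalog:
# database name -> collection name -> (storage size, index size), in bytes.
Catalog = Dict[str, Dict[str, Tuple[int, int]]]


def total_size_by_walking_catalog(catalog: Catalog) -> Tuple[int, int]:
    """Return (total bytes, number of per-collection size lookups).

    Mirrors the approach described in the ticket: visit every collection
    of every database and add its storage size and index size.
    """
    total = 0
    lookups = 0
    for collections in catalog.values():
        for storage_size, index_size in collections.values():
            total += storage_size + index_size
            lookups += 1  # one size fetch per collection
    return total, lookups


# Made-up example data, for illustration only.
example_catalog: Catalog = {
    "app": {"users": (4096, 1024), "orders": (8192, 2048)},
    "logs": {"events": (16384, 512)},
}

total_bytes, lookup_count = total_size_by_walking_catalog(example_catalog)
```

The number of lookups equals the number of collections, so the work is linear in the size of the catalog even though the caller only needs a single number.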
The goal of this ticket is to expose a new API that efficiently returns the total data size on a specific node.
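As a rough sketch of how a caller could use such an API, the selection step reduces to a minimum over one number per shard. The function name `select_least_loaded_shard`, the shard names, and the tie-breaking rule are assumptions made for illustration. Excluding draining shards is inferred from the name of `selectLeastLoadedNonDrainingShard` in SERVER-89651 and is not a description of its actual logic.

```python
from typing import Dict, Iterable, Optional


def select_least_loaded_shard(
    total_size_by_shard: Dict[str, int],
    draining_shards: Iterable[str] = (),
) -> Optional[str]:
    """Pick the non-draining shard with the smallest total data size.

    `total_size_by_shard` maps a shard name to the total size in bytes
    that the (hypothetical) new API would report for that shard.
    Returns None when no eligible shard exists.
    """
    draining = set(draining_shards)
    candidates = {
        shard: size
        for shard, size in total_size_by_shard.items()
        if shard not in draining
    }
    if not candidates:
        return None
    # Assumption: break ties by shard name so the result is deterministic.
    return min(candidates, key=lambda shard: (candidates[shard], shard))


# Made-up sizes, for illustration only.
sizes = {"shardA": 500, "shardB": 120, "shardC": 300}
chosen = select_least_loaded_shard(sizes)
chosen_without_b = select_least_loaded_shard(sizes, draining_shards=["shardB"])
```

With one cheap size query per shard, the comparison itself is trivial; the cost is dominated by how each shard computes its total.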
- is depended on by
-
SERVER-31020 Sharding database creation is slow because of shard disk-usage statistics gathering
- Blocked
- related to
-
SERVER-89651 Make selectLeastLoadedNonDrainingShard query shards data size in parallel
- Closed