Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: pmr
Assigned Teams: Storage Execution
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

When creating a new collection or database on a sharded cluster, we try to place it on the least loaded shard. To do so, we need to retrieve the total data size of every shard and compare the values.
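As a rough illustration of the placement decision described above, here is a minimal Python sketch. It is not the server's implementation: the function name `select_least_loaded_shard`, the shard ids, and the tie-breaking rule (lowest shard id wins) are assumptions made for the example, and the sizes are supplied as a plain mapping rather than fetched from a cluster.

```python
from typing import Mapping


def select_least_loaded_shard(total_size_by_shard: Mapping[str, int]) -> str:
    """Return the id of the shard with the smallest total data size.

    Ties are broken by shard id so the result is deterministic; this
    tie-breaking rule is an assumption of the sketch, not documented behavior.
    """
    if not total_size_by_shard:
        raise ValueError("no shards to choose from")
    # Compare (size, shard id) pairs so equal sizes fall back to the id.
    return min(
        total_size_by_shard,
        key=lambda shard: (total_size_by_shard[shard], shard),
    )


if __name__ == "__main__":
    # Hypothetical total data sizes in bytes, one entry per shard.
    sizes = {"shardA": 500, "shardB": 120, "shardC": 120}
    print(select_least_loaded_shard(sizes))
```

The comparison itself is cheap; the cost discussed in this ticket comes from gathering one size value per shard.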

Currently, we run the listDatabases command on the primary node of each shard to retrieve its total data size. This command is inefficient: to calculate the total size, it loops through all the collections of every database and fetches the storage size and index size of each one.
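To make the cost concrete, the following Python sketch models the aggregation pattern described above with in-memory data. It does not call MongoDB; `CollectionStats`, `total_data_size`, and the lookup counter are hypothetical names introduced for illustration. The point is only that the number of size lookups grows with the total number of collections on the node.

```python
from typing import Mapping, NamedTuple, Tuple


class CollectionStats(NamedTuple):
    """Hypothetical per-collection sizes in bytes."""
    storage_size: int
    index_size: int


def total_data_size(
    databases: Mapping[str, Mapping[str, CollectionStats]],
) -> Tuple[int, int]:
    """Sum storage and index sizes over every collection of every database.

    Returns (total size, number of size lookups performed). Two lookups are
    counted per collection: one for storage size and one for index size.
    """
    total = 0
    lookups = 0
    for collections in databases.values():
        for stats in collections.values():
            total += stats.storage_size + stats.index_size
            lookups += 2
    return total, lookups


if __name__ == "__main__":
    example = {
        "db1": {"a": CollectionStats(100, 10), "b": CollectionStats(200, 20)},
        "db2": {"c": CollectionStats(50, 5)},
    }
    print(total_data_size(example))
```

With thousands of databases and collections, this per-collection walk is repeated on every shard each time a placement decision is needed.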

For a cluster with a lot of databases/collections, this command can take up to 1 second (SERVER-31020).

The goal of this ticket is to expose a new API that efficiently returns the total data size on a specific node.

Issue Links:
- is depended on by: SERVER-31020 Sharding database creation is slow because of shard disk-usage statistics gathering (Blocked)
- related to: SERVER-89651 Make selectLeastLoadedNonDrainingShard query shards data size in parallel (Closed)

Assignee: Matt Panton
Reporter: Tommaso Tocci
Participants: Matt Panton, Tommaso Tocci
Votes: 0
Watchers: 10

Created: Nov 20 2024 06:38:45 AM UTC
Updated: Jan 06 2025 10:05:47 PM UTC
