Loading...

XML

Word

Printable

JSON

Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 3.4.7, 3.5.11
Affects Version/s: 3.4.6, 3.5.9
Component/s: Sharding
Labels:
None

Backwards Compatibility:
Fully Compatible
Backport Requested:

v3.4
Sprint:
Sharding 2017-07-31
Case:
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

The sharding balancer currently issues listDatabases against every single shard in order to access the totalSize value. This value is used for ensuring that a shard's storage maxSize is not exceeded for customers which have that value set.

The listDatabases call is quite heavy, especially for nodes with large number of databases/collections since it will fstat every single file under the instance.

There are a number of optimizations we can make in order to make this statistics gathering less expensive (listed in order of preference):

Only gather storage statistics for shards which have maxSize set (implemented by this ticket)
Issue the listDatabases call in parallel against all shards so it doesn't take so different shards' execution overlaps
Cache the per-shard statistics so that they are not collected on every single round/moveChunk invocation
Collect the per-shard statistics asynchronously so that multiple concurrent moveChunk requests can benefit
Add a parameter to listDatabases to allow it to return cached data size instead of every time {{fstat}}ing all the files

related to

SERVER-34819 Optimize the sharding balancer's cluster statistics gathering

Closed

Assignee:: Kaloian Manassiev
Reporter:: Kaloian Manassiev
Participants:: Arie Grapa, Githook User, Kaloian Manassiev
Votes:: 3 Vote for this issue
Watchers:: 17 Start watching this issue

Created:: Jul 07 2017 08:34:10 PM UTC
Updated:: Oct 30 2023 11:15:22 PM UTC
Resolved:: Jul 17 2017 05:38:29 PM UTC

Details

Description

Attachments

Issue Links

Activity

People

Dates