[SERVER-84187] Regression in balancer round storage statistics retrieval Created: 14/Dec/23  Updated: 21/Dec/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: 7.0.0, 7.2.0-rc0, 7.1.0, 7.3.0-rc0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Backlog - Catalog and Routing
Resolution: Unresolved Votes: 0
Labels: balancer-round-perf, car-qw
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by SERVER-66297 Get rid of `maxSize` for shards Closed
Assigned Teams:
Catalog and Routing
Operating System: ALL
Participants:
Story Points: 2

 Description   

In every balancing round, we retrieve per-shard storage statistics. Because this is done serially, retrieval can be particularly slow in a cluster with several shards, which considerably limits the balancing speed.

Back in SERVER-30060 we added an optimization to retrieve storage statistics only from shards that have maxSize configured.

Recently, as part of SERVER-66297, we removed this optimization, so we once again retrieve statistics from all shards serially in every round.
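
To make the regression concrete, here is a minimal sketch of the two behaviours described above. It is not the server code: `Shard`, `CountingFetcher`, and `collect_storage_stats` are hypothetical names, and the fetch call merely stands in for one remote request to a shard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Shard:
    name: str
    # None models a shard without maxSize configured (assumption of this sketch).
    max_size_mb: Optional[int] = None


class CountingFetcher:
    """Stand-in for a remote storage statistics request; records each call."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def __call__(self, shard: Shard) -> int:
        self.calls.append(shard.name)
        return 0  # placeholder storage size


def collect_storage_stats(shards: List[Shard], fetch, only_with_max_size: bool) -> Dict[str, int]:
    """Query shards one after another, optionally skipping those without maxSize."""
    stats: Dict[str, int] = {}
    for shard in shards:  # serial: one request at a time
        if only_with_max_size and shard.max_size_mb is None:
            continue
        stats[shard.name] = fetch(shard)
    return stats
```

With `only_with_max_size=True` (the SERVER-30060 behaviour as described here) only shards that set maxSize cost a request; with `False` (the behaviour after SERVER-66297) the number of serial requests per round equals the number of shards.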

The implementation of this statistics retrieval is affected by multiple performance issues:

 

Proposal: Do it only once per round and re-use the same ClusterStatistics object
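
The proposal can be sketched as follows, assuming a balancing round consists of several steps that each need the statistics. `RoundStatistics` and `run_balancing_round` are hypothetical names used only for illustration; they do not mirror the actual ClusterStatistics interface.

```python
from typing import Callable, Dict, List, Optional


class RoundStatistics:
    """Holds the statistics for a single balancing round, fetched at most once."""

    def __init__(self, fetch_all: Callable[[], Dict[str, int]]) -> None:
        self._fetch_all = fetch_all
        self._cached: Optional[Dict[str, int]] = None

    def get(self) -> Dict[str, int]:
        # The first call pays for the retrieval; later calls reuse the result.
        if self._cached is None:
            self._cached = self._fetch_all()
        return self._cached


def run_balancing_round(fetch_all: Callable[[], Dict[str, int]],
                        steps: List[Callable[[Dict[str, int]], object]]) -> List[object]:
    """Run every step of one round against the same statistics snapshot."""
    stats = RoundStatistics(fetch_all)  # a fresh object per round, so data is not kept across rounds
    return [step(stats.get()) for step in steps]
```

Creating the object once per round keeps the retrieval cost independent of how many steps consult the statistics, while still refreshing the data on the next round.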


Generated at Thu Feb 08 06:54:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.