[SERVER-9478] sh.startBalancer() should retry to change the settings while waiting Created: 26/Apr/13  Updated: 06/Dec/22  Resolved: 12/Jul/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.2.4, 2.4.3
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Thomas Rueckstiess Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding

 Description   

Currently, sh.startBalancer() appears to call sh.setBalancerState(true) only once, at the beginning, and then waits for the balancer lock to change (30 seconds by default).

This is problematic for scripted backup procedures that follow these steps:

  1. stop balancer
  2. stop a config server
  3. back up ...
  4. start the config server
  5. start balancer

If there is no hard-coded delay between steps 4 and 5, the config server may miss the call to sh.setBalancerState(true) (or reject it, if not all 3 config servers are running yet). sh.startBalancer() then sits in its wait loop until it times out, and the balancer never starts.

Instead, it would be better if sh.startBalancer() kept pushing the setting while in the wait loop. If the config server comes up in the meantime, it would receive the next state change and the balancer would start.
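
The proposed behaviour could look roughly like the sketch below. This is not the shell's actual implementation: startBalancerWithRetry and its parameters are hypothetical names used only for illustration. setState stands in for sh.setBalancerState, isRunning stands in for whatever check the wait loop uses to detect that the balancer has started, and sleep stands in for the shell's sleep helper. They are passed in as arguments so the loop can be tried without a cluster.

```javascript
// Hypothetical sketch of a start routine that re-sends the setting on every
// pass through the wait loop instead of sending it only once up front.
//   setState(true) : assumed stand-in for sh.setBalancerState(true); may throw
//                    while a config server is down or rejecting writes
//   isRunning()    : assumed stand-in for the "has the balancer started?" check
//   sleep(ms)      : assumed stand-in for the shell's sleep
function startBalancerWithRetry(setState, isRunning, sleep, timeoutMs, intervalMs) {
    let waited = 0;
    let attempts = 0;
    while (true) {
        attempts++;
        try {
            setState(true); // push the setting again on each iteration
        } catch (e) {
            // The write was missed or rejected; retry on the next pass.
        }
        if (isRunning()) {
            return { started: true, attempts: attempts };
        }
        if (waited >= timeoutMs) {
            return { started: false, attempts: attempts };
        }
        sleep(intervalMs);
        waited += intervalMs;
    }
}
```

With this shape, a config server that comes back partway through the wait receives a later state change rather than being missed, so a backup script would not need a hard-coded delay between steps 4 and 5.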



 Comments   
Comment by Gregory McKeon (Inactive) [ 12/Jul/18 ]

Now that the balancer is on the config server, this has gone away.

Generated at Thu Feb 08 03:20:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.