[SERVER-31586] Recreate PeriodicBalancerSettingsRefresher on the ServiceContext Created: 16/Oct/17  Updated: 30/Oct/23  Resolved: 25/Jul/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.2

Type: Task
Priority: Major - P3
Reporter: Dianna Hohensee (Inactive)
Assignee: Matthew Saltz (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-35486 Create _addShard command on shard to ... Closed
depends on SERVER-35585 Make PeriodicRunner jobs be pausable/... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2018-06-04, Sharding 2018-06-18, Sharding 2018-07-02, Sharding 2018-07-16, Sharding 2018-07-30
Participants:

 Description   

Originally created by SERVER-30443, then removed by SERVER-31585 because of the issues that ticket delineates.



 Comments   
Comment by Githook User [ 25/Jul/18 ]

Author:

{'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com'}

Message: SERVER-31586 Use PeriodicRunner to refresh balancer configuration periodically
Branch: master
https://github.com/mongodb/mongo/commit/a80d2ae80e0c55f86284b3a5215bfd396b0d4a49

Comment by Githook User [ 09/Jul/18 ]

Author:

{'name': 'Matthew Saltz', 'email': 'matthew.saltz@mongodb.com'}

Message: SERVER-31586 Update balancer configuration on mongod on addShard
Branch: master
https://github.com/mongodb/mongo/commit/5cbed04d484cc92c3c0b571fd49f16b2ceef81b0

Comment by Matthew Saltz (Inactive) [ 07/Jun/18 ]

Talked with Randolph online, and we agreed on:

  1. A new command, refreshBalancerConfigNow, that the config server will call during addShard, after inserting the shard identity document that triggers sharding initialization.
  2. Using the onStepUp and onStepDown functions in the ChunkSplitter to start and stop a periodic refresher, with the first refresh happening synchronously.

This will fix test issues, allow for periodic refreshes, handle step-up and step-down, and handle shutdown and restart.
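
For concreteness, here is a minimal standalone sketch of that shape: a refresher that runs once synchronously on step-up, then periodically in the background until step-down. It uses only the C++ standard library; the class name, period, and refresh callback are illustrative stand-ins, not the server's actual PeriodicRunner or ChunkSplitter API (SERVER-35585, linked above, is what makes the real PeriodicRunner jobs pausable).

    #include <chrono>
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <thread>

    // Illustrative stand-in for the agreed design, not the server's API.
    // onStepDown() must be called before destruction to join the thread.
    class PeriodicBalancerConfigRefresher {
    public:
        PeriodicBalancerConfigRefresher(std::function<void()> refreshFn,
                                        std::chrono::seconds period)
            : _refreshFn(std::move(refreshFn)), _period(period) {}

        // On step-up: do the first refresh synchronously, so anything that
        // runs afterwards sees a fresh config, then start the periodic thread.
        void onStepUp() {
            _refreshFn();
            _stopRequested = false;
            _thread = std::thread([this] { _run(); });
        }

        // On step-down: signal the background thread and wait for it to exit.
        void onStepDown() {
            {
                std::lock_guard<std::mutex> lk(_mutex);
                _stopRequested = true;
            }
            _cv.notify_one();
            if (_thread.joinable())
                _thread.join();
        }

    private:
        void _run() {
            std::unique_lock<std::mutex> lk(_mutex);
            // wait_for returns false on timeout (predicate still false),
            // which is the cue to do another periodic refresh.
            while (!_cv.wait_for(lk, _period, [this] { return _stopRequested; })) {
                lk.unlock();
                _refreshFn();  // refresh outside the lock
                lk.lock();
            }
        }

        std::function<void()> _refreshFn;
        std::chrono::seconds _period;
        std::mutex _mutex;
        std::condition_variable _cv;
        bool _stopRequested = false;
        std::thread _thread;
    };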

Comment by Matthew Saltz (Inactive) [ 07/Jun/18 ]

        "during addShard, also store the balancer config in the new shard"

So you'd send the balancer configuration inside the document that addShard sends?

If that's the suggestion, it would actually fix the test issues, and after that I think it would be sufficient to use the onStepUp and onStepDown functions in the ChunkSplitter to start and stop a periodic refresher, with the first refresh happening synchronously.

Comment by Randolph Tan [ 07/Jun/18 ]

What if we treat the balancer config like other sharding metadata (chunk/coll/db)? We persist the info in storage and update it when we detect a mismatch. So the idea would be:

  • during addShard, also store the balancer config in the new shard
  • during sharding initialization, load the stored balancer config
  • whenever balancer config is refreshed and is different from what we have, update the stored config

This has the benefit of also working across server restarts.
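
A minimal sketch of that persist-on-mismatch flow, with an in-memory optional standing in for the local collection on the shard; the struct fields and function names are hypothetical:

    #include <optional>

    // Hypothetical shape of the persisted settings; in the real design this
    // would be a document in a local collection on the shard.
    struct BalancerConfig {
        long long maxChunkSizeBytes = 64 * 1024 * 1024;  // server default
        bool balancerEnabled = true;

        bool operator==(const BalancerConfig& o) const {
            return maxChunkSizeBytes == o.maxChunkSizeBytes &&
                   balancerEnabled == o.balancerEnabled;
        }
    };

    std::optional<BalancerConfig> storedConfig;  // stand-in for local storage
    std::optional<BalancerConfig> cachedConfig;  // in-memory copy in use

    // During addShard / sharding initialization: load the stored copy.
    void onShardingInitialization() {
        cachedConfig = storedConfig;
    }

    // On each refresh: persist only when the fetched config differs from
    // what we already have, mirroring how chunk/coll/db metadata is handled.
    void onBalancerConfigRefresh(const BalancerConfig& fresh) {
        if (!cachedConfig || !(*cachedConfig == fresh)) {
            storedConfig = fresh;  // would be a write to the local collection
            cachedConfig = fresh;
        }
    }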

Comment by Matthew Saltz (Inactive) [ 07/Jun/18 ]

esha.maharishi had suggested potentially doing the refresh on shard version mismatch, but I think that wouldn't work until a query for a sharded collection came in. However, we could consider throwing a StaleConfigException (or some new error like ShardInitializationInProgress) on inserts while a flag indicates that the shard hasn't finished 'initializing' after addShard; we would then catch the error and do the refresh. This seems like it could be generally useful, since it gives us a place not under a lock, where blocking calls are allowed, to finish shard initialization, but it might be a bit extreme to do just for the small change we're making.
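
A sketch of what that gating could look like, with the error modeled as a plain exception; all of the names here (including ShardInitializationInProgress, taken from the suggestion above) are hypothetical:

    #include <atomic>
    #include <stdexcept>

    // Hypothetical error from the suggestion above.
    struct ShardInitializationInProgress : std::runtime_error {
        ShardInitializationInProgress()
            : std::runtime_error("shard initialization in progress") {}
    };

    std::atomic<bool> shardFinishedInitializing{false};

    // Stand-in for the blocking fetch from the config server.
    void refreshBalancerConfig() { /* blocking call; no lock held here */ }

    void doInsert() {
        // Reject writes until post-addShard initialization has completed.
        if (!shardFinishedInitializing.load())
            throw ShardInitializationInProgress();
        // ... perform the insert ...
    }

    void insertWithRetry() {
        try {
            doInsert();
        } catch (const ShardInitializationInProgress&) {
            refreshBalancerConfig();  // finish initialization; blocking is OK
            shardFinishedInitializing.store(true);
            doInsert();  // retry the write
        }
    }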

Comment by Matthew Saltz (Inactive) [ 06/Jun/18 ]

There are two reasons something like this ticket needs to be done:

  1. Our tests rely on setting the maximum chunk size in the initialization of ShardingTest and expect subsequent inserts to split according to that parameter. So, for our tests to be valid, the shard must refresh the balancer configuration before doing any inserts (or we must change our tests).
  2. In general, the "shouldSplit" logic relies on the current balancer configuration to decide when to split, so if a customer sets this value to something smaller than the default, the shard should ideally pick up the change quickly so that chunks don't grow too large before it hears about the update (see the sketch after this list).
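
To make the second point concrete, here is a tiny sketch (not the server's actual "shouldSplit" logic, whose real inputs are richer) of why a stale setting matters: a shard still holding the 64MB default will decline to split chunks that a customer's smaller setting says are already oversized.

    #include <cassert>
    #include <cstdint>

    // Minimal stand-in for the split decision's dependency on the cached
    // balancer configuration.
    struct BalancerSettings {
        int64_t maxChunkSizeBytes = 64LL * 1024 * 1024;  // default until refreshed
    };

    bool shouldSplit(const BalancerSettings& s, int64_t chunkSizeBytes) {
        return chunkSizeBytes > s.maxChunkSizeBytes;
    }

    int main() {
        BalancerSettings staleDefault;                 // shard has not refreshed
        BalancerSettings customer{1LL * 1024 * 1024};  // value on the config server
        int64_t chunk = 10LL * 1024 * 1024;            // a 10MB chunk

        assert(shouldSplit(customer, chunk));       // should already have split
        assert(!shouldSplit(staleDefault, chunk));  // but the stale shard won't
        return 0;
    }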

There are two things that need to be considered:

  1. When to do the first refresh from the config server
  2. How often to refresh from the config server

For the purposes of tests, all we need is for the first refresh to happen when the shard is added to the cluster with the addShard command. My first thought was to piggy-back on the addShard path in the shard, but no blocking/networking calls can be made in ShardingState::initializeFromShardIdentity because it's under a lock.

So my second thought was this code review, which uses ChunkSplitter::setReplicaSetMode (which is called in initializeFromShardIdentity) to launch a thread that does the refresh, which will also occur whenever step-up occurs. The problem with this is that it's asynchronous: it can fail tests non-deterministically, and it can cause undesirable behavior for customers if the refresh stalls for whatever reason while writes are happening.
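
Roughly, the asynchronous version looks like the following, with refreshBalancerConfig() as a hypothetical stand-in for the blocking fetch; the gap between launching the thread and its completion is exactly where the non-determinism lives.

    #include <thread>

    // Stand-in for the blocking fetch from the config server.
    void refreshBalancerConfig() { /* network round trip */ }

    void setReplicaSetMode() {
        // ... existing initialization, under a lock, so no blocking calls here ...

        // Fire-and-forget: the refresh happens eventually, on its own thread.
        std::thread([] { refreshBalancerConfig(); }).detach();

        // Returns immediately. Nothing orders this refresh before the first
        // inserts, so a test (or customer workload) that writes right after
        // addShard can still observe the default balancer settings.
    }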

The stall problem can be mitigated within the same design by having the ChunkSplitter not do any splits until this refresh is done. That would not, however, block inserts, which means it wouldn't necessarily fix our tests (though it would fix the case where a customer increases the chunk size and doesn't want splits to keep happening until the configuration is refreshed).

Another option would be a blocking, test-only command refreshBalancerConfigNow, used after adding shards in ShardingTest to fix the testing issues, combined with the previous idea for setReplicaSetMode (which is currently only called on addShard, but could be made to happen on onStepUp as well, though there's no reason onStepUp couldn't just block and do the refresh).

I think my favorite idea is a test-only command that blocks and does a refresh, to fix tests. Then we'd have the ChunkSplitter start a thread on the first call to setReplicaSetMode to do periodic refreshes, stop that thread on onStepDown, and restart it whenever onStepUp is called. Until that thread's very first refresh completes, we'd ignore split requests as described above, gated by a flag like "balancerConfigIsInitialized" (sketched below).
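
The gating half of that plan could be as simple as the following; the flag name comes from the comment above, everything else is illustrative:

    #include <atomic>

    // Set once the periodic refresher has completed its first refresh.
    std::atomic<bool> balancerConfigIsInitialized{false};

    void onFirstRefreshComplete() {
        balancerConfigIsInitialized.store(true);
    }

    void requestChunkSplit(/* chunk identity */) {
        // Ignore split requests until we know the real configuration, so we
        // never act on the stale default max chunk size.
        if (!balancerConfigIsInitialized.load())
            return;
        // ... evaluate the split against the refreshed config ...
    }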

renctan schwerin Do you have any better ideas?
