[SERVER-30058] Balancer policy should not move chunks off shards on 'size exceeded' conditions
Created: 07/Jul/17  Updated: 30/Oct/23  Resolved: 13/Jul/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.6, 3.5.9
Fix Version/s: 3.4.7, 3.5.10

Type: Bug
Priority: Major - P3
Reporter: Kaloian Manassiev
Assignee: Kaloian Manassiev
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to DOCS-12029 Add policy checking description to mo... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.4
Sprint: Sharding 2017-07-31

 Description   

The inclusion of the shard size exceeded check as a pre-condition for deciding whether to move chunks off a shard is a bug, which was introduced in 3.4 as part of this commit.

Such a policy can cause chunks to move off a shard and then back onto it in subsequent balancer rounds, so it should be removed.



 Comments   
Comment by Githook User [ 17/Jul/17 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-30058 Balancer policy should not move chunks off shards on 'size exceeded' conditions

(cherry picked from commit 411114cb3ec3119bb159b29b8ef65292e4d20de3)
Branch: v3.4
https://github.com/mongodb/mongo/commit/03f5fa3484cf001c21d48230590a97fafc1a89ec

Comment by Alyson Cabral (Inactive) [ 13/Jul/17 ]

Yes, Kal. We will continue checking the other policies with the exception of maxSize.

Comment by Githook User [ 13/Jul/17 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-30058 Balancer policy should not move chunks off shards on 'size exceeded' conditions
Branch: master
https://github.com/mongodb/mongo/commit/411114cb3ec3119bb159b29b8ef65292e4d20de3

Comment by Kaloian Manassiev [ 12/Jul/17 ]

After discussion with asya and alyson.cabral, we found out that the inclusion of the shard size exceeded check as a condition to move chunks off a shard is a bug, which was introduced in 3.4 as part of this commit. Such a policy has the potential to cause chunks to move off a shard and then back to it on subsequent rounds.

With the exception of the above condition, we decided that manual moveChunk commands should continue to perform the remaining policy checks, such as not moving chunks to draining shards and not violating zones.
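
A minimal Python sketch of the checks described above, for illustration only. The names `ShardInfo` and `check_manual_move` and the simplified data model are hypothetical and do not reflect the server's actual implementation; the only points taken from this ticket are that the draining and zone checks remain and that the 'size exceeded' condition is not consulted.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ShardInfo:
    # Hypothetical, simplified view of a shard (not a real MongoDB type).
    name: str
    draining: bool = False
    zones: Set[str] = field(default_factory=set)
    max_size_exceeded: bool = False  # deliberately ignored by the check below


def check_manual_move(chunk_zone: Optional[str], target: ShardInfo) -> Optional[str]:
    """Return a rejection reason, or None if the manual move is allowed."""
    if target.draining:
        return "target shard is draining"
    if chunk_zone is not None and chunk_zone not in target.zones:
        return "move would violate zone policy"
    # No maxSize check here: per this ticket, 'size exceeded' is not a
    # condition that should be evaluated for this decision.
    return None
```

In this sketch, a target shard whose `max_size_exceeded` flag is set is still accepted as long as it is not draining and the zone matches.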

asya/alyson.cabral, can you please confirm this?

Comment by Kaloian Manassiev [ 10/Jul/17 ]

Looks like this was accidentally introduced as part of the fix for SERVER-26579 and is a bug. As part of the changes for this bug we will remove this check.

Comment by Asya Kamsky [ 10/Jul/17 ]

Why is it checking "size is not exceeded" on a manual moveChunk command?

According to our docs, maxSize is only considered by the balancer when selecting a target shard; it shouldn't be considered for manual chunk migrations, should it? See https://docs.mongodb.com/manual/tutorial/manage-sharded-cluster-balancer/#change-the-maximum-storage-size-for-a-given-shard

Comment by Kaloian Manassiev [ 10/Jul/17 ]

Yes, it will try to move them to a conformant place on the next round. The conformance precedence order is:

  1. Move chunks out of draining shards
  2. Fix shard max size violations
  3. Fix zone policy violations
  4. Fix imbalance
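
The precedence order above can be sketched roughly as follows. This is a simplified Python illustration, not the server's code: the `Shard` fields, the `next_balancing_reason` function, and the `imbalance_threshold` parameter are all assumptions made for the example. It reflects the order as listed in this comment, including step 2, which this ticket goes on to identify as a bug.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Shard:
    # Hypothetical, simplified shard state for illustration.
    name: str
    chunk_count: int = 0
    draining: bool = False
    max_size_exceeded: bool = False
    zone_violations: int = 0  # chunks held that belong to a different zone


def next_balancing_reason(
    shards: List[Shard], imbalance_threshold: int = 2
) -> Optional[Tuple[str, str]]:
    """Return (reason, source shard name) for the highest-priority issue."""
    # 1. Move chunks out of draining shards.
    for s in shards:
        if s.draining and s.chunk_count > 0:
            return ("draining", s.name)
    # 2. Fix shard max size violations (the check this ticket removes).
    for s in shards:
        if s.max_size_exceeded and s.chunk_count > 0:
            return ("max_size", s.name)
    # 3. Fix zone policy violations.
    for s in shards:
        if s.zone_violations > 0:
            return ("zone_violation", s.name)
    # 4. Fix imbalance between the most and least loaded shards.
    if shards:
        most = max(shards, key=lambda s: s.chunk_count)
        least = min(shards, key=lambda s: s.chunk_count)
        if most.chunk_count - least.chunk_count >= imbalance_threshold:
            return ("imbalance", most.name)
    return None
```

Each step is only reached when every earlier step finds nothing to fix, which is what gives draining shards priority over all other conditions.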

Comment by Alyson Cabral (Inactive) [ 10/Jul/17 ]

kaloian.manassiev After making manual chunk moves, what's the behavior of the balancer? Would the balancer prioritize moving these chunks to conform with the balancer policies?

Comment by Kaloian Manassiev [ 07/Jul/17 ]

alyson.cabral - do you think this policy check is something we should continue doing, or should we remove it for parity with pre-3.4?

If we end up doing SERVER-30060, this might be less of a performance problem, but I am curious whether this proactive policy check on manual moves is beneficial for customers or whether it causes more harm than good.
