[SERVER-16802] Order of balancer chunk moves depends on order of config.collections Created: 12/Jan/15  Updated: 30/May/18  Resolved: 16/Mar/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.8.0-rc4
Fix Version/s: 3.4.15, 3.6.4, 3.7.4

Type: Improvement Priority: Major - P3
Reporter: Kevin Pulo Assignee: Kevin Pulo
Resolution: Done Votes: 1
Labels: balancer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6, v3.4
Sprint: Sharding 2018-02-26, Sharding 2018-03-12, Sharding 2018-03-26
Participants:
Case:

 Description   
Background

Currently, Balancer::_doBalanceRound() and BalancerPolicy::balance() together find candidate chunks with the following pseudocode:

  • For each collection in config.collections:
    • Look for chunks on draining shards
    • Look for tag violating chunks
    • Look for imbalance within each tag
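
The single-pass structure above can be sketched roughly as follows. This is an illustrative Python model only, not the actual Balancer/BalancerPolicy C++ code; the function and field names (candidate_move, draining_chunk, tag_violation, imbalanced) are hypothetical stand-ins for the real checks:

```python
def candidate_move(coll):
    """Return the first applicable move for one collection, or None.
    Within a single collection, drain > tag violation > imbalance."""
    if coll.get("draining_chunk"):
        return ("drain", coll["ns"])
    if coll.get("tag_violation"):
        return ("tag", coll["ns"])
    if coll.get("imbalanced"):
        return ("imbalance", coll["ns"])
    return None

def balance_round(collections):
    """Collections are visited in config.collections order, so an
    imbalance move for an early collection is scheduled before a
    drain move for a later one."""
    return [m for m in (candidate_move(c) for c in collections) if m]

moves = balance_round([
    {"ns": "db.a", "imbalanced": True},
    {"ns": "db.b", "draining_chunk": True},
])
# the imbalance move for db.a is scheduled ahead of the drain move for db.b
```

Note that the priority ordering only holds within each collection; across collections, the config.collections ordering dominates, which is the subject of this ticket.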
Problem

This approach means that the balancer can attempt "regular" imbalance chunk moves before tag violation chunk moves, which in turn can come before shard drain moves. This is counter-intuitive, because users expect:

  • chunks to commence moving off draining shards very soon after removeShard has been run, and
  • tag violation chunks to be moved ahead of regular imbalance moves.
Impact

This is worsened by:

  1. Many sharded collections are imbalanced, but the shard being drained has chunks only in collections near the end of the list (e.g. because of tags). The user then observes many "irrelevant" non-draining chunk moves before (and in between) the draining chunk moves, and the pattern repeats each balancing round.
  2. Earlier lower-priority moves cannot complete because the TO shard cannot satisfy the w:majority pre-commit check. Each such chunk move fails only after 10 hours, pushing the higher-priority moves much later. For a shard drain, this can make the drain take a very long time, while the system outwardly appears nearly idle (very little balance/drain/move activity). Again, this is counter-intuitive, because the user believes they have just instructed the system to start draining (but it "isn't").
Suggestion

Perhaps the code could be rearranged along the lines of:

  • For each collection in config.collections:
    • Look for chunks on draining shards
  • For each remaining collection:
    • Look for tag violating chunks
  • For each remaining collection:
    • Look for imbalance within each tag

Each collection would still have at most one chunk move per balancing round, which should allow lower-priority moves to make progress (e.g. the balancer won't be "hogged" when draining a large shard, or when adding new tags). But it would also have the benefit of ensuring that — irrespective of the order of config.collections — draining moves are given priority over tag violation moves, which are given priority over imbalance moves.
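The suggested three-pass rearrangement could look roughly like the sketch below. Again this is a hypothetical Python model (the predicate names and the dict-based collection representation are illustrative), not the actual C++ implementation; it exists only to show how the at-most-one-move-per-collection property survives the reordering:

```python
def balance_round_prioritized(collections):
    """Three passes over config.collections: all draining moves first,
    then tag-violation moves, then imbalance moves. A collection that
    already produced a move is skipped in later passes, preserving the
    at-most-one-move-per-collection-per-round property."""
    moves, done = [], set()
    passes = [
        (lambda c: c.get("draining_chunk"), "drain"),
        (lambda c: c.get("tag_violation"), "tag"),
        (lambda c: c.get("imbalanced"), "imbalance"),
    ]
    for picker, kind in passes:
        for coll in collections:
            if coll["ns"] not in done and picker(coll):
                moves.append((kind, coll["ns"]))
                done.add(coll["ns"])
    return moves

moves = balance_round_prioritized([
    {"ns": "db.a", "imbalanced": True},
    {"ns": "db.b", "draining_chunk": True},
])
# the drain move for db.b now comes first, despite db.a appearing
# earlier in config.collections
```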



 Comments   
Comment by Githook User [ 26/Mar/18 ]

Author:

{'email': 'kevin.pulo@mongodb.com', 'name': 'Kevin Pulo', 'username': 'devkev'}

Message: SERVER-16802 SERVER-28981 Balancer consider shards and collections in random order

(cherry picked from commit 651b3e017ce880d9ddbebb400af621c61d8c7389)
Branch: v3.4
https://github.com/mongodb/mongo/commit/af4ea84a50cf35473c77ea2b27f19c63afd43bc1

Comment by Githook User [ 19/Mar/18 ]

Author:

{'email': 'kevin.pulo@mongodb.com', 'name': 'Kevin Pulo', 'username': 'devkev'}

Message: SERVER-16802 SERVER-28981 Balancer consider shards and collections in random order

(cherry picked from commit 9850f1f190f13fb5bfd229e35d55d8fee3adc58f)
Branch: v3.6
https://github.com/mongodb/mongo/commit/651b3e017ce880d9ddbebb400af621c61d8c7389

Comment by Githook User [ 16/Mar/18 ]

Author:

{'email': 'kevin.pulo@mongodb.com', 'name': 'Kevin Pulo', 'username': 'devkev'}

Message: SERVER-16802 SERVER-28981 Balancer consider shards and collections in random order
Branch: master
https://github.com/mongodb/mongo/commit/9850f1f190f13fb5bfd229e35d55d8fee3adc58f

Comment by Kaloian Manassiev [ 16/May/16 ]

With the changes to move the balancer to the CSRS primary, we split the balancer loop into a "policy" part, which returns the list of all chunks that need to be moved in that round. Currently this list is still sorted by collection namespace, but it would be fairly easy to additionally sort it first by draining status, then by tag violation status, and finally by collection name.
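
The sort described above can be expressed as a compound key. A minimal sketch, assuming each candidate move carries hypothetical 'ns', 'draining', and 'tag_violation' fields (these names are illustrative, not the actual policy's data structures):

```python
def sort_moves(moves):
    """Order candidate moves: draining first, then tag violations,
    then by collection namespace. Python sorts False before True,
    so negating the flags puts flagged moves ahead."""
    return sorted(moves, key=lambda m: (not m["draining"],
                                        not m["tag_violation"],
                                        m["ns"]))

ordered = sort_moves([
    {"ns": "db.a", "draining": False, "tag_violation": False},
    {"ns": "db.b", "draining": True,  "tag_violation": False},
    {"ns": "db.c", "draining": False, "tag_violation": True},
])
# ordered namespaces: db.b (draining), db.c (tag violation), db.a
```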
