  1. Core Server
  2. SERVER-82325

Config server could invariant during balancer round

    • Fully Compatible
    • ALL
    • v7.0, v6.0, v5.0, v4.4, v4.2
    • Sharding EMEA 2023-10-30, CAR Team 2023-11-13


      In SERVER-40459 we changed the logic used by the balancer to decide which chunks to move in a specific balancer round. The new code contains a bug that can cause the balancer to schedule more than one migration with the same donor shard in a single round.
      When this happens, the balancer will hit an invariant and the primary of the config server will shut down, triggering a new primary election.
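
To make the failure condition concrete, here is a minimal Python sketch of the property the balancer relies on: within one round, no shard should appear as the donor of more than one migration. This is not the server's code; `Migration`, `donors_scheduled_more_than_once`, and the chunk and shard names are hypothetical and exist only for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Migration(NamedTuple):
    """Hypothetical description of one scheduled chunk migration."""
    chunk: str      # illustrative chunk identifier
    donor: str      # shard the chunk moves from
    recipient: str  # shard the chunk moves to


def donors_scheduled_more_than_once(migrations):
    """Return, sorted, the donor shards used by more than one migration."""
    counts = Counter(m.donor for m in migrations)
    return sorted(shard for shard, n in counts.items() if n > 1)


# A round plan of the kind described above: two migrations share a donor.
round_plan = [
    Migration("chunk-A", donor="shard0", recipient="shard1"),
    Migration("chunk-B", donor="shard0", recipient="shard2"),
]

duplicated = donors_scheduled_more_than_once(round_plan)
print(duplicated)
```

A non-empty result corresponds to the situation in which the balancer would hit the invariant.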

      Required conditions

      Two code paths can lead to this bug. In both cases, the following conditions must be met in order to hit the invariant:

      • Sharded cluster
      • Balancer enabled
      • At least 4 shards

      Moreover, depending on the code path, there are specific conditions that need to be met:

      • Shard removal
        • At least one shard being drained
        • At least one zone configured on the draining shard
        • The draining shard has at least two chunks belonging to different zones that can be moved in the same round to two different recipient shards.
          Note: chunks that are not completely contained within any of the configured zones are considered to belong to the special "no-zone".
      • Zone enforcing
        • At least two chunks residing on the same shard.
        • The chunks belong to two different zones, neither of which is associated with the shard.
        • The two chunks can be moved in the same balancer round.
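
The note about the special "no-zone" can be illustrated with a short Python sketch. It assumes integer shard key values and half-open ranges `[min, max)` for both chunks and zones; `zone_for_chunk`, `NO_ZONE`, and the zone names are hypothetical and are not part of the server.

```python
NO_ZONE = "no-zone"  # illustrative label for the special zone


def zone_for_chunk(chunk_min, chunk_max, zones):
    """Return the zone that fully contains [chunk_min, chunk_max), else NO_ZONE.

    `zones` maps a zone name to its (zone_min, zone_max) range.
    """
    for name, (zone_min, zone_max) in zones.items():
        if zone_min <= chunk_min and chunk_max <= zone_max:
            return name
    # Chunks that only partially overlap a zone, or overlap none, land here.
    return NO_ZONE


zones = {"zoneA": (0, 100), "zoneB": (100, 200)}

print(zone_for_chunk(10, 50, zones))    # fully inside zoneA
print(zone_for_chunk(90, 110, zones))   # straddles the zoneA/zoneB boundary
print(zone_for_chunk(250, 300, zones))  # outside every zone
```

Under these assumptions, a chunk that straddles a zone boundary counts as belonging to a different zone from a chunk fully inside `zoneA`, so the two could satisfy the "different zones" condition listed above.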

      Technical description


      Affected versions

      The only releases affected by this bug are:

      • 6.0.11
      • 4.4.25

            tommaso.tocci@mongodb.com Tommaso Tocci