Core Server / SERVER-29599

Balancer never relinquishes lock

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 3.4.4
    • Component/s: Sharding
    • Environment:
      3.4.4 sharded cluster with 18 shards, each consisting of 1 replica, 1 primary, and 1 hidden replica. 3 config servers (CSRS) and 5 mongos instances.
    • ALL
    • Steps to Reproduce:
      1. Stop the balancer
      2. Wait for the balancer to finish its migration and stop
      3. Check locks collection for the balancer lock

      Let me know if you need more information to help reproduce. I'm not sure what you need now but I'm sure you'll need something.

      After upgrading our main mongo cluster from 3.2.12 to 3.4.4, we've noticed a weird behavior where the balancer never relinquishes its lock. I can run sh.isBalancerRunning() and sh.getBalancerState(), both of which return false, but the balancer lock still shows a state of "2".
      I found this using:

      db.getSiblingDB("config").locks.findOne({_id: "balancer"}).state

      I've checked the changelog collection and haven't found any evidence there that the balancer is still actually running.
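
For anyone trying to reproduce this, the checks described above can be gathered into one mongo shell snippet. This is only a sketch: `describeBalancerLock` is a hypothetical helper (not part of the shell), and the state labels assume the legacy `config.locks` convention of 0 = unlocked, 1 = contended, 2 = held. Given the "Works as Designed" resolution, a state of 2 on 3.4 may not by itself mean a migration is in progress, so the output is best read alongside the changelog entries.

```javascript
// Hypothetical helper, not part of the mongo shell API.
// Assumed state meanings (legacy config.locks convention):
// 0 = unlocked, 1 = contended, 2 = held.
function describeBalancerLock(lockDoc) {
  if (!lockDoc) {
    return "no balancer lock document found";
  }
  var labels = { 0: "unlocked", 1: "contended", 2: "held" };
  var known = Object.prototype.hasOwnProperty.call(labels, lockDoc.state);
  var label = known ? labels[lockDoc.state] : "unknown";
  return "state " + lockDoc.state + " (" + label + ")";
}

// The block below only runs inside the mongo shell, where `db` and `sh` exist.
if (typeof db !== "undefined" && typeof sh !== "undefined") {
  var configDB = db.getSiblingDB("config");
  var lock = configDB.locks.findOne({ _id: "balancer" });

  print("balancer enabled: " + sh.getBalancerState());
  print("balancer running: " + sh.isBalancerRunning());
  print("balancer lock:    " + describeBalancerLock(lock));

  // Most recent chunk-migration events, newest first.
  configDB.changelog
    .find({ what: /^moveChunk/ })
    .sort({ time: -1 })
    .limit(5)
    .forEach(printjson);
}
```

Run it while connected to a mongos. If the balancer reports as neither enabled nor running and the changelog shows no recent moveChunk events, the lock state is the only reading suggesting activity.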

      We have also had a long-standing problem moving chunks in this cluster because of mismatched index definitions across the shards. We are blocked from repairing these by another bug with dropping indexes, which I'll log separately and link to this issue.

      We turn off the balancer every night to do some system maintenance. For now we've had to free the balancer lock manually; otherwise the maintenance gets stuck waiting for the balancer to finish its migration.

      On a possibly related note, I've had to fix this balancer lock a few times in the past few days, so either some process on our end keeps re-enabling the balancer, or the lock keeps getting re-established on its own.

            Assignee: Kaloian Manassiev (kaloian.manassiev@mongodb.com)
            Reporter: Scott Glajch (glajchs)
            Votes: 0
            Watchers: 6
