Core Server / SERVER-7260

Balancer lock is not relinquished

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: 2.0.7, 2.2.0
    • Fix Version/s: None
    • Component/s: Sharding
    • Operating System: ALL

      Description

      Under certain conditions, the balancer lock may never be relinquished. One case appears to have occurred when the balancer state was disabled during a chunk migration:

      mongos> db.locks.findOne({_id:"balancer"});
      {
              "_id" : "balancer",
              "process" : "r5.10gen.cc:27017:1349297686:1804289383",
              "state" : 2,
              "ts" : ObjectId("506cae1f13bf56db8d1b0856"),
              "when" : ISODate("2012-10-03T21:29:03.359Z"),
              "who" : "r5.10gen.cc:27017:1349297686:1804289383:Balancer:846930886",
              "why" : "doing balance round"
      }

      mongos> db.changelog.find().sort({$natural:-1}).limit(10).skip(10).pretty()
      {
              "_id" : "r5.10gen.cc-2012-10-03T21:30:05-17",
              "server" : "r5.10gen.cc",
              "clientAddr" : "127.0.0.1:57957",
              "time" : ISODate("2012-10-03T21:30:05.136Z"),
              "what" : "moveChunk.from",
              "ns" : "sh.test",
              "details" : {
                      "min" : {
                              "id" : "16540452295883480447516388304186410329865247257024"
                      },
                      "max" : {
                              "id" : "22754752024366413683521379069776306796548182491720"
                      },
                      "step1 of 6" : 0,
                      "step2 of 6" : 305,
                      "step3 of 6" : 378,
                      "step4 of 6" : 32007,
                      "step5 of 6" : 4542,
                      "step6 of 6" : 24280
              }
      }
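
      To put the step timings in the changelog entry above in perspective, the sketch below sums them. It assumes each "stepN of 6" value is a duration in milliseconds; the helper name totalMigrationMillis is hypothetical and is not part of the mongo shell.

```javascript
// Hypothetical helper: add up the per-step timings of a moveChunk.from
// changelog entry. Assumption: each "stepN of M" value is in milliseconds.
function totalMigrationMillis(details) {
  return Object.keys(details)
    .filter(function (key) { return /^step\d+ of \d+$/.test(key); })
    .reduce(function (sum, key) { return sum + details[key]; }, 0);
}

// The "details" sub-document copied from the changelog entry above.
var details = {
  "min": { "id": "16540452295883480447516388304186410329865247257024" },
  "max": { "id": "22754752024366413683521379069776306796548182491720" },
  "step1 of 6": 0,
  "step2 of 6": 305,
  "step3 of 6": 378,
  "step4 of 6": 32007,
  "step5 of 6": 4542,
  "step6 of 6": 24280
};

var totalMs = totalMigrationMillis(details); // 61512
```

      Under that assumption, this migration took roughly a minute in total.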

      Note that the output above was taken 15 hours after the last moveChunk was logged to the config server. It is unclear whether the mongos process holding the lock was killed before it had a chance to release the lock.

      The net effect is that sh.isBalancerRunning() never returns false, even if the balancer is no longer running.
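
      One way to tell a stuck lock from a live one is to compare the lock document's "when" timestamp against the current time. The following is a minimal sketch, not the server's own logic: the helper name looksStale and the one-hour threshold are assumptions chosen for illustration, and it treats any non-zero "state" as "lock held".

```javascript
// Hypothetical helper: decide whether a config.locks document looks stale.
// maxAgeMs is an assumed threshold, not a value taken from the server.
function looksStale(lockDoc, now, maxAgeMs) {
  if (!lockDoc || lockDoc.state === 0) {
    return false; // missing or unlocked: nothing to be stale
  }
  return now.getTime() - lockDoc.when.getTime() > maxAgeMs;
}

// Fields copied from the lock document in the description.
var lock = {
  _id: "balancer",
  state: 2,
  when: new Date("2012-10-03T21:29:03.359Z")
};

// Simulate the observation made 15 hours after the lock was taken.
var fifteenHoursLater = new Date(lock.when.getTime() + 15 * 60 * 60 * 1000);
var stale = looksStale(lock, fifteenHoursLater, 60 * 60 * 1000); // true
```

      In the mongo shell, the same function could be applied to the result of db.getSiblingDB("config").locks.findOne({_id:"balancer"}) with new Date() as the current time. A stale result only suggests that the lock holder is gone; it does not prove it.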

              People

              • Votes: 4
              • Watchers: 11
