Under certain conditions, the balancer lock may never be relinquished. One case appears to have occurred when the balancer state was disabled during a chunk migration:
mongos> db.locks.findOne({_id:"balancer"});
{
    "_id" : "balancer",
    "process" : "r5.10gen.cc:27017:1349297686:1804289383",
    "state" : 2,
    "ts" : ObjectId("506cae1f13bf56db8d1b0856"),
    "when" : ISODate("2012-10-03T21:29:03.359Z"),
    "who" : "r5.10gen.cc:27017:1349297686:1804289383:Balancer:846930886",
    "why" : "doing balance round"
}
mongos> db.changelog.find().sort({$natural:-1}).limit(10).skip(10).pretty()
{
    "_id" : "r5.10gen.cc-2012-10-03T21:30:05-17",
    "server" : "r5.10gen.cc",
    "clientAddr" : "127.0.0.1:57957",
    "time" : ISODate("2012-10-03T21:30:05.136Z"),
    "what" : "moveChunk.from",
    "ns" : "sh.test",
    "details" : {
        "min" : { "id" : "16540452295883480447516388304186410329865247257024" },
        "max" : { "id" : "22754752024366413683521379069776306796548182491720" },
        "step1 of 6" : 0,
        "step2 of 6" : 305,
        "step3 of 6" : 378,
        "step4 of 6" : 32007,
        "step5 of 6" : 4542,
        "step6 of 6" : 24280
    }
}
Note that the above output was taken 15 hours after the last moveChunk was logged to the config server. It's unclear whether the mongos process holding the lock was killed before it had a chance to release it.
The net effect is that sh.isBalancerRunning() never returns false, even if the balancer is no longer running.
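As a rough way to spot this condition, the lock document can be compared against the most recent migration entry in the changelog. The sketch below is only an illustration: isBalancerLockStale and the one-hour threshold are hypothetical and not part of the mongo shell. It assumes a non-zero "state" means the lock is taken, and that "when" and "time" are Date values as in the output above.

```javascript
// Hypothetical helper (not a mongo shell built-in): flags a balancer lock
// that is still marked as taken although nothing has happened for a while.
// Assumption: a non-zero "state" means the lock is taken.
function isBalancerLockStale(lockDoc, lastMigration, now, maxIdleMs) {
    if (!lockDoc || lockDoc.state === 0) {
        return false; // no lock document, or lock not taken
    }
    // Most recent sign of activity: the lock timestamp or the last
    // changelog entry, whichever is later.
    var lastActivity = lockDoc.when.getTime();
    if (lastMigration && lastMigration.time.getTime() > lastActivity) {
        lastActivity = lastMigration.time.getTime();
    }
    return now.getTime() - lastActivity > maxIdleMs;
}

// Values taken from the output above; "now" is 15 hours after the last
// logged moveChunk, and the one-hour threshold is an arbitrary choice.
var lock = { _id: "balancer", state: 2, when: new Date("2012-10-03T21:29:03.359Z") };
var lastMove = { what: "moveChunk.from", time: new Date("2012-10-03T21:30:05.136Z") };
var now = new Date(lastMove.time.getTime() + 15 * 60 * 60 * 1000);
var ONE_HOUR_MS = 60 * 60 * 1000;

var stale = isBalancerLockStale(lock, lastMove, now, ONE_HOUR_MS);
```

Against a live cluster, the two documents would presumably come from the locks and changelog collections of the config database, queried through a mongos as in the session above. A result of true only suggests the lock was left behind; it does not prove that the holding process is gone.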
Related to: SERVER-14996 Locks not released if config server goes down / balancer operation and moveChunk command stop working (Closed)