  1. Core Server
  2. SERVER-24431

Collection lock not released after mongod failure

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 3.0.4
    • Component/s: Sharding
    • ALL

      I found a few entries that were similar to this but not quite the same.

      Configuration:
      5 shards
      3 config
      1 mongos

      A user had a runaway process that was inserting far too many documents into a collection. Everything worked properly until we ran out of disk space on one of the shards. When the mongod instance on that shard went down, it was holding a collection lock for a migration. After freeing some disk space and restarting the mongod instance, sh.status() indicated that the balancer was running, but chunks were not being migrated.

      After some reading and searching, the problem appeared to be related to the locks. When I looked at the locks in the config database, I found that two locks were being held (state = 2): one on the balancer and one on a collection. The description on the collection lock indicated that it was a migration lock held by the shard that went down. After setting the lock state to 0 for both of these entries, the balancer resumed normal operation and started to migrate chunks. I may have had to restart the mongod or some of the shards, but I am not sure.
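
      The inspection and reset described above can be sketched roughly as follows. This is a minimal illustration of the reporter's workaround, not a recommended recovery procedure. The helper names findHeldLocks and buildReleaseUpdate are hypothetical, and the only fields assumed on a lock document are _id and the numeric state mentioned in the report (2 = held, 0 = free).

```javascript
// Sketch only: these helpers are hypothetical and are not part of the mongo shell.
// Assumption (from the report): a lock document with state 2 is held, state 0 is free.

// Return the lock documents that are currently marked as held.
function findHeldLocks(lockDocs) {
  return lockDocs.filter(function (doc) {
    return doc.state === 2;
  });
}

// Build the filter and update documents that would mark one held lock as free.
// The filter re-checks state so a lock that changed in the meantime is left alone.
function buildReleaseUpdate(lockDoc) {
  return {
    filter: { _id: lockDoc._id, state: 2 },
    update: { $set: { state: 0 } }
  };
}

// Possible use from a mongo shell connected to the mongos (not executed here):
//   var configDB = db.getSiblingDB("config");
//   var held = findHeldLocks(configDB.locks.find().toArray());
//   held.forEach(function (lock) {
//     var u = buildReleaseUpdate(lock);
//     configDB.locks.update(u.filter, u.update);
//   });
```

      Before clearing any entry this way, it seems prudent to confirm that the process recorded as holding the lock is really gone, since releasing a lock that is still legitimately in use could let two migrations overlap.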

      It seems there should be some sort of recovery mechanism for the case where a shard fails while it is holding a lock.

            Assignee:
            kelsey.schubert@mongodb.com Kelsey Schubert
            Reporter:
            bmwmaestoso bob whitehurst
            Votes:
            0
            Watchers:
            5

              Created:
              Updated:
              Resolved: