Core Server / SERVER-11694

Balancer not working


Details

    • Type: Bug
    • Resolution: Done
    • Priority: Blocker - P1
    • None
    • 2.4.1
    • Sharding
    • None
    • ALL

    Description

      I have a two-shard setup (shard0, shard1) on AWS. I added a new shard (shard2), but the balancer is unable to move chunks to it.

      The log shows repeated "migration already in progress" errors:
      Thu Nov 14 00:29:50.406 [Balancer] ns: chat.chats_development going to move { _id: "chat.chats_development-target_pid_MinKey", lastmod: Timestamp 2000|2, lastmodEpoch: ObjectId('5282724e57add7686f91ac12'), ns: "chat.chats_development", min: { target_pid: MinKey }, max: { target_pid: -4611686018427387902 }, shard: "Shard-0" } from: Shard-0 to: Shard-2 tag []
      Thu Nov 14 00:29:50.408 [mongosMain] connection accepted from 10.84.154.211:43907 #24556 (34 connections now open)
      Thu Nov 14 00:29:50.408 [conn24556] end connection 10.84.154.211:43907 (33 connections now open)
      Thu Nov 14 00:29:50.409 [Balancer] ns: chat.chats_staging going to move { _id: "chat.chats_staging-target_pid_MinKey", lastmod: Timestamp 2000|2, lastmodEpoch: ObjectId('5282725857add7686f91ac14'), ns: "chat.chats_staging", min: { target_pid: MinKey }, max: { target_pid: -4611686018427387902 }, shard: "Shard-0" } from: Shard-0 to: Shard-2 tag []
      Thu Nov 14 00:29:50.409 [Balancer] moving chunk ns: chat.chats_production moving ( ns:chat.chats_productionshard: Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017lastmod: 3|8||000000000000000000000000min: { target_pid: -6914850372026113762 }max: { target_pid: -5776441268022973039 }) Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017 -> Shard-2:Shard-2/SG-m1largetest2-1534.servers.mongodirector.com:27017,SG-m1largetest2-1535.servers.mongodirector.com:27017
      Thu Nov 14 00:29:50.409 [mongosMain] connection accepted from 10.119.39.164:56401 #24557 (34 connections now open)
      Thu Nov 14 00:29:50.411 [conn24557] end connection 10.119.39.164:56401 (33 connections now open)
      Thu Nov 14 00:29:50.422 [mongosMain] connection accepted from 10.78.134.95:53117 #24558 (34 connections now open)
      Thu Nov 14 00:29:50.422 [conn24558] end connection 10.78.134.95:53117 (33 connections now open)
      Thu Nov 14 00:29:50.425 [mongosMain] connection accepted from 10.37.15.212:33571 #24559 (34 connections now open)
      Thu Nov 14 00:29:50.426 [conn24559] end connection 10.37.15.212:33571 (33 connections now open)
      Thu Nov 14 00:29:50.507 [Balancer] moveChunk result: { ok: 0.0, errmsg: "migration already in progress" }
      Thu Nov 14 00:29:50.508 [Balancer] balancer move failed: { ok: 0.0, errmsg: "migration already in progress" } from: Shard-0 to: Shard-2 chunk: min: { target_pid: -6914850372026113762 } max: { target_pid: -5776441268022973039 }
      Thu Nov 14 00:29:50.508 [Balancer] moving chunk ns: chat.chats_development moving ( ns:chat.chats_developmentshard: Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017lastmod: 2|2||000000000000000000000000min: { target_pid: MinKey }max: { target_pid: -4611686018427387902 }) Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017 -> Shard-2:Shard-2/SG-m1largetest2-1534.servers.mongodirector.com:27017,SG-m1largetest2-1535.servers.mongodirector.com:27017
      Thu Nov 14 00:29:50.575 [mongosMain] connection accepted from 10.98.227.148:37905 #24560 (34 connections now open)
      Thu Nov 14 00:29:50.577 [conn24560] end connection 10.98.227.148:37905 (33 connections now open)
      Thu Nov 14 00:29:50.577 [mongosMain] connection accepted from 10.12.191.175:34776 #24561 (34 connections now open)
      Thu Nov 14 00:29:50.579 [conn24561] end connection 10.12.191.175:34776 (33 connections now open)
      Thu Nov 14 00:29:50.613 [Balancer] moveChunk result: { ok: 0.0, errmsg: "migration already in progress" }
      Thu Nov 14 00:29:50.614 [Balancer] balancer move failed: { ok: 0.0, errmsg: "migration already in progress" } from: Shard-0 to: Shard-2 chunk: min: { target_pid: MinKey } max: { target_pid: -4611686018427387902 }
      Thu Nov 14 00:29:50.614 [Balancer] moving chunk ns: chat.chats_staging moving ( ns:chat.chats_stagingshard: Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017lastmod: 2|2||000000000000000000000000min: { target_pid: MinKey }max: { target_pid: -4611686018427387902 }) Shard-0:Shard-0/SG-m1largetest2-1522.servers.mongodirector.com:27017,SG-m1largetest2-1523.servers.mongodirector.com:27017 -> Shard-2:Shard-2/SG-m1largetest2-1534.servers.mongodirector.com:27017,SG-m1largetest2-1535.servers.mongodirector.com:27017
      Thu Nov 14 00:29:50.718 [Balancer] moveChunk result: { ok: 0.0, errmsg: "migration already in progress" }
      Thu Nov 14 00:29:50.719 [Balancer] balancer move failed: { ok: 0.0, errmsg: "migration already in progress" } from: Shard-0 to: Shard-2 chunk: min: { target_pid: MinKey } max: { target_pid: -4611686018427387902 }

      Here is the output of the locks collection:
      mongos> db.locks.find();

      { "_id" : "configUpgrade", "process" : "ip-10-157-51-190:27017:1384231658:1804289383", "state" : 0, "ts" : ObjectId("5281b2eab4966fecfd6e633d"), "when" : ISODate("2013-11-12T04:47:38.349Z"), "who" : "ip-10-157-51-190:27017:1384231658:1804289383:mongosMain:846930886", "why" : "upgrading config database to new format v4" }
      { "_id" : "balancer", "process" : "ip-10-157-51-190:27017:1384387524:1804289383", "state" : 0, "ts" : ObjectId("52841a15bac893571dc4c0b7"), "when" : ISODate("2013-11-14T00:32:21.869Z"), "who" : "ip-10-157-51-190:27017:1384387524:1804289383:Balancer:846930886", "why" : "doing balance round" }
      { "_id" : "chat.chats_production", "process" : "ip-10-83-63-165:27017:1384280537:396106300", "state" : 2, "ts" : ObjectId("5283de1fd365b49a9f02596d"), "when" : ISODate("2013-11-13T20:16:31.192Z"), "who" : "ip-10-83-63-165:27017:1384280537:396106300:conn9740:778452206", "why" : "migrate-{ target_pid: -6914850372026113762 }" }
      { "_id" : "chat.chats_development", "process" : "ip-10-181-159-145:27017:1384280542:1450290529", "state" : 0, "ts" : ObjectId("5282725074d386569e867f12"), "when" : ISODate("2013-11-12T18:24:16.705Z"), "who" : "ip-10-181-159-145:27017:1384280542:1450290529:conn35:1960032425", "why" : "split-{ target_pid: 0 }" }
      { "_id" : "chat.chats_staging", "process" : "ip-10-181-159-145:27017:1384280542:1450290529", "state" : 0, "ts" : ObjectId("5282725a74d386569e867f15"), "when" : ISODate("2013-11-12T18:24:26.389Z"), "who" : "ip-10-181-159-145:27017:1384280542:1450290529:conn35:1960032425", "why" : "split-{ target_pid: 0 }" }
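
      The locks output above can be scanned for entries that still look taken. The sketch below is only an illustration, not an official tool. It assumes that "state" : 0 means the lock is free and any other value means it is being acquired or held. The names heldLocks and sampleLocks are hypothetical, and the sample documents are trimmed copies of three of the entries above.

```javascript
// Hypothetical helper (not part of the mongo shell): returns the locks whose
// state is non-zero, with their age in whole minutes relative to `now`.
// Assumption: state 0 = free, non-zero = being acquired or held.
function heldLocks(lockDocs, now) {
  return lockDocs
    .filter(function (doc) { return doc.state !== 0; })
    .map(function (doc) {
      return {
        id: doc._id,
        state: doc.state,
        who: doc.who,
        why: doc.why,
        heldMinutes: Math.floor((now.getTime() - doc.when.getTime()) / 60000)
      };
    });
}

// Trimmed copies of three documents from the output above.
var sampleLocks = [
  { _id: "balancer", state: 0,
    when: new Date("2013-11-14T00:32:21.869Z"),
    who: "ip-10-157-51-190:27017:1384387524:1804289383:Balancer:846930886",
    why: "doing balance round" },
  { _id: "chat.chats_production", state: 2,
    when: new Date("2013-11-13T20:16:31.192Z"),
    who: "ip-10-83-63-165:27017:1384280537:396106300:conn9740:778452206",
    why: "migrate-{ target_pid: -6914850372026113762 }" },
  { _id: "chat.chats_development", state: 0,
    when: new Date("2013-11-12T18:24:16.705Z"),
    who: "ip-10-181-159-145:27017:1384280542:1450290529:conn35:1960032425",
    why: "split-{ target_pid: 0 }" }
];

// Use the balancer lock's timestamp as the reference time.
var held = heldLocks(sampleLocks, sampleLocks[0].when);
```

      With this sample, the only entry returned is chat.chats_production, taken by conn9740 for a migrate and still in state 2 more than four hours before the balancer's latest round. In a mongos shell the same helper could be fed live data with heldLocks(db.getSiblingDB("config").locks.find().toArray(), new Date()).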

          People

            Assignee: Unassigned
            Reporter: Dharshan Rangegowda (dharshanr@scalegrid.net)