Core Server / SERVER-7029

Mongos crash on replica set failover

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • Fix Version/s: None
    • Affects Version/s: 2.2.0
    • Component/s: Sharding
    • Labels: None
    • Operating System: ALL

      Sharded 4-node replica set with member priorities 2, 1, 0, 0.
      A transient failover from the priority-2 node to the priority-1 node and back to the priority-2 node causes mongos to segfault.
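For readers trying to reproduce the setup, the priority layout described above can be sketched as a replica set configuration document. This is only an illustration under assumptions: the set name `col03` and the hosts `server01c03` and `server02c03` appear in the log in this report, but which of them holds priority 2 is a guess, and the two priority-0 hosts are hypothetical names. The helper `build_replset_config` is likewise made up for this sketch. Members with priority 0 can never be elected primary, so only the first two nodes can trade the primary role.

```python
import json


def build_replset_config(set_name, hosts_with_priority):
    """Return a replica set config document with one member per (host, priority) pair."""
    members = [
        {"_id": i, "host": host, "priority": priority}
        for i, (host, priority) in enumerate(hosts_with_priority)
    ]
    return {"_id": set_name, "members": members}


config = build_replset_config(
    "col03",
    [
        ("server01c03:27017", 2),  # assumed to be the preferred primary
        ("server02c03:27017", 1),  # assumed to be the failover target
        ("server03c03:27017", 0),  # hypothetical host, never eligible as primary
        ("server04c03:27017", 0),  # hypothetical host, never eligible as primary
    ],
)

# The resulting document has the shape that replSetInitiate / replSetReconfig expect.
print(json.dumps(config, indent=2))
```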

      https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/NeeB86n9-JU

      Just had more nodes collapse, on a different column set this time. Here's the mongos log with logLevel: 2 turned on:

      Tue Sep 11 14:03:29 [conn1510] warning: splitChunk failed - cmd: { splitChunk: "catalog.feed_data_changelog", keyPattern: { retid: 1.0, feedid: 1.0, uniqid: 1.0, version: 1.0 }, min: { retid: 13712, feedid: 2669, uniqid: "287e1d9af8a592cfe0a80aa7b80df7ba", version: 1 }, max: { retid: 13712, feedid: 2669, uniqid: "4cf79bccacf4e71d2e73607677fea6e5", version: 1 }, from: "col03", splitKeys: [ { retid: 13712, feedid: 2669, uniqid: "38623d70f41a1a433b27e0f187219eb0", version: 3 } ], shardId: "catalog.feed_data_changelog-retid_13712feedid_2669uniqid_"287e1d9af8a592cfe0a80aa7b80df7ba"version_1", configdb: "servercfg1:27019,servercfg2:27019,servercfg3:27019" } result: { who: { _id: "catalog.feed_data_changelog", process: "server01c03:27017:1346968636:1431697445", state: 2, ts: ObjectId('504f71768fb2ef42a78a0c25'), when: new Date(1347383670849), who: "server01c03:27017:1346968636:1431697445:conn62819:2040432535", why: "migrate-{ retid: 8941, feedid: 8005, uniqid: "9cd5f39b3cfaeb79e8925eef346e68d0", version: 32 }" }, errmsg: "the collection's metadata lock is taken", ok: 0.0 }
      Tue Sep 11 14:03:29 [conn1510] ChunkManager: time to load chunks for catalog.feed_data_changelog: 5ms sequenceNumber: 191 version: 227|3||504aa7463a46fa0144cf6f5e based on: 227|3||504aa7463a46fa0144cf6f5e
      Tue Sep 11 14:03:29 [conn1510] warning: chunk manager reload forced for collection 'catalog.feed_data_changelog', config version is 227|3||504aa7463a46fa0144cf6f5e
      Tue Sep 11 14:04:19 [conn1510] end connection 127.0.0.1:52194 (2 connections now open)
      Tue Sep 11 14:04:20 [mongosMain] connection accepted from 127.0.0.1:52353 #1515 (3 connections now open)
      Tue Sep 11 14:04:35 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
      Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server01c03:27017
      Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
      Tue Sep 11 14:04:49 [WriteBackListener-server02c03:27017] DBClientCursor::init call() failed
      Tue Sep 11 14:04:49 [WriteBackListener-server02c03:27017] WriteBackListener exception : DBClientBase::findN: transport error: server02c03:27017 ns: admin.$cmd query: { writebacklisten: ObjectId('504e77846a941ccd587623c8') }
      Tue Sep 11 14:04:49 [conn1515] got not master for: server02c03:27017
      Received signal 11
      Backtrace: 0x8386d5 0x3bd7a302d0 0x2aaaab2bad80 
      /usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x8386d5]
      /lib64/libc.so.6[0x3bd7a302d0]
      [0x2aaaab2bad80]
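The log above shows the primary for `col03` changing three times within ten seconds just before the crash. A rough way to spot that pattern in a longer mongos log is sketched below. The function names, the 30-second window, and the threshold of three changes are all assumptions chosen for illustration, and the year is passed in explicitly because the log timestamps do not include one (2012 is inferred from the `when` date in the splitChunk warning).

```python
import re
from datetime import datetime, timedelta

# Matches ReplicaSetMonitorWatcher lines in the format shown in this report.
PRIMARY_CHANGE = re.compile(
    r"^\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) \[ReplicaSetMonitorWatcher\] "
    r"Primary for replica set (\S+) changed to (\S+)$"
)


def parse_primary_changes(lines, year=2012):
    """Return (timestamp, set_name, new_primary) for each primary-change line."""
    events = []
    for line in lines:
        m = PRIMARY_CHANGE.match(line.strip())
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((when, m.group(2), m.group(3)))
    return events


def find_flaps(events, window=timedelta(seconds=30), min_changes=3):
    """Return the set names with at least min_changes primary changes inside window."""
    by_set = {}
    for when, set_name, _host in events:
        by_set.setdefault(set_name, []).append(when)
    flapping = set()
    for set_name, times in by_set.items():
        times.sort()
        for i in range(len(times) - min_changes + 1):
            if times[i + min_changes - 1] - times[i] <= window:
                flapping.add(set_name)
                break
    return flapping


log = """\
Tue Sep 11 14:04:20 [mongosMain] connection accepted from 127.0.0.1:52353 #1515 (3 connections now open)
Tue Sep 11 14:04:35 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server01c03:27017
Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
"""

events = parse_primary_changes(log.splitlines())
print(sorted(find_flaps(events)))
```

Running the same functions over a full mongos log would show whether earlier crashes were also preceded by a rapid primary flip.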
      

            Assignee: Greg Studer (greg_10gen)
            Reporter: Gregor Macadam (gregor)
            Votes: 2
            Watchers: 10
