Core Server / SERVER-3517

Received signal 6 and mongos crashed

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 1.9.2
    • Affects Version/s: 1.8.2, 1.8.3
    • Component/s: Sharding
    • Labels: None
    • Environment:
      /usr/bin/mongos db version v1.8.3-rc0, pdfile version 4.5 starting
    • Operating System: Linux

      I set up 3 config servers, 3 mongod servers in one replica set, and 2 mongos instances. After running a sequence of operations (create collection -> enablesharding -> insert -> findandmodify / query / delete collection) several times, mongos received signal 6 and crashed. I hit this error on both 1.8.2 and 1.8.3-rc0.

      Log snippet follows:

      Tue Aug 2 08:18:39 [conn83] CMD: shardcollection: { shardcollection: "dummy.coll_5", unique: false, key: { shardkey1: 1, shardkey2: 1 } }
      Tue Aug 2 08:18:39 [conn83] enable sharding on: dummy.coll_5 with shard key: { shardkey1: 1, shardkey2: 1 }
      Tue Aug 2 08:18:39 [conn83] about to create first chunk for: dummy.coll_5
      Tue Aug 2 08:18:39 [conn83] successfully created first chunk for ns:dummy.coll_5 at: dal2-ts-shard10_dal2-ts-shard11_dal2-ts-shard12:dal2-ts-shard10_dal2-ts-shard11_dal2-ts-shard12/dal2-ts-shard10:27018,dal2-ts-shard12:27018,dal2-ts-shard11:27018 lastmod: 1|0 min: { shardkey1: MinKey, shardkey2: MinKey } max: { shardkey1: MaxKey, shardkey2: MaxKey }
      Tue Aug 2 08:18:41 [mongosMain] connection accepted from 172.16.0.18:48305 #135
      Tue Aug 2 08:18:42 [mongosMain] connection accepted from 172.16.0.2:22086 #136
      Tue Aug 2 08:18:43 [mongosMain] connection accepted from 172.16.0.18:48306 #137
      Tue Aug 2 08:18:44 [mongosMain] connection accepted from 172.16.0.2:22087 #138
      Tue Aug 2 08:18:45 [conn96] ns: dummy.coll_5 ClusteredCursor::query ShardConnection had to change attempt: 0
      Tue Aug 2 08:18:57 [conn93] delete failed b/c of StaleConfigException, retrying left:4 ns: dummy.coll_5 patt: {}
      Tue Aug 2 08:19:00 [mongosMain] connection accepted from 172.16.0.2:22089 #139
      Tue Aug 2 08:19:02 [conn95] update failed b/c of StaleConfigException, retrying left:4 ns: dummy.coll_5 query: { b: 2, shardkey2: "test1", shardkey1: "test1" }
      Tue Aug 2 08:19:05 [mongosMain] connection accepted from 172.16.0.18:48307 #140
      Tue Aug 2 08:19:15 [LockPinger] dist_lock pinged successfully for: dal2-ts-aggr02:1312268368:1804289383
      Tue Aug 2 08:19:16 [mongosMain] connection accepted from 127.0.0.1:46010 #141
      Tue Aug 2 08:19:16 [conn141] end connection 127.0.0.1:46010
      Tue Aug 2 08:19:32 [conn109] DROP: dummy.coll_5
      Tue Aug 2 08:19:32 [conn109] about to log metadata event: { _id: "dal2-ts-aggr02-2011-08-02T08:19:32-20", server: "dal2-ts-aggr02", clientAddr: "N/A", time: new Date(1312273172134), what: "dropCollection.start", ns: "dummy.coll_5", details: {} }
      Tue Aug 2 08:19:32 [conn109] about to log metadata event: { _id: "dal2-ts-aggr02-2011-08-02T08:19:32-21", server: "dal2-ts-aggr02", clientAddr: "N/A", time: new Date(1312273172394), what: "dropCollection", ns: "dummy.coll_5", details: {} }

      Tue Aug 2 08:42:08 [conn83] enable sharding on: dummy.coll_3 with shard key: { _id: 1 }
      Tue Aug 2 08:42:08 [conn83] about to create first chunk for: dummy.coll_3
      Tue Aug 2 08:42:08 [conn83] successfully created first chunk for ns:dummy.coll_3 at: dal2-ts-shard10_dal2-ts-shard11_dal2-ts-shard12:dal2-ts-shard10_dal2-ts-shard11_dal2-ts-shard12/dal2-ts-shard10:27018,dal2-ts-shard12:27018,dal2-ts-shard11:27018 lastmod: 1|0 min: { _id: MinKey } max: { _id: MaxKey }
      Tue Aug 2 08:42:15 [mongosMain] connection accepted from 172.16.0.18:42693 #153
      Tue Aug 2 08:42:17 [mongosMain] connection accepted from 172.16.0.18:42694 #154
      Tue Aug 2 08:42:27 [conn93] AssertionException in process: ns: dummy.coll_5 doWRite
      Tue Aug 2 08:42:29 [conn94] AssertionException in process: ns: dummy.coll_5 doWRite
      Tue Aug 2 08:42:32 [mongosMain] connection accepted from 172.16.0.2:57837 #155
      Tue Aug 2 08:42:34 [mongosMain] connection accepted from 172.16.0.2:57838 #156
      Tue Aug 2 08:42:35 [conn96] AssertionException in process: ns: dummy.coll_5 doWRite
      Tue Aug 2 08:42:36 [conn97] AssertionException in process: ns: dummy.coll_5 doWRite
      Tue Aug 2 08:42:39 [mongosMain] connection accepted from 172.16.0.18:42695 #157
      Received signal 6
      Backtrace: 0x52f545 0x3fa54302d0 0x3fa5430265 0x3fa5431d10 0x3fa54296e6 0x69e32f 0x50419b 0x505a54 0x6a76e0 0x3fa60064a7 0x3fa54d3c2d
      /usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x52f545]
      /lib64/libc.so.6[0x3fa54302d0]
      /lib64/libc.so.6(gsignal+0x35)[0x3fa5430265]
      /lib64/libc.so.6(abort+0x110)[0x3fa5431d10]
      /lib64/libc.so.6(__assert_fail+0xf6)[0x3fa54296e6]
      /usr/bin/mongos(_ZN5mongo17WriteBackListener3runEv+0x170f)[0x69e32f]
      /usr/bin/mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0x12b)[0x50419b]
      /usr/bin/mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74)[0x505a54]
      /usr/bin/mongos(thread_proxy+0x80)[0x6a76e0]
      /lib64/libpthread.so.0[0x3fa60064a7]
      /lib64/libc.so.6(clone+0x6d)[0x3fa54d3c2d]
      ===
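
The frames in the backtrace above follow the usual glibc `module(symbol+offset)[address]` layout, so they can be picked apart mechanically when triaging. The following is a minimal sketch, not part of the original report: the helper names `parse_frame` and `simple_nested_name` are hypothetical, and the name decoder only handles the simplest mangled form (`_ZN<len><id>...E`). A real demangler such as `c++filt` should be used for anything with templates or substitutions.

```python
import re

# Matches frame lines such as
#   /usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x52f545]
#   /lib64/libc.so.6[0x3fa54302d0]
FRAME_RE = re.compile(
    r"^\s*(?P<module>\S+?)"
    r"(?:\((?P<symbol>[^+()]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]\s*$"
)


def parse_frame(line):
    """Split one backtrace line into module, symbol, offset and address."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return {
        "module": m.group("module"),
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


def simple_nested_name(mangled):
    """Decode only plain nested names (_ZN<len><id>...E); return None otherwise."""
    if not mangled or not mangled.startswith("_ZN"):
        return None
    pos, parts = 3, []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    # A plain nested name is terminated by 'E'; templates ('I') are not handled.
    if parts and pos < len(mangled) and mangled[pos] == "E":
        return "::".join(parts)
    return None


frame = parse_frame(
    "/usr/bin/mongos(_ZN5mongo17WriteBackListener3runEv+0x170f)[0x69e32f]"
)
print(frame["module"], simple_nested_name(frame["symbol"]), hex(frame["offset"]))
```

Applied to the frame directly below `__assert_fail`, this yields the `WriteBackListener::run` method in the `mongo` namespace, which is the mongos frame that called `__assert_fail` in this trace.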

            Assignee: Greg Studer (greg_10gen)
            Reporter: Edward Wei (edwardwei)