Core Server / SERVER-14389

segmentation fault in RangeDeleter::canEnqueue_inlock

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: 2.6.1
    • Component/s: Sharding
    • Labels: None
    • ALL

      One of our mongod replica set members, which is part of a cluster consisting of 3 shards, went down due to a segmentation fault:

      2014-06-29T22:14:06.767+0200 [conn199059]  ntoskip:0 ntoreturn:1
      2014-06-29T22:14:06.767+0200 [conn199059] stale version detected during query over offerStore.$cmd : { $err: "[offerStore.offer] shard version not ok in Client::Context: version mismatch detected for offerStore.offer, stored major version 20016 does not match ...", code: 13388, ns: "offerStore.offer", vReceived: Timestamp 20015000|85, vReceivedEpoch: ObjectId('538f1c07b86632c2d721e203'), vWanted: Timestamp 20016000|0, vWantedEpoch: ObjectId('538f1c07b86632c2d721e203') }
      2014-06-29T22:14:06.767+0200 [conn198333] end connection 172.16.65.202:43434 (1166 connections now open)
      2014-06-29T22:14:06.767+0200 [conn199059] end connection 172.16.65.202:43728 (1165 connections now open)
      2014-06-29T22:14:06.812+0200 [conn199123] moveChunk migrate commit accepted by TO-shard: { active: false, ns: "offerStore.offer", from: "offerStoreUK/s128:27017,s137:27017,s227:27017", min: { _id: 99144222 }, max: { _id: 129281657 }, shardKeyPattern: { _id: 1.0 }, state: "done", counts: { cloned: 2843, clonedBytes: 3969903, catchup: 0, steady: 0 }, ok: 1.0 }
      2014-06-29T22:14:06.812+0200 [conn199123] moveChunk updating self version to: 20016|1||538f1c07b86632c2d721e203 through { _id: 129281657 } -> { _id: 131845582 } for collection 'offerStore.offer'
      2014-06-29T22:14:06.812+0200 [conn199123] SyncClusterConnection connecting to [sx350:20019]
      2014-06-29T22:14:06.814+0200 [conn199123] SyncClusterConnection connecting to [sx351:20019]
      2014-06-29T22:14:06.816+0200 [conn199123] SyncClusterConnection connecting to [sx352:20019]
      2014-06-29T22:14:07.147+0200 [conn199123] about to log metadata event: { _id: "s128-2014-06-29T20:14:07-53b0738fe4db6482ab714a67", server: "s128", clientAddr: "172.16.65.202:43756", time: new Date(1404072847147), what: "moveChunk.commit", ns: "offerStore.offer", details: { min: { _id: 99144222 }, max: { _id: 129281657 }, from: "offerStoreUK", to: "offerStoreUK3", cloned: 2843, clonedBytes: 3969903, catchup: 0, steady: 0 } }
      2014-06-29T22:14:07.337+0200 [conn201544] end connection 172.16.64.98:54303 (1164 connections now open)
      2014-06-29T22:14:07.338+0200 [initandlisten] connection accepted from 172.16.64.98:54305 #201548 (1165 connections now open)
      2014-06-29T22:14:07.339+0200 [conn201548]  authenticate db: local { authenticate: 1, nonce: "xxx", user: "__system", key: "xxx" }
      2014-06-29T22:14:07.350+0200 [conn199123] MigrateFromStatus::done About to acquire global write lock to exit critical section
      2014-06-29T22:14:07.350+0200 [conn199123] MigrateFromStatus::done Global lock acquired
      2014-06-29T22:14:07.361+0200 [conn199123] doing delete inline for cleanup of chunk data
      2014-06-29T22:14:07.361+0200 [conn199123] SEVERE: Invalid access at address: 0
      2014-06-29T22:14:07.460+0200 [conn199123] SEVERE: Got signal: 11 (Segmentation fault).
      Backtrace:0x11c0e91 0x11c026e 0x11c035f 0x7f6a68197030 0xdd7996 0xdd96cc 0xdd9c1a 0xfd21a3 0xa1e85a 0xa1f8ce 0xa21086 0xd4dae7 0xb97322 0xb99902 0x76b6af 0x117720b 0x7f6a6818eb50 0x7f6a675320ed
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11c0e91]
       /usr/bin/mongod() [0x11c026e]
       /usr/bin/mongod() [0x11c035f]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7f6a68197030]
       /usr/bin/mongod(_ZNK5mongo12RangeDeleter11NSMinMaxCmpclEPKNS0_8NSMinMaxES4_+0x26) [0xdd7996]
       /usr/bin/mongod(_ZNK5mongo12RangeDeleter17canEnqueue_inlockERKNS_10StringDataERKNS_7BSONObjES6_PSs+0x1fc) [0xdd96cc]
       /usr/bin/mongod(_ZN5mongo12RangeDeleter9deleteNowERKSsRKNS_7BSONObjES5_S5_bPSs+0x22a) [0xdd9c1a]
       /usr/bin/mongod(_ZN5mongo16MoveChunkCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0xea73) [0xfd21a3]
       /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa1e85a]
       /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xd5e) [0xa1f8ce]
       /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa21086]
       /usr/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x2307) [0xd4dae7]
       /usr/bin/mongod() [0xb97322]
       /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x442) [0xb99902]
       /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x76b6af]
       /usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4fb) [0x117720b]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f6a6818eb50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6a675320ed]
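
To make the backtrace above easier to read, the frames can be split into binary, mangled symbol, offset and address before handing the symbols to a demangler such as `c++filt`. The following is only a minimal sketch: `FRAME_RE` and `parse_frame` are hypothetical helper names, not part of mongod, and the pattern assumes frames shaped exactly like the ones pasted in this ticket.

```python
import re

# Hypothetical helper, written only for frames shaped like the ones above, e.g.
#   /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11c0e91]
#   /usr/bin/mongod() [0x11c026e]
#   /lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7f6a68197030]
FRAME_RE = re.compile(
    r"^\s*(?P<binary>\S+?)"                  # path of the binary or shared library
    r"\((?P<symbol>[^+)]*)"                  # mangled symbol (may be empty)
    r"(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)"   # optional offset inside the symbol
    r"\s+\[(?P<address>0x[0-9a-fA-F]+)\]\s*$"  # absolute return address
)


def parse_frame(line):
    """Return the parts of one backtrace frame as a dict, or None if the line is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return {
        "binary": m.group("binary"),
        "symbol": m.group("symbol") or None,
        "offset": int(m.group("offset"), 16) if m.group("offset") else None,
        "address": int(m.group("address"), 16),
    }


if __name__ == "__main__":
    frame = parse_frame(
        " /usr/bin/mongod(_ZNK5mongo12RangeDeleter17canEnqueue_inlock"
        "ERKNS_10StringDataERKNS_7BSONObjES6_PSs+0x1fc) [0xdd96cc]"
    )
    # The mangled symbol can then be passed to a demangler, e.g. `c++filt`.
    print(frame["symbol"], hex(frame["offset"]), hex(frame["address"]))
```

Frames without a symbol, such as `/usr/bin/mongod() [0x11c026e]`, come back with `symbol` set to `None`; resolving those would need the matching binary and a tool such as `addr2line`.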
      

      We are running mongodb-linux-x86_64-2.6.1.
      It might be related to this issue:
      https://jira.mongodb.org/browse/SERVER-14261

        1. chunkMove.JPG
          65 kB
          Kay Agahd

            Assignee: Ger Hartnett (ger.hartnett@mongodb.com)
            Reporter: Kay Agahd (kay.agahd@idealo.de)
            Votes: 0
            Watchers: 7
