Core Server / SERVER-16647

Invariant Failure in SplitChunkCommand::run()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 2.8.0-rc3
    • Fix Version/s: None
    • Component/s: Sharding
    • Labels:
      None
    • Environment:
      4x EC2 Ubuntu1404 m3.large
    • Operating System:
      ALL
    • Steps To Reproduce:
      Run this script in a loop against the mongos on server A after adding servers B, C, and D as shards:

      db.getSiblingDB("benchdb1").dropDatabase();
      sh.enableSharding("benchdb1");
      for (var i = 0; i < 64; i++) {
        sh.shardCollection("benchdb1.COL-" + i, {"shardkey": "hashed"});
      }

      See this GitHub repository for details (Configuration C): https://github.com/amidvidy/mongorestore-benchmarks
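
The report does not spell out how the script was looped. A minimal sketch of one way to do it follows. The helper name runReproLoop and the iteration count are assumptions and do not appear in the original report; db and sh are the standard mongo shell globals, passed in explicitly so the helper does not depend on shell state.

```javascript
// Hypothetical helper (not from the original report): repeats the
// dropDatabase / enableSharding / shardCollection cycle `iterations` times.
// `db` and `sh` are the mongo shell globals, passed in explicitly.
function runReproLoop(db, sh, iterations) {
  for (var n = 0; n < iterations; n++) {
    db.getSiblingDB("benchdb1").dropDatabase();
    sh.enableSharding("benchdb1");
    for (var i = 0; i < 64; i++) {
      sh.shardCollection("benchdb1.COL-" + i, {"shardkey": "hashed"});
    }
  }
}

// In a mongo shell connected to the mongos on server A
// (the iteration count here is an arbitrary assumption):
// runReproLoop(db, sh, 100);
```

Each pass shards 64 collections on a hashed key, so every pass triggers a fresh round of initial chunk splits on the shards.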


      Description

      This happened on a 4 node cluster while running a benchmark for TOOLS-348.

      Cluster topology:
      Server A: mongos + mongorestore
      Servers B, C, D: each runs a mongod (standalone shard) + a mongod (config server)

      The mongod on server D crashes with:

      2014-12-23T17:44:29.556+0000 I SHARDING [conn1] distributed lock 'benchdb1.COL-1/ip-10-238-44-130:27018:1419353520:1363056040' acquired, ts : 5499a9fc9a8f659f34bfba11
      2014-12-23T17:44:29.556+0000 I SHARDING [conn1] remotely refreshing metadata for benchdb1.COL-1 based on current shard version 1|2||5499a9f700df49298419049a, current metadata version is 1|2||5499a9f700df49298419049a
      2014-12-23T17:44:29.557+0000 I SHARDING [conn1] metadata of collection benchdb1.COL-1 already up to date (shard version : 1|2||5499a9f700df49298419049a, took 0ms)
      2014-12-23T17:44:29.557+0000 I SHARDING [conn1] splitChunk accepted at version 1|2||5499a9f700df49298419049a
      2014-12-23T17:44:29.864+0000 I SHARDING [conn1] about to log metadata event: { _id: "ip-10-238-44-130-2014-12-23T17:44:29-5499a9fd9a8f659f34bfba12", server: "ip-10-238-44-130", clientAddr: "10.233.133.124:41419", time: new Date(1419356669864), what: "split", ns: "benchdb1.COL-1", details: { before: { min: { shardkey: MinKey }, max: { shardkey: -3074457345618258602 } }, left: { min: { shardkey: MinKey }, max: { shardkey: -6148914691236517204 }, lastmod: Timestamp 1000|3, lastmodEpoch: ObjectId('5499a9f700df49298419049a') }, right: { min: { shardkey: -6148914691236517204 }, max: { shardkey: -3074457345618258602 }, lastmod: Timestamp 1000|4, lastmodEpoch: ObjectId('5499a9f700df49298419049a') } } }
      2014-12-23T17:44:29.961+0000 I -        [conn1] Invariant failure collection src/mongo/s/d_split.cpp 842
      2014-12-23T17:44:29.984+0000 I CONTROL  [conn1] 
       0xf0bd99 0xeb5bb1 0xe9b312 0xdb85dd 0x9a9784 0x9aa5d3 0x9ab08b 0xb77c2a 0xa8af55 0x7e1770 0xec9d61 0x7f94a3128182 0x7f94a2228fbd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B0BD99"},{"b":"400000","o":"AB5BB1"},{"b":"400000","o":"A9B312"},{"b":"400000","o":"9B85DD"},{"b":"400000","o":"5A9784"},{"b":"400000","o":"5AA5D3"},{"b":"400000","o":"5AB08B"},{"b":"400000","o":"777C2A"},{"b":"400000","o":"68AF55"},{"b":"400000","o":"3E1770"},{"b":"400000","o":"AC9D61"},{"b":"7F94A3120000","o":"8182"},{"b":"7F94A212E000","o":"FAFBD"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc3", "gitVersion" : "2d679247f17dab05a492c8b6d2c250dab18e54f2", "uname" : { "sysname" : "Linux", "release" : "3.13.0-36-generic", "version" : "#63-Ubuntu SMP Wed Sep 3 21:30:07 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFDE3EA000", "elfType" : 3 }, { "b" : "7F94A3120000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3 }, { "b" : "7F94A2F18000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3 }, { "b" : "7F94A2D14000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3 }, { "b" : "7F94A2A10000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F94A270A000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3 }, { "b" : "7F94A24F4000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F94A212E000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3 }, { "b" : "7F94A333E000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf0bd99]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xeb5bb1]
       mongod(_ZN5mongo15invariantFailedEPKcS1_j+0xB2) [0xe9b312]
       mongod(_ZN5mongo17SplitChunkCommand3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x311D) [0xdb85dd]
       mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9a9784]
       mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC13) [0x9aa5d3]
       mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9ab08b]
       mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERNS_5CurOpES3_b+0x76A) [0xb77c2a]
       mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0xB25) [0xa8af55]
       mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xE0) [0x7e1770]
       mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x411) [0xec9d61]
       libpthread.so.0(+0x8182) [0x7f94a3128182]
       libc.so.6(clone+0x6D) [0x7f94a2228fbd]
      -----  END BACKTRACE  -----
      2014-12-23T17:44:29.984+0000 I -        [conn1] 
       
      ***aborting after invariant() failure
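
The line "Invariant failure collection src/mongo/s/d_split.cpp 842" in the log above appears to carry the failed expression (collection), the source file, and the line number, in that order. A minimal sketch for pulling those fields out of a log such as shard-D-crashed.log follows, assuming that three-field layout holds. The function name parseInvariantFailure and the regular expression are illustrative and are not part of MongoDB.

```javascript
// Illustrative parser (an assumption, not a MongoDB API). It expects the
// layout "Invariant failure <expression> <file> <line>" at the end of a
// log line and returns null for lines that do not match.
function parseInvariantFailure(logLine) {
  // The last two whitespace-separated tokens are taken as file and line
  // number; everything between the marker and those tokens is the expression.
  var m = /Invariant failure (.+) (\S+) (\d+)\s*$/.exec(logLine);
  if (m === null) {
    return null;
  }
  return { expression: m[1], file: m[2], line: parseInt(m[3], 10) };
}

// Usage with the line from this report:
var sample = "2014-12-23T17:44:29.961+0000 I -        [conn1] " +
    "Invariant failure collection src/mongo/s/d_split.cpp 842";
console.log(JSON.stringify(parseInvariantFailure(sample)));
```

Anchoring on the last two tokens keeps the sketch usable for failed expressions that contain spaces.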

        Attachments

        1. configsvr-B.log
          1.04 MB
        2. configsvr-C.log
          1.04 MB
        3. configsvr-D.log
          1.04 MB
        4. server-A-mongos.log
          1.44 MB
        5. shard-B.log
          1.22 MB
        6. shard-C.log
          1.23 MB
        7. shard-D-crashed.log
          3.36 MB

              People

              Assignee:
              spencer Spencer Brody (Inactive)
              Reporter:
              adam.midvidy Adam Midvidy