Core Server / SERVER-17654

Crash/exception during initial sync of a secondary while building an index on a collection of ~110 million documents

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • Affects Version/s: 3.0.0
    • Component/s: WiredTiger
    • Labels: None
    • Environment:
    • Linux

      Not sure if this is reproducible. What we did:

      1. Converted a 1 TB MongoDB (3.0/WiredTiger/zlib) database to a replica set primary with rs.initiate().
      2. Added an arbiter and an empty secondary; the initial sync started.
      3. After ~16 hours the secondary crashed while building the index on the collection with ~110 million documents.


      The replica set secondary crashed during initial sync, in the index build step, while building the index on a collection of ~110 million documents.

      2015-03-19T08:44:16.010+0100 I -        [rsSync]   Index Build: 75617500/113700591 66%
      2015-03-19T08:44:19.009+0100 I -        [rsSync]   Index Build: 75635700/113700591 66%
      2015-03-19T08:44:22.003+0100 I -        [rsSync]   Index Build: 75654600/113700591 66%
      2015-03-19T08:44:23.358+0100 I NETWORK  [conn3458] end connection 10.143.128.89:33110 (4 connections now open)
      2015-03-19T08:44:23.359+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:33118 #3460 (5 connections now open)
      2015-03-19T08:44:23.360+0100 I NETWORK  [conn3459] end connection 10.143.128.89:33111 (4 connections now open)
      2015-03-19T08:44:23.360+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:33119 #3461 (5 connections now open)
      2015-03-19T08:44:23.394+0100 I ACCESS   [conn3460] Successfully authenticated as principal __system on local
      2015-03-19T08:44:23.395+0100 I ACCESS   [conn3461] Successfully authenticated as principal __system on local
      2015-03-19T08:44:25.010+0100 I -        [rsSync]   Index Build: 75660700/113700591 66%
      2015-03-19T08:44:38.168+0100 I -        [rsSync]   Index Build: 75690900/113700591 66%
      2015-03-19T08:44:38.354+0100 E STORAGE  [rsSync] WiredTiger (0) [1426751078:349853][12207:0x7fb66ec0a700], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: encountered an illegal file format or internal value
      2015-03-19T08:44:38.354+0100 E STORAGE  [rsSync] WiredTiger (-31804) [1426751078:354864][12207:0x7fb66ec0a700], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2015-03-19T08:44:38.354+0100 I -        [rsSync] Fatal Assertion 28558
      2015-03-19T08:44:38.406+0100 I CONTROL  [rsSync]
       0xf58429 0xef69c1 0xedb021 0xd63086 0x1382250 0x1382515 0x13829b1 0x12e43c1 0x1325329 0xd69463 0xd69512 0x9f3092 0xbce524 0xbce8d4 0x91dbc0 0x929fb4 0xc8256b 0xc83044 0xc83e82 0xc8c5f1 0xfa5f44 0x7fb67c16b9d1 0x7fb67acbd8fd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B58429"},{"b":"400000","o":"AF69C1"},{"b":"400000","o":"ADB021"},{"b":"400000","o":"963086"},{"b":"400000","o":"F82250"},{"b":"400000","o":"F82515"},{"b":"400000","o":"F829B1"},{"b":"400000","o":"EE43C1"},{"b":"400000","o":"F25329"},{"b":"400000","o":"969463"},{"b":"400000","o":"969512"},{"b":"400000","o":"5F3092"},{"b":"400000","o":"7CE524"},{"b":"400000","o":"7CE8D4"},{"b":"400000","o":"51DBC0"},{"b":"400000","o":"529FB4"},{"b":"400000","o":"88256B"},{"b":"400000","o":"883044"},{"b":"400000","o":"883E82"},{"b":"400000","o":"88C5F1"},{"b":"400000","o":"BA5F44"},{"b":"7FB67C164000","o":"79D1"},{"b":"7FB67ABD5000","o":"E88FD"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "2.6.32-504.3.3.el6.x86_64", "version" : "#1 SMP Fri Dec 12 16:05:43 EST 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "FFFD21B8C7EC4ADC196E65832A0B272803A7A4F5" }, { "b" : "7FFF22DCE000", "elfType" : 3, "buildId" : "E752C57E2BD5883E5CE1211B21FC5859B4520D90" }, { "b" : "7FB67C164000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A35053D76A6B7BD91D2EE58CC024D8EF697CE977" }, { "b" : "7FB67BEF8000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58B33C1A58DAD354D36CB87FD14997F06BF1497D" }, { "b" : "7FB67BB15000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "71BC917ECEB443B79853AC793482A6BE9D468BC4" }, { "b" : "7FB67B90D000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "69BCB2B5FE6D85ACD898362EAC5EE79857DA4EC4" }, { "b" : "7FB67B709000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "266172B083F783BD94389BE55B0B371C17198268" }, { "b" : "7FB67B403000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "ED99110E629209C5CA6C0ED704F2C5CE3171513A" }, { "b" : "7FB67B17F000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : 
"A5F11596A3C6C24C2304A996DE37A1736C5715F9" }, { "b" : "7FB67AF69000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A44499D29B114A5366CD72DD4883958495AC1C1D" }, { "b" : "7FB67ABD5000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "22AA38CCA59A5DF6CF07B8FC1778E2EE0384508E" }, { "b" : "7FB67C381000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5BEB2450B75E84FF317C65F22AF8B8112C25DF63" }, { "b" : "7FB67A991000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "EFF68B7DE77D081BC4A0CB38FE9DCBC60541BF92" }, { "b" : "7FB67A6AB000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "95EBB74C2C0A1E1714344036145A0239FFA4892D" }, { "b" : "7FB67A4A7000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "6ADE12F76961F73B33D160AC4D342222E7FC7A65" }, { "b" : "7FB67A27B000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "D02E7D3149950118009A81997434E28B7D9EC9B2" }, { "b" : "7FB67A065000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7FB679E5A000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "5AFCBEA0D62EE0335714CCBAB7BA808E2A16028C" }, { "b" : "7FB679C57000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7FB679A3D000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "C6DC308333BCC5E1DDD2A308F47AB0BFA318D1CC" }, { "b" : "7FB67981E000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "BAD5C71361DADF259B6E306A49E6F47F24AEA3DC" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf58429]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef69c1]
       mongod(_ZN5mongo13fassertFailedEi+0x61) [0xedb021]
       mongod(+0x963086) [0xd63086]
       mongod(+0xF82250) [0x1382250]
       mongod(__wt_err+0x95) [0x1382515]
       mongod(__wt_panic+0x21) [0x13829b1]
       mongod(__wt_btcur_next+0x2891) [0x12e43c1]
       mongod(+0xF25329) [0x1325329]
       mongod(_ZN5mongo21WiredTigerRecordStore8Iterator8_getNextEv+0x73) [0xd69463]
       mongod(_ZN5mongo21WiredTigerRecordStore8Iterator7getNextEv+0x12) [0xd69512]
       mongod(_ZN5mongo14CollectionScan4workEPm+0x2B2) [0x9f3092]
       mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbce524]
       mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbce8d4]
       mongod(_ZN5mongo15MultiIndexBlock30insertAllDocumentsInCollectionEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE+0x130) [0x91dbc0]
       mongod(_ZN5mongo6Cloner2goEPNS_16OperationContextERKSsS4_RKNS_12CloneOptionsEPSt3setISsSt4lessISsESaISsEERSsPi+0xD04) [0x929fb4]
       mongod(+0x88256B) [0xc8256b]
       mongod(+0x883044) [0xc83044]
       mongod(_ZN5mongo4repl17syncDoInitialSyncEv+0x42) [0xc83e82]
       mongod(_ZN5mongo4repl13runSyncThreadEv+0x181) [0xc8c5f1]
       mongod(+0xBA5F44) [0xfa5f44]
       libpthread.so.0(+0x79D1) [0x7fb67c16b9d1]
       libc.so.6(clone+0x6D) [0x7fb67acbd8fd]
      -----  END BACKTRACE  -----
      2015-03-19T08:44:38.406+0100 I -        [rsSync]
      
      ***aborting after fassert() failure
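As a reading aid for the backtrace above: each entry in the "backtrace" array carries a module load base "b" and an offset "o", and their sum appears to match both the raw address line printed before BEGIN BACKTRACE and the bracketed address on each frame. A minimal Python sketch that checks this on a few frames copied from the log (the names frames, raw_addresses and absolute_address are illustrative and not part of mongod):

```python
# Frames copied from the "backtrace" array in the log above
# (first three and last two entries). "b" is the module load base,
# "o" is the offset inside that module; both are hex strings.
frames = [
    {"b": "400000", "o": "B58429"},
    {"b": "400000", "o": "AF69C1"},
    {"b": "400000", "o": "ADB021"},
    {"b": "7FB67C164000", "o": "79D1"},
    {"b": "7FB67ABD5000", "o": "E88FD"},
]

# The corresponding entries from the raw address line in the log.
raw_addresses = [
    "0xf58429",
    "0xef69c1",
    "0xedb021",
    "0x7fb67c16b9d1",
    "0x7fb67acbd8fd",
]


def absolute_address(frame):
    """Return load base + offset as an integer."""
    return int(frame["b"], 16) + int(frame["o"], 16)


computed = [hex(absolute_address(f)) for f in frames]

for frame, addr, raw in zip(frames, computed, raw_addresses):
    status = "match" if addr == raw else "MISMATCH"
    print(f'{frame["b"]} + {frame["o"]} = {addr} ({status})')
```

The same arithmetic applies to the frames the log could not symbolize, such as mongod(+0x963086) [0xd63086]. Assuming a mongod binary with debug symbols that matches the gitVersion in processInfo is available, those addresses could be resolved with a tool such as addr2line.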
      

            Assignee:
            michael.cahill@mongodb.com Michael Cahill (Inactive)
            Reporter:
            bhcoba Borut Hadzialic
            Votes:
            0
            Watchers:
            10
