[SERVER-17654] Crash/Exception while performing initial sync of secondary, while building a 110 Mil. docs index Created: 19/Mar/15  Updated: 04/Jun/15  Resolved: 21/Apr/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.0
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Borut Hadzialic Assignee: Michael Cahill (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:
  • Linux 2.6.32-504.3.3.el6.x86_64 #1 SMP Fri Dec 12 16:05:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
  • mongodb-linux-x86_64-rhel62-3.0.0 (WiredTiger / zlib)
  • 196GB Ram / SAN storage (with SAS spinning disks inside)
  • ~1TB data (uncompressed), most of it in a 110Mil. collection

Issue Links:
Duplicate
duplicates SERVER-17713 WiredTiger using zlib compression can... Closed
Related
is related to SERVER-17713 WiredTiger using zlib compression can... Closed
Operating System: Linux
Steps To Reproduce:

Not sure if this is reproducible; what we did was:

1. Convert a 1TB MongoDB (3.0/WiredTiger/zlib) database to a replica set primary with rs.initiate()
2. Add an arbiter and an (empty) secondary; initial sync starts
3. After ~16 hours, the secondary crashed while building the index on the 110Mil. doc collection

 Description   

The replica set secondary crashed during initial sync, in the index build step, while building the index on a large collection (~110 Mil. docs).

2015-03-19T08:44:16.010+0100 I -        [rsSync]   Index Build: 75617500/113700591 66%
2015-03-19T08:44:19.009+0100 I -        [rsSync]   Index Build: 75635700/113700591 66%
2015-03-19T08:44:22.003+0100 I -        [rsSync]   Index Build: 75654600/113700591 66%
2015-03-19T08:44:23.358+0100 I NETWORK  [conn3458] end connection 10.143.128.89:33110 (4 connections now open)
2015-03-19T08:44:23.359+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:33118 #3460 (5 connections now open)
2015-03-19T08:44:23.360+0100 I NETWORK  [conn3459] end connection 10.143.128.89:33111 (4 connections now open)
2015-03-19T08:44:23.360+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:33119 #3461 (5 connections now open)
2015-03-19T08:44:23.394+0100 I ACCESS   [conn3460] Successfully authenticated as principal __system on local
2015-03-19T08:44:23.395+0100 I ACCESS   [conn3461] Successfully authenticated as principal __system on local
2015-03-19T08:44:25.010+0100 I -        [rsSync]   Index Build: 75660700/113700591 66%
2015-03-19T08:44:38.168+0100 I -        [rsSync]   Index Build: 75690900/113700591 66%
2015-03-19T08:44:38.354+0100 E STORAGE  [rsSync] WiredTiger (0) [1426751078:349853][12207:0x7fb66ec0a700], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: encountered an illegal file format or internal value
2015-03-19T08:44:38.354+0100 E STORAGE  [rsSync] WiredTiger (-31804) [1426751078:354864][12207:0x7fb66ec0a700], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-03-19T08:44:38.354+0100 I -        [rsSync] Fatal Assertion 28558
2015-03-19T08:44:38.406+0100 I CONTROL  [rsSync]
 0xf58429 0xef69c1 0xedb021 0xd63086 0x1382250 0x1382515 0x13829b1 0x12e43c1 0x1325329 0xd69463 0xd69512 0x9f3092 0xbce524 0xbce8d4 0x91dbc0 0x929fb4 0xc8256b 0xc83044 0xc83e82 0xc8c5f1 0xfa5f44 0x7fb67c16b9d1 0x7fb67acbd8fd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B58429"},{"b":"400000","o":"AF69C1"},{"b":"400000","o":"ADB021"},{"b":"400000","o":"963086"},{"b":"400000","o":"F82250"},{"b":"400000","o":"F82515"},{"b":"400000","o":"F829B1"},{"b":"400000","o":"EE43C1"},{"b":"400000","o":"F25329"},{"b":"400000","o":"969463"},{"b":"400000","o":"969512"},{"b":"400000","o":"5F3092"},{"b":"400000","o":"7CE524"},{"b":"400000","o":"7CE8D4"},{"b":"400000","o":"51DBC0"},{"b":"400000","o":"529FB4"},{"b":"400000","o":"88256B"},{"b":"400000","o":"883044"},{"b":"400000","o":"883E82"},{"b":"400000","o":"88C5F1"},{"b":"400000","o":"BA5F44"},{"b":"7FB67C164000","o":"79D1"},{"b":"7FB67ABD5000","o":"E88FD"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "2.6.32-504.3.3.el6.x86_64", "version" : "#1 SMP Fri Dec 12 16:05:43 EST 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "FFFD21B8C7EC4ADC196E65832A0B272803A7A4F5" }, { "b" : "7FFF22DCE000", "elfType" : 3, "buildId" : "E752C57E2BD5883E5CE1211B21FC5859B4520D90" }, { "b" : "7FB67C164000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A35053D76A6B7BD91D2EE58CC024D8EF697CE977" }, { "b" : "7FB67BEF8000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58B33C1A58DAD354D36CB87FD14997F06BF1497D" }, { "b" : "7FB67BB15000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "71BC917ECEB443B79853AC793482A6BE9D468BC4" }, { "b" : "7FB67B90D000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "69BCB2B5FE6D85ACD898362EAC5EE79857DA4EC4" }, { "b" : "7FB67B709000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "266172B083F783BD94389BE55B0B371C17198268" }, { "b" : "7FB67B403000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "ED99110E629209C5CA6C0ED704F2C5CE3171513A" }, { "b" : "7FB67B17F000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : 
"A5F11596A3C6C24C2304A996DE37A1736C5715F9" }, { "b" : "7FB67AF69000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A44499D29B114A5366CD72DD4883958495AC1C1D" }, { "b" : "7FB67ABD5000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "22AA38CCA59A5DF6CF07B8FC1778E2EE0384508E" }, { "b" : "7FB67C381000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5BEB2450B75E84FF317C65F22AF8B8112C25DF63" }, { "b" : "7FB67A991000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "EFF68B7DE77D081BC4A0CB38FE9DCBC60541BF92" }, { "b" : "7FB67A6AB000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "95EBB74C2C0A1E1714344036145A0239FFA4892D" }, { "b" : "7FB67A4A7000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "6ADE12F76961F73B33D160AC4D342222E7FC7A65" }, { "b" : "7FB67A27B000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "D02E7D3149950118009A81997434E28B7D9EC9B2" }, { "b" : "7FB67A065000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7FB679E5A000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "5AFCBEA0D62EE0335714CCBAB7BA808E2A16028C" }, { "b" : "7FB679C57000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7FB679A3D000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "C6DC308333BCC5E1DDD2A308F47AB0BFA318D1CC" }, { "b" : "7FB67981E000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "BAD5C71361DADF259B6E306A49E6F47F24AEA3DC" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf58429]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef69c1]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xedb021]
 mongod(+0x963086) [0xd63086]
 mongod(+0xF82250) [0x1382250]
 mongod(__wt_err+0x95) [0x1382515]
 mongod(__wt_panic+0x21) [0x13829b1]
 mongod(__wt_btcur_next+0x2891) [0x12e43c1]
 mongod(+0xF25329) [0x1325329]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator8_getNextEv+0x73) [0xd69463]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator7getNextEv+0x12) [0xd69512]
 mongod(_ZN5mongo14CollectionScan4workEPm+0x2B2) [0x9f3092]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbce524]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbce8d4]
 mongod(_ZN5mongo15MultiIndexBlock30insertAllDocumentsInCollectionEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE+0x130) [0x91dbc0]
 mongod(_ZN5mongo6Cloner2goEPNS_16OperationContextERKSsS4_RKNS_12CloneOptionsEPSt3setISsSt4lessISsESaISsEERSsPi+0xD04) [0x929fb4]
 mongod(+0x88256B) [0xc8256b]
 mongod(+0x883044) [0xc83044]
 mongod(_ZN5mongo4repl17syncDoInitialSyncEv+0x42) [0xc83e82]
 mongod(_ZN5mongo4repl13runSyncThreadEv+0x181) [0xc8c5f1]
 mongod(+0xBA5F44) [0xfa5f44]
 libpthread.so.0(+0x79D1) [0x7fb67c16b9d1]
 libc.so.6(clone+0x6D) [0x7fb67acbd8fd]
-----  END BACKTRACE  -----
2015-03-19T08:44:38.406+0100 I -        [rsSync]
 
***aborting after fassert() failure
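
The JSON line in the backtrace stores each frame as a module load base ("b") and an offset ("o"), both in hex; adding them reproduces the raw addresses printed above the backtrace. A minimal Python sketch, using three frames copied from this log (the function name `absolute_address` is only illustrative):

```python
# Frames copied from the backtrace JSON above: "b" is the module load base,
# "o" is the offset inside that module, both hexadecimal strings.
frames = [
    {"b": "400000", "o": "B58429"},        # mongod: printStackTrace
    {"b": "7FB67C164000", "o": "79D1"},    # libpthread.so.0
    {"b": "7FB67ABD5000", "o": "E88FD"},   # libc.so.6 (clone)
]

def absolute_address(frame):
    """Return base + offset as an integer address."""
    return int(frame["b"], 16) + int(frame["o"], 16)

addresses = [hex(absolute_address(f)) for f in frames]
print(addresses)
```

For frames without a symbol name, such as `mongod(+0x963086) [0xd63086]`, the value after the `+` is the same offset and the bracketed value is the same sum.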



 Comments   
Comment by Ramon Fernandez Marina [ 21/Apr/15 ]

bhcoba, it looks like this ticket is a duplicate of SERVER-17713, which has been fixed in version 3.0.2. I'd strongly recommend you upgrade to 3.0.2, in particular for all subsequent experiments related to this ticket.

I'm thus going to resolve this ticket, but if during your experiments in the coming months you encounter the issue again, please let us know.

Regards,
Ramón.

Comment by Borut Hadzialic [ 23/Mar/15 ]

The fresh initial sync succeeded this time with 3.0.1.

@Dan

  • Dell R720 hardware
  • OS is Red Hat Enterprise Linux Server release 6.5 (Santiago) and the kernel version 2.6.32-504.3.3.el6.x86_64 #1 SMP
  • no virtualization
  • nothing strange in syslog

The server is a development server where we test Mongo (and some of its competitors/forks, but at any given time only 1 database type / process is running on the server). In the last 4 months another product (a mongodb fork) was tested heavily on the server and there was no indication that the server was faulty somehow - everything worked pretty well.

We will repeat the initial sync procedure many times in the upcoming months - I will post again if we encounter the same issue.

Comment by Michael Cahill (Inactive) [ 23/Mar/15 ]

bhcoba, if you are able to try again, my recommendation would be to start with a fresh replica and try running:

mongod --storageEngine=wiredTiger --wiredTigerCollectionConfigString="checksum=on" --wiredTigerIndexConfigString="checksum=on" ...

By default, when compression is enabled, WiredTiger checksums each block header, and relies on compression to detect corruption. The above command line will calculate checksums for all blocks including compressed blocks, so if the failure is being caused by corruption, this should catch it sooner.
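
To illustrate the idea of checksumming a compressed block, here is a toy Python sketch. It is not WiredTiger's block format or API; it only shows a CRC32 stored over a compressed payload and verified on read, so a flipped bit is reported before any decompression is attempted. The names `write_block` and `read_block` are hypothetical.

```python
import struct
import zlib

def write_block(data: bytes) -> bytes:
    """Compress data and prepend a CRC32 of the compressed payload."""
    payload = zlib.compress(data)
    return struct.pack("<I", zlib.crc32(payload)) + payload

def read_block(block: bytes) -> bytes:
    """Verify the stored CRC32 before decompressing; raise on mismatch."""
    (stored,) = struct.unpack("<I", block[:4])
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("block checksum mismatch")
    return zlib.decompress(payload)

original = b"example document " * 100
block = write_block(original)

# Simulate on-disk corruption by flipping one bit in the compressed payload.
corrupted = bytearray(block)
corrupted[10] ^= 0x01
corrupted = bytes(corrupted)

try:
    read_block(corrupted)
    detected = False
except ValueError:
    detected = True

print(read_block(block) == original, detected)
```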

Comment by Daniel Pasette (Inactive) [ 21/Mar/15 ]

Hi Borut,
I'm trying to piece together as many clues as possible. Can you help to fill in some details about your system?

  • OS and kernel version
  • Are there any messages in syslog from the time of the failures?
  • Have you tried syncing this node using the default snappy compression? (I realize this is not what you're trying to do in the end; I'm just trying to narrow down the problem space.)

Comment by Ramon Fernandez Marina [ 20/Mar/15 ]

Thanks for your report bhcoba, we're looking into it. Trying to reproduce on 3.0.1 is indeed the first step, so please let us know how that goes. When we know more about the issue we'll let you know if a separate ticket is needed – for now let's see what 3.0.1 does.

Comment by Borut Hadzialic [ 20/Mar/15 ]

Problem #2:
(This looks like a separate bug from the one encountered yesterday / described in the ticket description. Should I open a new/separate Jira for this problem?)
A fresh initial sync to a new/empty secondary failed even earlier this time, during the copy phase, at ~800GB of ~1TB total data:

2015-03-19T21:30:54.005+0100 I STORAGE  [rsSync] clone flowdev.ICOM__PRICE_TICKET 80521599   (<-- the collection has around ~113 Mil. docs in total)
2015-03-19T21:31:14.633+0100 I NETWORK  [conn1901] end connection 10.143.128.89:40719 (4 connections now open)
2015-03-19T21:31:14.634+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:40725 #1903 (5 connections now open)
2015-03-19T21:31:14.668+0100 I ACCESS   [conn1903] Successfully authenticated as principal __system on local
2015-03-19T21:31:16.719+0100 I NETWORK  [conn1902] end connection 10.143.128.89:40720 (4 connections now open)
2015-03-19T21:31:16.720+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:40728 #1904 (5 connections now open)
2015-03-19T21:31:16.754+0100 I ACCESS   [conn1904] Successfully authenticated as principal __system on local
2015-03-19T21:31:34.002+0100 I STORAGE  [rsSync] 80725811 objects cloned so far from collection flowdev.ICOM__PRICE_TICKET
2015-03-19T21:31:44.681+0100 I NETWORK  [conn1903] end connection 10.143.128.89:40725 (4 connections now open)
2015-03-19T21:31:44.682+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:40733 #1905 (5 connections now open)
2015-03-19T21:31:44.719+0100 I ACCESS   [conn1905] Successfully authenticated as principal __system on local
2015-03-19T21:31:46.762+0100 I NETWORK  [conn1904] end connection 10.143.128.89:40728 (4 connections now open)
2015-03-19T21:31:46.762+0100 I NETWORK  [initandlisten] connection accepted from 10.143.128.89:40736 #1906 (5 connections now open)
2015-03-19T21:31:46.797+0100 I ACCESS   [conn1906] Successfully authenticated as principal __system on local
2015-03-19T21:31:51.107+0100 E STORAGE  WiredTiger (22) [1426797111:101032][15891:0x7f17eeded700], checkpoint-server: checkpoint server error: Invalid argument
2015-03-19T21:31:51.107+0100 E STORAGE  WiredTiger (-31804) [1426797111:107963][15891:0x7f17eeded700], checkpoint-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-03-19T21:31:51.107+0100 I -        Fatal Assertion 28558
2015-03-19T21:31:51.108+0100 I -        [rsSync] Fatal Assertion 28559
2015-03-19T21:31:51.124+0100 I CONTROL
 0xf58429 0xef69c1 0xedb021 0xd63086 0x1382250 0x1382515 0x13829b1 0x1319ca3 0x7f17f49409d1 0x7f17f34928fd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B58429"},{"b":"400000","o":"AF69C1"},{"b":"400000","o":"ADB021"},{"b":"400000","o":"963086"},{"b":"400000","o":"F82250"},{"b":"400000","o":"F82515"},{"b":"400000","o":"F829B1"},{"b":"400000","o":"F19CA3"},{"b":"7F17F4939000","o":"79D1"},{"b":"7F17F33AA000","o":"E88FD"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "2.6.32-504.3.3.el6.x86_64", "version" : "#1 SMP Fri Dec 12 16:05:43 EST 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "FFFD21B8C7EC4ADC196E65832A0B272803A7A4F5" }, { "b" : "7FFFFD6FF000", "elfType" : 3, "buildId" : "E752C57E2BD5883E5CE1211B21FC5859B4520D90" }, { "b" : "7F17F4939000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A35053D76A6B7BD91D2EE58CC024D8EF697CE977" }, { "b" : "7F17F46CD000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58B33C1A58DAD354D36CB87FD14997F06BF1497D" }, { "b" : "7F17F42EA000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "71BC917ECEB443B79853AC793482A6BE9D468BC4" }, { "b" : "7F17F40E2000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "69BCB2B5FE6D85ACD898362EAC5EE79857DA4EC4" }, { "b" : "7F17F3EDE000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "266172B083F783BD94389BE55B0B371C17198268" }, { "b" : "7F17F3BD8000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "ED99110E629209C5CA6C0ED704F2C5CE3171513A" }, { "b" : "7F17F3954000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "A5F11596A3C6C24C2304A996DE37A1736C5715F9" }, { "b" : "7F17F373E000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A44499D29B114A5366CD72DD4883958495AC1C1D" }, { "b" : "7F17F33AA000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "22AA38CCA59A5DF6CF07B8FC1778E2EE0384508E" }, { "b" : "7F17F4B56000", "path" : "/lib64/ld-linux-x86-64.so.2", 
"elfType" : 3, "buildId" : "5BEB2450B75E84FF317C65F22AF8B8112C25DF63" }, { "b" : "7F17F3166000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "EFF68B7DE77D081BC4A0CB38FE9DCBC60541BF92" }, { "b" : "7F17F2E80000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "95EBB74C2C0A1E1714344036145A0239FFA4892D" }, { "b" : "7F17F2C7C000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "6ADE12F76961F73B33D160AC4D342222E7FC7A65" }, { "b" : "7F17F2A50000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "D02E7D3149950118009A81997434E28B7D9EC9B2" }, { "b" : "7F17F283A000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7F17F262F000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "5AFCBEA0D62EE0335714CCBAB7BA808E2A16028C" }, { "b" : "7F17F242C000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7F17F2212000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "C6DC308333BCC5E1DDD2A308F47AB0BFA318D1CC" }, { "b" : "7F17F1FF3000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "BAD5C71361DADF259B6E306A49E6F47F24AEA3DC" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf58429]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef69c1]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xedb021]
 mongod(+0x963086) [0xd63086]
 mongod(+0xF82250) [0x1382250]
 mongod(__wt_err+0x95) [0x1382515]
 mongod(__wt_panic+0x21) [0x13829b1]
 mongod(+0xF19CA3) [0x1319ca3]
 libpthread.so.0(+0x79D1) [0x7f17f49409d1]
 libc.so.6(clone+0x6D) [0x7f17f34928fd]
-----  END BACKTRACE  -----
2015-03-19T21:31:51.124+0100 I -
 
***aborting after fassert() failure
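
The WiredTiger error lines above can be decoded with a short Python sketch. It assumes the first bracketed pair is seconds:microseconds since the Unix epoch and the second is process id:thread id (this reading matches the mongod timestamp on the same line), and that a positive code in parentheses is a system errno, which agrees with the "Invalid argument" text. The regular expression is written for this log line only.

```python
import errno
import re
from datetime import datetime, timedelta, timezone

line = ("2015-03-19T21:31:51.107+0100 E STORAGE  WiredTiger (22) "
        "[1426797111:101032][15891:0x7f17eeded700], checkpoint-server: "
        "checkpoint server error: Invalid argument")

# Code in parentheses, then [seconds:microseconds][pid:thread id].
m = re.search(r"WiredTiger \((-?\d+)\) \[(\d+):(\d+)\]\[(\d+):(0x[0-9a-f]+)\]", line)
code, secs, usecs, pid = (int(m.group(i)) for i in range(1, 5))

# Interpret the first pair as a Unix timestamp, shown in the log's +0100 zone.
when = datetime.fromtimestamp(secs, tz=timezone(timedelta(hours=1)))
when = when.replace(microsecond=usecs)

print(code, errno.errorcode.get(code), pid, when.isoformat())
```

The negative code -31804 on the following log line is not an errno; the log itself labels it WT_PANIC.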

As MongoDB 3.0.1 is available, I will upgrade our replica set from 3.0.0 to 3.0.1, and try to perform the initial sync to a fresh/empty 3.0.1 secondary again.

Comment by Borut Hadzialic [ 19/Mar/15 ]

Starting the secondary again made it try to build the index again, but it then failed at the same point:

2015-03-19T13:07:18.006+0100 I -        [initandlisten]   Index Build: 75594400/113700591 66%
2015-03-19T13:07:21.051+0100 I -        [initandlisten]   Index Build: 75616400/113700591 66%
2015-03-19T13:07:24.012+0100 I -        [initandlisten]   Index Build: 75637100/113700591 66%
2015-03-19T13:07:27.006+0100 I -        [initandlisten]   Index Build: 75658200/113700591 66%
2015-03-19T13:07:42.231+0100 I -        [initandlisten]   Index Build: 75690900/113700591 66%
2015-03-19T13:07:42.427+0100 E STORAGE  [initandlisten] WiredTiger (0) [1426766862:422515][36841:0x7f4432bc4c20], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: encountered an illegal file format or internal value
2015-03-19T13:07:42.428+0100 E STORAGE  [initandlisten] WiredTiger (-31804) [1426766862:428104][36841:0x7f4432bc4c20], file:flowdev/collection-173-1201567201290982410.wt, cursor.next: the process must exit and restart: WT_PANIC: WiredTiger library panic
2015-03-19T13:07:42.428+0100 I -        [initandlisten] Fatal Assertion 28558
2015-03-19T13:07:42.457+0100 I CONTROL  [initandlisten]
 0xf58429 0xef69c1 0xedb021 0xd63086 0x1382250 0x1382515 0x13829b1 0x12e43c1 0x1325329 0xd69463 0xd69512 0x9f3092 0xbce524 0xbce8d4 0x91dbc0 0xaa57bc 0x7f3e80 0x7f8089 0x7f443122ed5d 0x7f0acd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B58429"},{"b":"400000","o":"AF69C1"},{"b":"400000","o":"ADB021"},{"b":"400000","o":"963086"},{"b":"400000","o":"F82250"},{"b":"400000","o":"F82515"},{"b":"400000","o":"F829B1"},{"b":"400000","o":"EE43C1"},{"b":"400000","o":"F25329"},{"b":"400000","o":"969463"},{"b":"400000","o":"969512"},{"b":"400000","o":"5F3092"},{"b":"400000","o":"7CE524"},{"b":"400000","o":"7CE8D4"},{"b":"400000","o":"51DBC0"},{"b":"400000","o":"6A57BC"},{"b":"400000","o":"3F3E80"},{"b":"400000","o":"3F8089"},{"b":"7F4431210000","o":"1ED5D"},{"b":"400000","o":"3F0ACD"}],"processInfo":{ "mongodbVersion" : "3.0.0", "gitVersion" : "a841fd6394365954886924a35076691b4d149168", "uname" : { "sysname" : "Linux", "release" : "2.6.32-504.3.3.el6.x86_64", "version" : "#1 SMP Fri Dec 12 16:05:43 EST 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "FFFD21B8C7EC4ADC196E65832A0B272803A7A4F5" }, { "b" : "7FFF7146E000", "elfType" : 3, "buildId" : "E752C57E2BD5883E5CE1211B21FC5859B4520D90" }, { "b" : "7F443279F000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A35053D76A6B7BD91D2EE58CC024D8EF697CE977" }, { "b" : "7F4432533000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "58B33C1A58DAD354D36CB87FD14997F06BF1497D" }, { "b" : "7F4432150000", "path" : "/usr/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "71BC917ECEB443B79853AC793482A6BE9D468BC4" }, { "b" : "7F4431F48000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "69BCB2B5FE6D85ACD898362EAC5EE79857DA4EC4" }, { "b" : "7F4431D44000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "266172B083F783BD94389BE55B0B371C17198268" }, { "b" : "7F4431A3E000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "ED99110E629209C5CA6C0ED704F2C5CE3171513A" }, { "b" : "7F44317BA000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "A5F11596A3C6C24C2304A996DE37A1736C5715F9" }, { "b" : "7F44315A4000", "path" : 
"/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "A44499D29B114A5366CD72DD4883958495AC1C1D" }, { "b" : "7F4431210000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "22AA38CCA59A5DF6CF07B8FC1778E2EE0384508E" }, { "b" : "7F44329BC000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5BEB2450B75E84FF317C65F22AF8B8112C25DF63" }, { "b" : "7F4430FCC000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "EFF68B7DE77D081BC4A0CB38FE9DCBC60541BF92" }, { "b" : "7F4430CE6000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "95EBB74C2C0A1E1714344036145A0239FFA4892D" }, { "b" : "7F4430AE2000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "6ADE12F76961F73B33D160AC4D342222E7FC7A65" }, { "b" : "7F44308B6000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "D02E7D3149950118009A81997434E28B7D9EC9B2" }, { "b" : "7F44306A0000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "D053BB4FF0C2FC983842F81598813B9B931AD0D1" }, { "b" : "7F4430495000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "5AFCBEA0D62EE0335714CCBAB7BA808E2A16028C" }, { "b" : "7F4430292000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "8A8734DC37305D8CC2EF8F8C3E5EA03171DB07EC" }, { "b" : "7F4430078000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "C6DC308333BCC5E1DDD2A308F47AB0BFA318D1CC" }, { "b" : "7F442FE59000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "BAD5C71361DADF259B6E306A49E6F47F24AEA3DC" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf58429]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef69c1]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xedb021]
 mongod(+0x963086) [0xd63086]
 mongod(+0xF82250) [0x1382250]
 mongod(__wt_err+0x95) [0x1382515]
 mongod(__wt_panic+0x21) [0x13829b1]
 mongod(__wt_btcur_next+0x2891) [0x12e43c1]
 mongod(+0xF25329) [0x1325329]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator8_getNextEv+0x73) [0xd69463]
 mongod(_ZN5mongo21WiredTigerRecordStore8Iterator7getNextEv+0x12) [0xd69512]
 mongod(_ZN5mongo14CollectionScan4workEPm+0x2B2) [0x9f3092]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbce524]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbce8d4]
 mongod(_ZN5mongo15MultiIndexBlock30insertAllDocumentsInCollectionEPSt3setINS_8RecordIdESt4lessIS2_ESaIS2_EE+0x130) [0x91dbc0]
 mongod(_ZN5mongo40restartInProgressIndexesFromLastShutdownEPNS_16OperationContextE+0x66C) [0xaa57bc]
 mongod(_ZN5mongo13initAndListenEi+0x1790) [0x7f3e80]
 mongod(main+0x139) [0x7f8089]
 libc.so.6(__libc_start_main+0xFD) [0x7f443122ed5d]
 mongod(+0x3F0ACD) [0x7f0acd]
-----  END BACKTRACE  -----
2015-03-19T13:07:42.457+0100 I -        [initandlisten]
 
***aborting after fassert() failure

I will drop the secondary data files and add it to the replica set again, to see if a fresh initial sync will work.
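
Both runs logged the same last progress line (75690900/113700591) before the panic, which is consistent with the scan hitting the same spot in the collection file each time. A small Python sketch for comparing the last "Index Build" line of two logs; the excerpts are copied from this ticket and the helper name `last_progress` is illustrative:

```python
import re

first_run = """\
2015-03-19T08:44:25.010+0100 I -        [rsSync]   Index Build: 75660700/113700591 66%
2015-03-19T08:44:38.168+0100 I -        [rsSync]   Index Build: 75690900/113700591 66%
"""
second_run = """\
2015-03-19T13:07:27.006+0100 I -        [initandlisten]   Index Build: 75658200/113700591 66%
2015-03-19T13:07:42.231+0100 I -        [initandlisten]   Index Build: 75690900/113700591 66%
"""

PROGRESS = re.compile(r"Index Build: (\d+)/(\d+)")

def last_progress(log_text):
    """Return (done, total) from the last 'Index Build' line, or None."""
    matches = PROGRESS.findall(log_text)
    if not matches:
        return None
    done, total = matches[-1]
    return int(done), int(total)

a = last_progress(first_run)
b = last_progress(second_run)
print(a, b, a == b)
```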
