[SERVER-18762] Mongo 3.0 crashes while replicating map reduce collections Created: 01/Jun/15  Updated: 15/Nov/21  Resolved: 18/Jun/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Amit Karan Assignee: Ramon Fernandez Marina
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File ireland-mongo-crash.tar     File primary.crash.log.gz     File secondary.crash.log.gz     File tokyo-crash-logs.tar    
Issue Links:
Duplicate
duplicates SERVER-17923 Creating/dropping multiple background... Closed
Operating System: ALL
Steps To Reproduce:

1. Set up a primary DB with enough data (my DB is 200GB+)
2. Create a map reduce process using temp collections
3. Keep running the map reduce every few mins - enough to
4. Add a secondary to the replica set

Participants:

 Description   

MongoDB keeps crashing while replicating in the STARTUP2 phase. The primary is running MongoDB 2.6.9. The newly added replica set members are running 3.0.3 with the WiredTiger storage engine.

It seems to crash while creating an index for the temporary collections used for map reduce.

----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B6D609"},{"b":"400000","o":"B0D341"},{"b":"400000","o":"AF1EC1"},{"b":"400000","o":"6C698D"},{"b":"400000","o":"AF4B00"},{"b":"400000","o":"BBAB64"},{"b":"7F41BD8AB000","o":"7DF3"},{"b":"7F41BC270000","o":"F61AD"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "3.14.42-31.38.amzn1.x86_64", "version" : "#1 SMP Wed May 13 20:33:05 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "D5BA38CEF0AE14E7FFF3FA98A110C841B25EAA21" }, { "b" : "7FFD06E66000", "elfType" : 3, "buildId" : "C7F24184C312347AAE8F28C92AC21288D3975482" }, { "b" : "7F41BD8AB000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "D48D3E6672A77B603B402F661BABF75E90AD570B" }, { "b" : "7F41BD63E000", "path" : "/usr/lib64/libssl.so.10", "elfType" : 3, "buildId" : "F711D67FF0C1FE2222FB003A30AB74DA26A5EF41" }, { "b" : "7F41BD259000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "777069F5EECC26CD66C5C8390FA2BF4E444979D1" }, { "b" : "7F41BD051000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "E81013CBFA409053D58A65A0653271AB665A4619" }, { "b" : "7F41BCE4D000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "62A8842157C62F95C3069CBF779AFCC26577A99A" }, { "b" : "7F41BCB49000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "DD6383EEAC49E9BAA9E3D1080AE932F42CF8A385" }, { "b" : "7F41BC847000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "5F97F8F8E5024E29717CF35998681F84D4A22D45" }, { "b" : "7F41BC631000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "C52958E393BDF8E8D090F36DE0F4E620D8736FBF" }, { "b" : "7F41BC270000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "A14FC690F08FB799BA8CC82D49DE9AA9D4580464" }, { "b" : "7F41BDAC7000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : 
"6F90843B9087FE91955FEB0355EB0858EF9E97B2" }, { "b" : "7F41BC02D000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "DE5A9F7A11A0881CB64E375F4DDCA58028F0FAF8" }, { "b" : "7F41BBD48000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "A3E43FC66908AC8B00773707FECA3B1677AFF311" }, { "b" : "7F41BBB45000", "path" : "/usr/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "622F315EB5CB2F791E9B64020692EBA98195D06D" }, { "b" : "7F41BB91A000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "B10FBFEC246C4EAD1719D16090D0BE54904BBFC9" }, { "b" : "7F41BB704000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "89C6AF118B6B4FB6A73AE1813E2C8BDD722956D1" }, { "b" : "7F41BB4F9000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "7292C0673D7C116E3389D3FFA67087A6B9287A71" }, { "b" : "7F41BB2F6000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "37A58210FA50C91E09387765408A92909468D25B" }, { "b" : "7F41BB0DC000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "6A7DA1CED90F65F27CB7B5BACDBB1C386C05F592" }, { "b" : "7F41BAEBB000", "path" : "/usr/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "F5054DC94443326819FBF3065CFDF5E4726F57EE" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf6d609]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf0d341]
 mongod(_ZN5mongo13fassertFailedEi+0x61) [0xef1ec1]
 mongod(_ZN5mongo12IndexBuilder3runEv+0x57D) [0xac698d]
 mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x120) [0xef4b00]
 mongod(+0xBBAB64) [0xfbab64]
 libpthread.so.0(+0x7DF3) [0x7f41bd8b2df3]
 libc.so.6(clone+0x6D) [0x7f41bc3661ad]
-----  END BACKTRACE  -----
2015-06-01T09:29:51.479+0000 I -        [repl index builder 626]
 
***aborting after fassert() failure
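
For readers cross-checking the backtrace above, here is a minimal Python sketch (not part of the original report) showing how the JSON frames relate to the symbolized lines: each absolute address is the module base "b" plus the offset "o", and the mangled names follow the simple _ZN...E nested-name form. The helper names absolute_address and demangle_nested are hypothetical, and the demangler deliberately handles only that simple form and ignores argument types; a real tool such as c++filt is the better choice for anything more complex.

```python
import re

# Frames copied from the "backtrace" array in the log above.
# "b" is the module load base and "o" the offset into it, both in hex.
FRAMES = [
    {"b": "400000", "o": "B6D609"},
    {"b": "400000", "o": "B0D341"},
    {"b": "400000", "o": "AF1EC1"},
    {"b": "400000", "o": "6C698D"},
    {"b": "400000", "o": "AF4B00"},
    {"b": "400000", "o": "BBAB64"},
    {"b": "7F41BD8AB000", "o": "7DF3"},
    {"b": "7F41BC270000", "o": "F61AD"},
]


def absolute_address(frame):
    """Return base + offset, i.e. the address shown in [brackets] in the log."""
    return int(frame["b"], 16) + int(frame["o"], 16)


def demangle_nested(symbol):
    """Demangle only the simple _ZN<len><name>...E form (hypothetical helper).

    Returns a '::'-joined qualified name without argument types, or None
    when the symbol does not match this simple form.
    """
    if not symbol.startswith("_ZN"):
        return None
    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        digits = re.match(r"\d+", symbol[pos:]).group()
        length = int(digits)
        pos += len(digits)
        parts.append(symbol[pos:pos + length])
        pos += length
    # A nested name must be terminated by 'E'.
    if pos >= len(symbol) or symbol[pos] != "E":
        return None
    return "::".join(parts)


if __name__ == "__main__":
    for frame in FRAMES:
        print(hex(absolute_address(frame)))
    print(demangle_nested("_ZN5mongo12IndexBuilder3runEv"))
```

Applied to the fourth frame, this yields the address of the IndexBuilder::run frame, which is where the fassert fired in the "repl index builder" thread.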



 Comments   
Comment by Ramon Fernandez Marina [ 18/Jun/15 ]

akaran, I'm convinced this ticket is a duplicate of SERVER-17923, so I'm going to close this ticket. SERVER-17923 was fixed in 3.0.4, which was recently released, so I'd suggest you upgrade to 3.0.4 at your earliest convenience and confirm that it solves your problem.

Thanks,
Ramón.

Comment by Ramon Fernandez Marina [ 08/Jun/15 ]

akaran@mobeam.com, I believe you're running into SERVER-17923, which has been fixed but not released yet. We're in the process of releasing a 3.0.4-rc0 release candidate, which should be available for download in a day or two.

If you can't wait for 3.0.4-rc0, I can point you to a nightly build on the v3.0 branch that includes a fix for this bug; otherwise I'd recommend you download and install 3.0.4-rc0 when it becomes available, and once you've confirmed that this bug is fixed you can move to the upcoming 3.0.4 release which, if everything goes according to plan, should be available in a week or two.

Regards,
Ramón.

Comment by Amit Karan [ 08/Jun/15 ]

We had another crash. Here are log files from primary and secondary DBs - primary.crash.log & secondary.crash.log.

Our Primary DB has also been upgraded to 3.0.3 with WiredTiger storage.

Comment by Amit Karan [ 01/Jun/15 ]

I have attached the crash logs from the secondary servers in two data centers. I am unable to attach the logs from the primary DB (around 8 GB) and another secondary (around 900 MB) that were being used to replicate data. I will post a Dropbox link here.

Comment by Ramon Fernandez Marina [ 01/Jun/15 ]

akaran, can you please upload full logs from startup until the moment you see the backtrace above for both your primary and the secondary that crashes?

Thanks,
Ramón.

Generated at Thu Feb 08 03:48:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.