[SERVER-9209] Invalid access at address: 0; Got signal: 11 (Segmentation fault) Created: 02/Apr/13  Updated: 07/Apr/23  Resolved: 26/Jun/14

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 2.4.1, 2.4.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Robert Beekman Assignee: Ramon Fernandez Marina
Resolution: Cannot Reproduce Votes: 1
Labels: core, crash, replicaset
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 12.04 64bit


Operating System: Linux
Participants:

 Description   

Mongod crashes with a signal 11 (Segmentation fault) while running a map/reduce job.

We run a replica set with two data nodes and an arbiter. The first node crashed yesterday at 19:30 and the second node took over, just like it should.

Unfortunately the second node crashed as well at 06:35 this morning.

Here are the stacktraces and errors:

node1:

Mon Apr  1 19:30:52.984 [conn15995] warning: log line attempted (36k) over max size(10k), printing beginning and end ... remove production.collection query: { time: { $lte: new Date(1364243452263) }, exception.name: null, _id: { $nin: [ ObjectId('51335684f079d40200151cdd'), ObjectId('5134bfb570f998020028865f'), ObjectId('5135b43de8f6d10300328b11'), ObjectId('51389cf33049049a855a0a98'), ObjectId('5139f578ec693e020000af77'), ObjectId('513dd5b9ccf3019ae12f03a6'), ObjectId('5141a5c83c7f2388c20eede1'), ObjectId('514322badcc1370200240ba0'), ObjectId('5146ef8e48696e030092f872'), ObjectId('51482188381c3f88c2b1441f'), ObjectId('5149d33c381c3f88c2da16c5'), ObjectId('514b03413c07050300f5662e'), ObjectId('514c32b81c1dec02000d56eb'), ObjectId('514dd285f49a2d0200023df1'), ObjectId('51504c8498ad800200404b17'), ObjectId('5141c66fd8723502001080f4'), ObjectId('5133572d18779b0200158429'), ObjectId('5134bfd1f4ca021ae427bd52'), ObjectId('5135b43bf4ca021ae4328ac2'), ObjectId('51389cf33049049a855a0ab7'), ObjectId('5139ee258479e6020000562a'), ObjectId('513dd5c19cab8802002c2c1d'), ObjectId('5141a5df10d1d00200106974'), ObjectId('514322e6386c07b2df25456c'), ObjectId('5146d6c0a06354020090a101'), ObjectId('51487bf5089b5f0200b6ac96'), ObjectId('5149acc964a403c4c2d64dcb'), ObjectId('514b03413c07050300f5664c'), ObjectId('514c55a1ac30260200114720'), ObjectId('514d9f29583c01729001f893'), ObjectId('5134c4a8a4feff0200280300'), ObjectId('5139bafda454045ce46337a1'), ObjectId('513dd6f0f8eb5002002f1509'), ObjectId('514c3a77403b01fadf0a2445'), ObjectId('5134510e10d2015ce4214311'), ObjectId('5137a04d3049049a85503cd5'), ObjectId('5138a688dc430520e4583410'), ObjectId('51399288ec1b066485644d7d'), ObjectId('513b113ca05100c8d80b7813'), ObjectId('513f5a7b1c150cc8d84024b8'), ObjectId('5140de65fcef08aadf07dbde'), ObjectId('5141cb7864a403c4c210b412'), ObjectId('5146e95e3c07050300960896'), ObjectId('5148088608c4770200ab35b1'), ObjectId('514c8fd8fcc0c702000061c7'), ObjectId('5150211b64880218913b448f'), 
ObjectId('51342944a454045ce41fcf82'), ObjectId('5133a48aa4feff0200197a27'), ObjectId('513477f138348c020023f8db'), ObjectId('5135e38970f998020035ad5d'), ObjectId('51376eb19c210303004be915'), ObjectId('513914e910d2015ce45d3cf2'), ObjectId('513a622a587409d0d805aa4e'), ObjectId('513b4cdfa05100c8d80e45d2'), ObjectId('513c71c6fcdc5a03001d6777'), ObjectId('513e3f4310f503d0e135188b'), ObjectId('513f5fb158190dc8d84072a5'), ObjectId('5140c288589e66020006352e'), ObjectId('5141cc5ddca5e402001255bf'), ObjectId('5143406f089b5f02002a4d39'), ObjectId('514488d3a0635402005375fa'), ObjectId('514647c7a456af94c28b5f17'), ObjectId('51471e82d8723502009b6421'), ObjectId('51488eae3c7f2388c2bcaf16'), ObjectId('5149ee5cd872350200dce94f'), ObjectId('514ae34fa063540200ed2504'), ObjectId('514c1f8e089b5f02000792a3'), ObjectId('514dde0350fbd4020003bcfc'), ObjectId('514f1fb5b88c7c02001fc641'), ObjectId('5150921190010148f9137af7'), ObjectId('5133a4a3f4ca021ae4197b65'), ObjectId('51348c3230beaf0200255c26'), ObjectId('513762952849045ce44b011c'), ObjectId('513914c0dc430520e45d3b61'), ObjectId('5139e90fdcb13703000006c7'), ObjectId('513b623150b49803000f2ae7'), ObjectId('513c834b587409d0d81c2f3a'), ObjectId('513d48c2f8eb500200287091'), ObjectId('513f5a796ccc4c020041ae0d'), ObjectId('51409d25ac862988c203d9a2'), ObjectId('514249aea063540200180c22'), ObjectId('5143b59de0376302003b5570'), ObjectId('514647c7a456af94c28b5f11'), ObjectId('51471e44d8723502009b6038'), ObjectId('51488e9e08c4770200b84e6c'), ObjectId('514a15ef403b01fadfdabf95'), ObjectId('51 .......... 
jectId('51332fb930beaf02001302ec'), ObjectId('51348b033049049a85254dff'), ObjectId('5135e7a82849045ce43551df'), ObjectId('51375fc3a8619402004c6f85'), ObjectId('5138dd46c0e80303005affc1'), ObjectId('5139dd92940e039a8567bfc7'), ObjectId('513b614e1c685b9ae1109ec3'), ObjectId('513cdad3ccf3019ae123f3d4'), ObjectId('513e0a8968318202002f6696'), ObjectId('514054693c07050300001cf3'), ObjectId('5141f33edca5e402001479e8'), ObjectId('5142b35cdcc13702001a57f4'), ObjectId('5144e360bccb3888c261f635'), ObjectId('5145e4b9bc25b202007f4085'), ObjectId('5146ed68589e66020092bede'), ObjectId('5148888b3c7f2388c2bc2bb0'), ObjectId('5149ca4afcef08aadfd3a412'), ObjectId('514b5988381c3f88c2fcbb6f'), ObjectId('514c0bdbfcef08aadf056810'), ObjectId('514df79a683759020005bf2b'), ObjectId('514eea35c47bb302001cbd5d'), ObjectId('51501109f06d4f020035d449'), ObjectId('513297efa454045ce40c7d70'), ObjectId('5134655fa454045ce422338d'), ObjectId('5135e514f079d402003523c6'), ObjectId('513739db64880f26e4459741'), ObjectId('5138a70820ceb40200583e57'), ObjectId('513a6052ccf3019ae105a9fd'), ObjectId('513b10100c1b11c8d80b66e8'), ObjectId('513c87ffccf3019ae11f0a94'), ObjectId('513e00a61c685b9ae1315b16'), ObjectId('513f452a582bd402003ef5fa'), ObjectId('5140533a3c07050300000258'), ObjectId('514225f5e4626602001686c1'), ObjectId('51436c67dca5e402003050d5'), ObjectId('51447117482401fadf505994'), ObjectId('51461914f06ab394c2876214'), ObjectId('5146ec77089b5f020092a37d'), ObjectId('5148a8f35831380200beba16'), ObjectId('5149811764a403c4c2d1a66b'), ObjectId('514a9533ac30260200eaed91'), ObjectId('514c2e4244435d020008e46a'), ObjectId('514e234aa4293502000a80c0'), ObjectId('514ee90530a50484f81a5def'), ObjectId('51504a9d209c7302003d3515'), ObjectId('5134813220ceb40200241c50'), ObjectId('5135e6801012ef020035d1eb'), ObjectId('51376a69242d9802004d92b8'), ObjectId('5138996270f998020059df07'), ObjectId('513f385880294502003feee6'), ObjectId('5140770108c4770200021728'), ObjectId('51419928482401fadf0fc309'), 
ObjectId('5142c13708c47702001b9779'), ObjectId('51475da4386c07b2df9df5ba'), ObjectId('51482d94fcef08aadfae8c2d'), ObjectId('514b0939bccb3888c2f6074c'), ObjectId('514c301944435d0200091122'), ObjectId('514dad1b40d3bc020000dd34'), ObjectId('51509937442da9030016d8c0'), ObjectId('513465759830005e8522dac6'), ObjectId('513603d21012ef0200377c13'), ObjectId('513748b6f079d4020047e0fc'), ObjectId('513862ed38348c0200579488'), ObjectId('5139dcd508208f020067b2ed'), ObjectId('513b72b1fcdc5a030011adb0'), ObjectId('513c881128b45703001ca2ec'), ObjectId('513d9832fcdc5a03002b9775'), ObjectId('513f30466ccc4c02003f831c'), ObjectId('5140568d18fac20200003e55'), ObjectId('514248593c7f2388c216c7d6'), ObjectId('51432085ec1d09b4df24e86d'), ObjectId('514455653c070503004cee41'), ObjectId('5147073ae462660200959335'), ObjectId('51481fed70d6290300b126e7'), ObjectId('5149c407482401fadfd2e3e1'), ObjectId('514b221770d6290300f83630'), ObjectId('514c32c948696e0300095c17'), ObjectId('514d377040d3051aa1101c46'), ObjectId('514f606e90258e020025c327'), ObjectId('5150769698b61d3e8909d613'), ObjectId('514c437d58313802000f4f96'), ObjectId('514f58d71c1d003291289d63'), ObjectId('51503519a4293502003d25e5'), ObjectId('514b0caf1c1dec0200f64f12'), ObjectId('514c774b381c3f88c214a842'), ObjectId('515017ab28437303003a2dad'), ObjectId('514c669f7c4b31020012db5e'), ObjectId('515010facc9901020035d319') ] } } ndeleted:2 keyUpdates:0 numYields: 3 locks(micros) w:662836 371ms
Mon Apr  1 19:30:53.925 Invalid access at address: 0 from thread: conn15995
 
Mon Apr  1 19:30:53.929 Got signal: 11 (Segmentation fault).
 
Mon Apr  1 19:30:53.955 Backtrace:
0xdc7f71 0x6ce459 0x6ce9e2 0x7f4d30ce1cb0 0x6f74e3 0xd68613 0xd70ffb 0x87f888 0x898940 0x8cd0ca 0x8cfce5 0x8d1212 0xa7893b 0xa7c320 0x9ef5d4 0x9f08f2 0x6f0350 0xdb478e 0x7f4d30cd9e9a 0x7f4d2ffeccbd
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdc7f71]
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6ce459]
 /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x6ce9e2]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7f4d30ce1cb0]
 /usr/bin/mongod(_ZNK5mongo7BSONObj4copyEv+0x23) [0x6f74e3]
 /usr/bin/mongod(_ZN5mongo7V8Scope11mongoToLZV8ERKNS_7BSONObjEb+0x3b3) [0xd68613]
 /usr/bin/mongod(_ZN5mongo7V8Scope6invokeEyPKNS_7BSONObjES3_ibbb+0x1fb) [0xd70ffb]
 /usr/bin/mongod(_ZN5mongo2mr8JSMapper3mapERKNS_7BSONObjE+0x68) [0x87f888]
 /usr/bin/mongod(_ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x930) [0x898940]
 /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0x8cd0ca]
 /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x705) [0x8cfce5]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x5e2) [0x8d1212]
 /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x3b) [0xa7893b]
 /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xd50) [0xa7c320]
 /usr/bin/mongod() [0x9ef5d4]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x392) [0x9f08f2]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x90) [0x6f0350]
 /usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdb478e]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7f4d30cd9e9a]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f4d2ffeccbd]

node2:

Tue Apr  2 06:35:49.383 [conn12369] command production.$cmd command: { mapreduce: "collection", map: "
      function() {
        var key = {
          s: this.site_id,
          a: this.action,
          d: new Date(
            this.time.getFullYear(...", reduce: "
        function(key, values) {
          var r = {duration: 0.0, id: 0};
          values.forEach(function(v) {
            if(v.duration > r.durati...", query: { time: { $lt: new Date(1364870149275), $gte: new Date(1364283349275) }, exception.name: null }, out: { inline: 1 } } ntoreturn:1 keyUpdates:0 numYields: 2 locks(micros) r:153430 reslen:36963 105ms
Tue Apr  2 06:35:49.956 Invalid access at address: 0 from thread: conn12369
 
Tue Apr  2 06:35:49.958 Got signal: 11 (Segmentation fault).
 
Tue Apr  2 06:35:50.019 Backtrace:
0xdc7f71 0x6ce459 0x6ce9e2 0x7ffcf32f1cb0 0x6f74e3 0xd68613 0xd70ffb 0x87f888 0x898940 0x8cd0ca 0x8cfce5 0x8d1212 0xa7893b 0xa7c320 0x9ef5d4 0x9f08f2 0x6f0350 0xdb478e 0x7ffcf32e9e9a 0x7ffcf25fccbd
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdc7f71]
 /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6ce459]
 /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x6ce9e2]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x7ffcf32f1cb0]
 /usr/bin/mongod(_ZNK5mongo7BSONObj4copyEv+0x23) [0x6f74e3]
 /usr/bin/mongod(_ZN5mongo7V8Scope11mongoToLZV8ERKNS_7BSONObjEb+0x3b3) [0xd68613]
 /usr/bin/mongod(_ZN5mongo7V8Scope6invokeEyPKNS_7BSONObjES3_ibbb+0x1fb) [0xd70ffb]
 /usr/bin/mongod(_ZN5mongo2mr8JSMapper3mapERKNS_7BSONObjE+0x68) [0x87f888]
 /usr/bin/mongod(_ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x930) [0x898940]
 /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0x8cd0ca]
 /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x705) [0x8cfce5]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x5e2) [0x8d1212]
 /usr/bin/mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x3b) [0xa7893b]
 /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xd50) [0xa7c320]
 /usr/bin/mongod() [0x9ef5d4]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x392) [0x9f08f2]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x90) [0x6f0350]
 /usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xdb478e]
 /lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7ffcf32e9e9a]
 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ffcf25fccbd]



 Comments   
Comment by Ramon Fernandez Marina [ 26/Jun/14 ]

All, since we haven't heard back for a while I'm going to mark this ticket as resolved. If this is still an issue, feel free to reopen and please provide additional information so we can try to reproduce on our end.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 05/Jun/14 ]

krauss, I'm unable to reproduce with some sample documents. Is it possible for you to send us your dataset? If not, can you at least send us one of your documents to see how they look?

db.collection.findOne()

Also, the output of db.collection.stats() could be useful in trying to make progress on this issue.

Regards,
Ramón.

Comment by Kevin Krauss [ 12/Feb/14 ]

Still happening

Wed Feb 12 11:21:49.327 [conn17] auth: couldn't find user yogi@yogi_berra, yogi_berra.system.users
Wed Feb 12 11:36:48.282 [conn15] end connection 127.0.0.1:49873 (1 connection now open)
Wed Feb 12 11:37:22.812 [conn17] end connection 127.0.0.1:49977 (0 connections now open)
Wed Feb 12 14:32:36.572 [initandlisten] connection accepted from 127.0.0.1:51840 #18 (1 connection now open)
Wed Feb 12 14:52:18.262 Invalid access at address: 0x10 from thread: conn18

Wed Feb 12 14:52:18.262 Got signal: 11 (Segmentation fault: 11).

Wed Feb 12 14:52:18.282 Backtrace:
0x1011c19e0 0x100cd227d 0x100cd25b8 0x7fff92d6a5aa 0x10f51b368 0x1012f4097 0x1013c1699 0x1013c1501 0x10117b5cb 0x10117b48f 0x10117636a 0x1011757cf 0x100e1d675 0x100e270a7 0x100e4b055 0x100e4c013 0x100e4cdf6 0x100f6104d 0x100f67468 0x100f0492a
0 mongod 0x00000001011c19e0 _ZN5mongo15printStackTraceERSo + 64
1 mongod 0x0000000100cd227d _ZN5mongo10abruptQuitEi + 397
2 mongod 0x0000000100cd25b8 _ZN5mongo24abruptQuitWithAddrSignalEiP9__siginfoPv + 344
3 libsystem_platform.dylib 0x00007fff92d6a5aa _sigtramp + 26
4 ??? 0x000000010f51b368 0x0 + 4551979880
5 mongod 0x00000001012f4097 _ZN2v88internal15DeoptimizerDataD1Ev + 55
6 mongod 0x00000001013c1699 _ZN2v88internal7Isolate6DeinitEv + 105
7 mongod 0x00000001013c1501 _ZN2v88internal7Isolate8TearDownEv + 81
8 mongod 0x000000010117b5cb _ZN5mongo7V8ScopeD2Ev + 267
9 mongod 0x000000010117b48f _ZN5mongo7V8ScopeD0Ev + 15
10 mongod 0x000000010117636a _ZN5mongo11PooledScopeD2Ev + 842
11 mongod 0x00000001011757cf _ZN5mongo11PooledScopeD0Ev + 15
12 mongod 0x0000000100e1d675 _ZN5mongo2mr5StateD2Ev + 341
13 mongod 0x0000000100e270a7 _ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 6455
14 mongod 0x0000000100e4b055 _ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 37
15 mongod 0x0000000100e4c013 _ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb + 2915
16 mongod 0x0000000100e4cdf6 _ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 886
17 mongod 0x0000000100f6104d _ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 45
18 mongod 0x0000000100f67468 _ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_ + 1112
19 mongod 0x0000000100f0492a _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 1338

Comment by Kevin Krauss [ 19/Dec/13 ]

This is still happening on 2.4.8 on Mac OS X. Please fix!

/mongodb", journal: "true", port: 27017, rest: "true", smallfiles: "true" }
Wed Dec 18 15:56:19.762 [initandlisten] journal dir=/usr/local/var/mongodb/journal
Wed Dec 18 15:56:19.762 [initandlisten] recover begin
Wed Dec 18 15:56:19.762 [initandlisten] recover lsn: 0
Wed Dec 18 15:56:19.762 [initandlisten] recover /usr/local/var/mongodb/journal/j._0
Wed Dec 18 15:56:19.763 [initandlisten] recover cleaning up
Wed Dec 18 15:56:19.763 [initandlisten] removeJournalFiles
Wed Dec 18 15:56:19.771 [initandlisten] recover done
Wed Dec 18 15:56:19.772 [initandlisten] preallocating a journal file /usr/local/var/mongodb/journal/prealloc.0
Wed Dec 18 15:56:19.950 [websvr] admin web console waiting for connections on port 28017
Wed Dec 18 15:56:19.950 [initandlisten] waiting for connections on port 27017
Wed Dec 18 15:56:22.499 [initandlisten] connection accepted from 127.0.0.1:51656 #1 (1 connection now open)
Wed Dec 18 16:16:45.377 [conn1] end connection 127.0.0.1:51656 (0 connections now open)
Thu Dec 19 08:16:27.430 [initandlisten] connection accepted from 127.0.0.1:53431 #2 (1 connection now open)
Thu Dec 19 08:17:45.014 [initandlisten] connection accepted from 127.0.0.1:53467 #3 (2 connections now open)
Thu Dec 19 08:44:05.201 [conn2] end connection 127.0.0.1:53431 (1 connection now open)
Thu Dec 19 08:44:05.201 [conn3] end connection 127.0.0.1:53467 (0 connections now open)
Thu Dec 19 09:44:59.471 [initandlisten] connection accepted from 127.0.0.1:54162 #4 (1 connection now open)
Thu Dec 19 09:59:59.860 [conn4] end connection 127.0.0.1:54162 (0 connections now open)
Thu Dec 19 11:48:54.714 [initandlisten] connection accepted from 127.0.0.1:55933 #5 (1 connection now open)
Thu Dec 19 12:14:20.490 [conn5] end connection 127.0.0.1:55933 (0 connections now open)
Thu Dec 19 13:42:20.758 [initandlisten] connection accepted from 127.0.0.1:57122 #6 (1 connection now open)
Thu Dec 19 13:49:11.769 [initandlisten] connection accepted from 127.0.0.1:57214 #7 (2 connections now open)
Thu Dec 19 14:03:35.001 Invalid access at address: 0x10 from thread: conn7

Thu Dec 19 14:03:35.001 Got signal: 11 (Segmentation fault: 11).

Thu Dec 19 14:03:35.005 Backtrace:
0x10ce589e0 0x10c96927d 0x10c9695b8 0x7fff985025aa 0 0x10cf8b097 0x10d058699 0x10d058501 0x10ce125cb 0x10ce1248f 0x10ce0d36a 0x10ce0c7cf 0x10cab4675 0x10cabe0a7 0x10cae2055 0x10cae3013 0x10cae3df6 0x10cbf804d 0x10cbfe468 0x10cb9b92a
0 mongod 0x000000010ce589e0 _ZN5mongo15printStackTraceERSo + 64
1 mongod 0x000000010c96927d _ZN5mongo10abruptQuitEi + 397
2 mongod 0x000000010c9695b8 _ZN5mongo24abruptQuitWithAddrSignalEiP9__siginfoPv + 344
3 libsystem_platform.dylib 0x00007fff985025aa _sigtramp + 26
4 ??? 0x0000000000000000 0x0 + 0
5 mongod 0x000000010cf8b097 _ZN2v88internal15DeoptimizerDataD1Ev + 55
6 mongod 0x000000010d058699 _ZN2v88internal7Isolate6DeinitEv + 105
7 mongod 0x000000010d058501 _ZN2v88internal7Isolate8TearDownEv + 81
8 mongod 0x000000010ce125cb _ZN5mongo7V8ScopeD2Ev + 267
9 mongod 0x000000010ce1248f _ZN5mongo7V8ScopeD0Ev + 15
10 mongod 0x000000010ce0d36a _ZN5mongo11PooledScopeD2Ev + 842
11 mongod 0x000000010ce0c7cf _ZN5mongo11PooledScopeD0Ev + 15
12 mongod 0x000000010cab4675 _ZN5mongo2mr5StateD2Ev + 341
13 mongod 0x000000010cabe0a7 _ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 6455
14 mongod 0x000000010cae2055 _ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 37
15 mongod 0x000000010cae3013 _ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb + 2915
16 mongod 0x000000010cae3df6 _ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 886
17 mongod 0x000000010cbf804d _ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 45
18 mongod 0x000000010cbfe468 _ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_ + 1112
19 mongod 0x000000010cb9b92a _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 1338

Here are my map/reduce functions, written as JavaScript inside Ruby strings:

map = %Q{
  function() {
    emit(this.error_message,
         { count: 1, date: this.created_at, id: this._id });
  }
}

reduce = %Q{
  function(key, values) {
    var result = {count: 0, dates: [], ids: []};
    values.forEach(function(value) {
      if (value.date) {
        result.count += value.count;
        result.dates.push(value.date.valueOf());
        result.ids.push(value.id);
      }
    });
    return result;
  }
}

map = %Q{
  function() {
    emit(this.backtraces[0],
         { count: 1, date: this.created_at, id: this._id, remote_address: this.remote_address, user_agent: this.user_agent });
  }
}

reduce = %Q{
  function(key, values) {
    var result = {count: 0, dates: [], ids: [], agents: {}, ips: {}};
    values.forEach(function(value) {
      if (value.date) {
        result.count += value.count;
        result.dates.push(value.date.valueOf());
        result.ids.push(value.id);
        if (result.ips.hasOwnProperty(value.remote_address)) {
          result.ips[value.remote_address] += 1;
        } else {
          result.ips[value.remote_address] = 1;
        }
        if (result.agents.hasOwnProperty(value.user_agent)) {
          result.agents[value.user_agent] += 1;
        } else {
          result.agents[value.user_agent] = 1;
        }
      }
    });
    return result;
  }
}
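
A side note that may well be unrelated to the crash itself: MongoDB's mapReduce documentation requires that a reduce function can be applied to its own output, because the server may call reduce more than once for the same key. That means the value returned by reduce should have the same shape as the value emitted by map. Both reduce functions above return arrays (dates, ids) while map emits single values (date, id), so a partial result has no date field and is skipped on a second pass. The sketch below illustrates this in plain JavaScript, runnable in Node.js without a server, using the first, simpler pair. The names mapValue, reduceOriginal, mapValueReReducible, and reduceReReducible and the sample documents are hypothetical and introduced only for illustration; this is one possible re-reducible variant, not a confirmed fix for the segmentation fault.

```javascript
// Sketch only: plain JavaScript, no MongoDB server involved.
// All function names and sample documents here are hypothetical.

// Value shape emitted by the first map function: { count, date, id }.
function mapValue(doc) {
  return { count: 1, date: doc.created_at, id: doc._id };
}

// The first reduce function as posted: it returns { count, dates, ids },
// which is a different shape from what map emits.
function reduceOriginal(key, values) {
  var result = { count: 0, dates: [], ids: [] };
  values.forEach(function (value) {
    if (value.date) {
      result.count += value.count;
      result.dates.push(value.date.valueOf());
      result.ids.push(value.id);
    }
  });
  return result;
}

// Assumed variant: map already emits arrays, so the emitted value and the
// reduced value share one shape. Documents without a date contribute nothing,
// mirroring the `if (value.date)` check in the original reduce.
function mapValueReReducible(doc) {
  var hasDate = Boolean(doc.created_at);
  return {
    count: hasDate ? 1 : 0,
    dates: hasDate ? [doc.created_at.valueOf()] : [],
    ids: hasDate ? [doc._id] : []
  };
}

// Merging is now the same operation for fresh and partially reduced values.
function reduceReReducible(key, values) {
  var result = { count: 0, dates: [], ids: [] };
  values.forEach(function (value) {
    result.count += value.count;
    result.dates = result.dates.concat(value.dates);
    result.ids = result.ids.concat(value.ids);
  });
  return result;
}

var docs = [
  { _id: "a", created_at: new Date(1000) },
  { _id: "b", created_at: new Date(2000) },
  { _id: "c", created_at: new Date(3000) }
];

// Simulate a re-reduce: reduce the first two values, then feed that partial
// result back into reduce together with the third value.
var emitted = docs.map(mapValue);
var partial = reduceOriginal("k", emitted.slice(0, 2));
var reReducedOriginal = reduceOriginal("k", [partial, emitted[2]]);

var emittedFixed = docs.map(mapValueReReducible);
var partialFixed = reduceReReducible("k", emittedFixed.slice(0, 2));
var reReducedFixed = reduceReReducible("k", [partialFixed, emittedFixed[2]]);
var singlePassFixed = reduceReReducible("k", emittedFixed);

console.log(reReducedOriginal.count); // 1: the partial result was dropped
console.log(reReducedFixed.count);    // 3: same as reducing everything at once
```

Whether the server actually re-reduces depends on how many values it sees per key, so the original functions can appear to work on small inputs and silently undercount on larger ones.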

It happens roughly 4 out of 10 times. It happens with the mongo shell, with the mongo Ruby driver, and with Mongoid, which uses the Moped driver.

Here is my mongo.conf

    # Store data in /usr/local/var/mongodb instead of the default /data/db
    dbpath = /usr/local/var/mongodb
    port = 27017
    # Only accept local connections
    bind_ip = 127.0.0.1
    journal = true
    smallfiles = true
    rest = true

This is also referenced in this comment: https://jira.mongodb.org/browse/SERVER-4441?focusedCommentId=469984&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-469984

Comment by Ruslan Sologub [ 09/May/13 ]

Hi Tad,

Unfortunately I can't post the stack trace or log, but the server was only doing regular inserts and updates, nothing special and no map/reduce. The version was 2.4.1 when it crashed; it is now 2.4.3. I'll report back if this one crashes too.

P.S. We use SSH tunnels to link all the replica members.

Ruslan

Comment by Tad Marshall [ 05/May/13 ]

Hi Ruslan,

Can you post the stack trace from your crash, with a few dozen lines from before the crash for context (i.e. what was going on when the crash happened), please?

Alternatively, could you attach the log file?

What version of MongoDB are you running?

Tad

Comment by Ruslan Sologub [ 05/May/13 ]

We got this error too on our production server, but we use no map/reduce at all. We have a five-node replica set, two members of which are hidden and usually down. The error was thrown by a secondary member.

Comment by Ben Becker [ 10/Apr/13 ]

Hi Robert,

Could you attach a few additional pieces of information?

  1. The MapReduce script and command used to execute the script
  2. Logs that start from the time the MapReduce job began through the crash
  3. A dump of the dataset used for this MapReduce job

For the dump, if you could use the mongodump command with the query specifier used in the MapReduce jobs, that would be extremely helpful. If this data is sensitive, I would be happy to provide a secure SCP server you can upload to.

Comment by Robert Beekman [ 02/Apr/13 ]

Just some additional info.

We run the same map/reduce job with the same data on another host that runs MongoDB 2.2.3, and it has no issues at all.
