[SERVER-6557] when running 2 or more MapReduce in parallel mongo crash Created: 23/Jul/12  Updated: 31/Jul/12  Resolved: 31/Jul/12

Status: Closed
Project: Core Server
Component/s: Concurrency, MapReduce
Affects Version/s: 2.2.0-rc0
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: izek greenfield Assignee: siddharth.singh@10gen.com
Resolution: Cannot Reproduce Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

linux RH 5.5


Operating System: Linux
Participants:

 Description   

Sun Jul 22 13:04:32 [conn5] CMD: drop hoursDatabase.tmp.mr.minuteid_instanceid_90
Sun Jul 22 13:04:32 [conn5] CMD: drop hoursDatabase.tmp.mr.minuteid_instanceid_90
Sun Jul 22 13:04:32 [conn5] CMD: drop hoursDatabase.tmp.mr.minuteid_instanceid_90_inc
Sun Jul 22 13:04:32 [conn6] build index minutesDatabase.tmp.mr.secondid_instanceid_status_91_inc { 0: 1 }
Sun Jul 22 13:04:32 Invalid access at address: 0xc0 from thread: conn5

Sun Jul 22 13:04:32 Got signal: 11 (Segmentation fault).

Sun Jul 22 13:04:32 [conn6] build index done. scanned 0 total records. 0 secs
Sun Jul 22 13:04:32 [conn6] CMD: drop minutesDatabase.tmp.mr.secondid_instanceid_status_91
Sun Jul 22 13:04:32 [conn6] build index minutesDatabase.tmp.mr.secondid_instanceid_status_91 { _id: 1 }
Sun Jul 22 13:04:32 [conn6] build index done. scanned 0 total records. 0 secs
Sun Jul 22 13:04:32 Backtrace:
0x6236d1 0x5516b9 0x551c42 0x35c8e0e4c0 0xbb2d15 0xbaa638 0xbafefb 0x5b7600 0x7780f1 0x77923d 0x77a1cc 0x757be8 0x75b3f1 0x67284d 0x673b92 0x56a392 0xabe091 0x35c8e06367 0x35c82d2f7d
./mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x6236d1]
./mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5516b9]
./mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x551c42]
/lib64/libpthread.so.0 [0x35c8e0e4c0]
./mongod(_ZNSt8_Rb_treeIN5mongo8ByLocKeyESt4pairIKS1_PNS0_12ClientCursorEESt10_Select1stIS6_ESt4lessIS1_ESaIS6_EE5eraseERS3_+0x15) [0xbb2d15]
./mongod(_ZN5mongo12ClientCursor17setLastLoc_inlockENS_7DiskLocE+0x138) [0xbaa638]
./mongod(_ZN5mongo12ClientCursorD1Ev+0x4b) [0xbafefb]
./mongod(_ZN5mongo2mr16MapReduceCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0xbd0) [0x5b7600]
./mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRNS_14BSONObjBuilderEb+0x51) [0x7780f1]
./mongod(_ZN5mongo11execCommandEPNS_7CommandERNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xd0d) [0x77923d]
./mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x2ac) [0x77a1cc]
./mongod(_ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x38) [0x757be8]
./mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0xbc1) [0x75b3f1]
./mongod [0x67284d]
./mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0x673b92]
./mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x82) [0x56a392]
./mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0xabe091]
/lib64/libpthread.so.0 [0x35c8e06367]
/lib64/libc.so.6(clone+0x6d) [0x35c82d2f7d]



 Comments   
Comment by Ian Whalen (Inactive) [ 31/Jul/12 ]

@izek, I'm closing this ticket for now, but please reopen with detailed steps to reproduce if you can.

Comment by siddharth.singh@10gen.com [ 25/Jul/12 ]

Hi,

I tried running mapreduce in parallel but could not reproduce the crash. Can you please post the steps to reproduce this?

Thanks.

Comment by izek greenfield [ 24/Jul/12 ]

sample doc:
{
"_id" : ObjectId("500d1aa0c52b935b44ebf6ee"),
"compInstallPath" : "/opt/orland/auditServer",
"compInstanceName" : "ppsInstance1",
"compName" : "pps",
"compVersion" : "1.0.0-1",
"flowContext" : "fx1234",
"name" : "kuku",
"elapsedTime" : NumberLong(17),
"instanceId" : NumberLong(-903099689),
"recordTime" : ISODate("2012-07-23T09:34:24.418Z"),
"day_id" : ISODate("2012-07-22T21:00:00Z"),
"hour_id" : ISODate("2012-07-23T09:00:00Z"),
"minute_id" : ISODate("2012-07-23T09:34:00Z"),
"second_id" : ISODate("2012-07-23T09:34:24Z"),
"distinctContentID" : "c1",
"status" : 303,
"distinctContentInstanceID" : "c3",
"householdId" : "householdId-1",
"authorizationId" : "abcd-1",
"lastTime" : "2011-09-02T13:12:59Z",
"profile" : "abe"
}

commands:
{
"mapreduce" : "raw_data",
"out" : { "merge" : "secondid_instanceid_name_status", "db" : "secondsDatabase" },
"query" : { "second_id" : { $gte: new Date(1343109120000) }, $and: [ { "second_id" : { $lt: new Date(1343109180000) } } ] },
"map" : "function Map() { emit( {time_id : this.second_id, instance_id : this.instanceId, name : this.name}, {count: 1, elapsedTime: this.elapsedTime, status : this.status, failedCount: 1 , sucssesCount: 1} ); }",
"reduce" : "function Reduce(key, values) { var reduced = {count:0, elapsedTime:0}; values.forEach(function(val) { reduced.elapsedTime += val.elapsedTime; reduced.count += val.count; } ); return reduced; }",
"finalize" : "function Finalize(key, reduced) { reduced.avgElapsedTime = reduced.elapsedTime / reduced.count; return reduced;}",
"sort" : { "second_id" : 1 , "instanceId" : 1, "name" : 1 , "status" : 1}
}

In my environment it happens every time I start it. When I start with an empty collection it happens immediately; if the collection has data it takes more time.
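
For readers who want to see what the map, reduce, and finalize functions in the command above compute, here is a minimal local sketch in plain JavaScript. It never talks to MongoDB and does not reproduce the crash. The function bodies are copied from the command strings; `emit`, `runLocally`, and the two sample documents are hypothetical stand-ins (the documents are trimmed to the fields the map function reads, with `second_id` simplified to a string). The sketch assumes the documented server behavior that reduce is not called for a key with only one value.

```javascript
// Local sketch only: emit() and runLocally() are hypothetical stand-ins for
// what the server does during mapreduce. Nothing here contacts MongoDB.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Bodies copied from the "map", "reduce" and "finalize" strings in the command.
function mapFn() {
  emit(
    { time_id: this.second_id, instance_id: this.instanceId, name: this.name },
    { count: 1, elapsedTime: this.elapsedTime, status: this.status, failedCount: 1, sucssesCount: 1 }
  );
}
function reduceFn(key, values) {
  var reduced = { count: 0, elapsedTime: 0 };
  values.forEach(function (val) {
    reduced.elapsedTime += val.elapsedTime;
    reduced.count += val.count;
  });
  return reduced;
}
function finalizeFn(key, reduced) {
  reduced.avgElapsedTime = reduced.elapsedTime / reduced.count;
  return reduced;
}

function runLocally(docs) {
  emitted.length = 0;
  docs.forEach((doc) => mapFn.call(doc)); // map sees each document as `this`
  const groups = {};
  for (const { key, value } of emitted) {
    const id = JSON.stringify(key); // group emitted values by serialized key
    if (!groups[id]) groups[id] = { key, values: [] };
    groups[id].values.push(value);
  }
  return Object.values(groups).map(({ key, values }) => {
    // Assumption: a key with a single value skips reduce and goes to finalize.
    const reduced = values.length === 1 ? values[0] : reduceFn(key, values);
    return { _id: key, value: finalizeFn(key, reduced) };
  });
}

// Hypothetical documents sharing one (second_id, instanceId, name) key.
const sampleDocs = [
  { second_id: "2012-07-23T09:34:24Z", instanceId: -903099689, name: "kuku", elapsedTime: 17, status: 303 },
  { second_id: "2012-07-23T09:34:24Z", instanceId: -903099689, name: "kuku", elapsedTime: 23, status: 303 },
];
const results = runLocally(sampleDocs);
console.log(JSON.stringify(results, null, 2));
```

In this sketch the two documents collapse into one result with count 2, elapsedTime 40, and avgElapsedTime 20. The reduce function returns only count and elapsedTime, so status, failedCount, and sucssesCount do not survive for keys that go through reduce.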

Comment by Antoine Girbal [ 24/Jul/12 ]

Is this always reproducible?
Could you give us a sample document and the exact command you ran?
Thanks.
