[SERVER-1602] mongos crashes when multiple map reduce jobs run in parallel Created: 09/Aug/10  Updated: 07/Apr/23  Resolved: 08/Sep/10

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.6.0
Fix Version/s: 1.6.3, 1.7.1

Type: Bug
Priority: Major - P3
Reporter: Onur Cakmak
Assignee: Mathias Stearn
Resolution: Done
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Two identical servers with:

  • Ubuntu 9.10 64 bit (Linux vs8 2.6.31-22-server #60-Ubuntu SMP Thu May 27 03:42:09 UTC 2010 x86_64 GNU/Linux)
  • Running on a Dell 1950 with 24 GB RAM and dual Intel Xeon L5335 CPUs

It's a simple sharded setup. mongos and the config server run on the first server; the second server runs only a MongoDB instance.


Attachments: Text File 1.txt     Text File 2.txt     Zip Archive MongoDB-SERVER-1602-Example.zip    
Operating System: ALL
Participants:

 Description   

The mongos process crashes if we run multiple map reduce jobs in parallel. It works properly if those jobs are run serially.

The issue is not intermittent. mongos will crash every time we try to execute the jobs in parallel.

Our flow is:

  • Generate the selector/query A
  • Run 3 map reduce jobs with query A
  • Run a query with find().skip().limit().sort() with query A and sort by B
  • Run a count() query with query A

We run these steps in parallel with 5 threads.

Map reduce jobs are meant to generate a simple groupby/count resultset.

map fn: function() { emit(this.FIELD, {count: 1}); }; /* FIELD is: Make, Year, VehicleType for different jobs */
reduce fn: function(key, vals) { var t = 0; for (var i = 0; i < vals.length; i++) t += vals[i].count; return {count: t}; };
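For readers who want to see what result set these functions are meant to produce, here is a minimal local sketch in plain JavaScript. It does not talk to MongoDB and does not reproduce the crash. The runMapReduce helper and the sample documents are hypothetical, added only for illustration, and FIELD is assumed to be Make.

```javascript
// Module-level emit, so the map function can call it the way it would on the server.
let emit;

// map and reduce as given in the report, with FIELD assumed to be Make.
const map = function () { emit(this.Make, { count: 1 }); };
const reduce = function (key, vals) {
  var t = 0;
  for (var i = 0; i < vals.length; i++) t += vals[i].count;
  return { count: t };
};

// Hypothetical helper: groups emitted values by key, then reduces each group.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) mapFn.call(doc); // "this" is the current document
  const out = {};
  for (const [key, vals] of groups) {
    // A key with a single emitted value is passed through without calling reduce.
    out[key] = vals.length === 1 ? vals[0] : reduceFn(key, vals);
  }
  return out;
}

// Hypothetical sample data.
const sampleDocs = [{ Make: "Ford" }, { Make: "Audi" }, { Make: "Ford" }];
const result = runMapReduce(sampleDocs, map, reduce);
console.log(JSON.stringify(result)); // {"Ford":{"count":2},"Audi":{"count":1}}
```

Each key of the output is one distinct value of the grouped field, with the number of matching documents as its count.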



 Comments   
Comment by auto [ 18/Sep/10 ]

Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: fix Future SERVER-1602
http://github.com/mongodb/mongo/commit/268a8a39dfa9ddb0401c3ab82c01ff01d28cee86

Comment by Mathias Stearn [ 08/Sep/10 ]

I think I've found the issue. Please try again with tomorrow's (1.7.x) nightly and reopen if this still occurs.

Thanks for providing the assert and the backtrace. They made this much easier to find.

Comment by auto [ 08/Sep/10 ]

Author: {'login': 'RedBeard0531', 'name': 'Mathias Stearn', 'email': 'mathias@10gen.com'}

Message: get rid of last _grab SERVER-1602
http://github.com/mongodb/mongo/commit/323cb15d9a09ed8c238ee9e246455447cd923c24

Comment by Mike Richmond [ 03/Sep/10 ]

I am running into this same issue.

I am using the latest 1.6.2 release and connecting to a mongos process via nodejs using this driver: http://github.com/christkv/node-mongodb-native

My queries do not run in separate threads but instead I have 7 processes all executing map reduce queries at the same time. Each is running a map reduce query on a separate database.

Here are the map / reduce functions:
var map = function() { emit({ id: this.id }, 1); }
var reduce = function(k, vals) { var count = 0; vals.forEach(function(v) { count += v; }); return count; }
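As a side check, MongoDB may call reduce again on values that reduce itself returned, so a reduce function has to give the same answer when applied to partial results. The sketch below, in plain JavaScript with no MongoDB involved, suggests that the reduce above satisfies this. The key and the input arrays are hypothetical.

```javascript
// reduce from the comment above, unchanged in behavior.
var reduce = function (k, vals) {
  var count = 0;
  vals.forEach(function (v) { count += v; });
  return count;
};

// Hypothetical key and emitted values, split into two batches.
var key = { id: 7 };
var partialA = reduce(key, [1, 1, 1]);
var partialB = reduce(key, [1, 1]);

// Reducing the partial results should match reducing everything at once.
var merged = reduce(key, [partialA, partialB]);
var direct = reduce(key, [1, 1, 1, 1, 1]);
console.log(merged === direct); // true
```

This indicates the functions themselves are well formed, which is consistent with the crash being on the mongos side.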

I ran the mongos server with -vvvv and the only suspicious output I could find was this line immediately before the crash (was last line in log):
Thu Sep 2 20:08:14 Assertion failure _grab client/parallel.cpp 461

Also worth noting is that on the client side log I see this cryptic memory dump sometimes (but mongos does not always crash when this is output):

0x4fc751 0x509a2f 0x55f3fa 0x674390 0x2aaaaaccd2f7 0x2aaaab744e3d
/usr/local/mongo/bin/mongos(_ZN5mongo12sayDbContextEPKc+0xb1) [0x4fc751]
/usr/local/mongo/bin/mongos(_ZN5mongo8assertedEPKcS1_j+0x10f) [0x509a2f]
/usr/local/mongo/bin/mongos(_ZN5mongo6Future13commandThreadEv+0x12a) [0x55f3fa]
/usr/local/mongo/bin/mongos(thread_proxy+0x80) [0x674390]
/lib64/libpthread.so.0 [0x2aaaaaccd2f7]
/lib64/libc.so.6(clone+0x6d) [0x2aaaab744e3d]

Comment by Mathias Stearn [ 31/Aug/10 ]

Are you still having this issue?

Comment by Mathias Stearn [ 10/Aug/10 ]

I've only been able to reproduce using a V8 build which has known memory leaks. Could you try using a spidermonkey build of mongod?

Comment by Onur Cakmak [ 10/Aug/10 ]

I attached an example database population script and a multithreaded client in C#, along with a compiled binary in case you don't have a compiler.

It turns out that just 2 threads are enough to crash mongos.

Generated at Thu Feb 08 02:57:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.