[SERVER-13490] concurrent map/reduce jobs slowed down significantly on OS X
Created: 04/Apr/14  Updated: 06/Dec/22  Resolved: 01/Feb/16

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Thomas Rueckstiess
Assignee: Backlog - Storage Execution Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-13442 mapReduce nonAtomic output option Closed
Assigned Teams:
Storage Execution
Operating System: OS X
Participants:

Description

This ticket branches off from SERVER-13442.

We noticed a performance impact when multiple concurrent map/reduce jobs write their output to the same collection. The more jobs were running, the slower each of them became.

On SERVER-13442, rui.zhang determined that it was mostly a problem on Mac OS X.

A single map/reduce job in my test takes 9 seconds:

Thu Apr  3 11:22:40.114 [conn25] command test.$cmd command: { mapReduce: "docs", map: function () {
                       emit(this.office, this.age);
    ..., reduce: function (office, ages) {
                          return Array.sum(a..., out: { replace: "result", db: "test" } } ntoreturn:1 keyUpdates:0 numYields: 10008 locks(micros) W:2742 r:16146697 w:987 reslen:160 9068ms

But when running 4 identical map/reduce jobs concurrently, with the same output collection, each of them takes about 90 seconds:

Thu Apr  3 11:21:52.854 [conn22] command test.$cmd command: { mapReduce: "docs", map: function () {
                       emit(this.office, this.age);
    ..., reduce: function (office, ages) {
                          return Array.sum(a..., out: { replace: "result", db: "test" } } ntoreturn:1 keyUpdates:0 numYields: 10068 locks(micros) W:958 r:158963049 w:2542 reslen:160 91595ms
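
To put the two log lines side by side, here is a minimal sketch in plain JavaScript (run under Node.js, not the mongo shell) that pulls the read-lock time and total duration out of the tail of each line and computes the slowdown. The helper name parseLogTail is hypothetical, and the regular expressions only assume the field layout visible in the two lines quoted above.

```javascript
// Extract the read-lock time and total duration from the tail of a
// mongod command log line. Field layout assumed from the lines above.
function parseLogTail(line) {
    var readLock = /\br:(\d+)/.exec(line);   // "r:<micros>" inside locks(micros)
    var total = /(\d+)ms\s*$/.exec(line);    // trailing "<millis>ms"
    return {
        readLockSec: Number(readLock[1]) / 1e6,
        totalSec: Number(total[1]) / 1e3
    };
}

// Tails copied verbatim from the single-job and 4-job log lines.
var single = parseLogTail(
    "numYields: 10008 locks(micros) W:2742 r:16146697 w:987 reslen:160 9068ms");
var concurrent = parseLogTail(
    "numYields: 10068 locks(micros) W:958 r:158963049 w:2542 reslen:160 91595ms");

// Ratio of wall-clock durations and of read-lock times.
var slowdown = concurrent.totalSec / single.totalSec;
var readLockRatio = concurrent.readLockSec / single.readLockSec;

console.log("total: " + single.totalSec + "s vs " + concurrent.totalSec + "s");
console.log("slowdown: " + slowdown.toFixed(1) + "x");
console.log("read lock ratio: " + readLockRatio.toFixed(1) + "x");
```

Both ratios come out close to 10, even though only 4 jobs were running.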

My test ran on a MacBook Pro with an SSD and 16 GB RAM. The collection held 1 million documents of this schema, with 4 different offices:

{
    "_id" : ObjectId("533c4da2fe4dce47b0a093e8"),
    "age" : 24,
    "name" : "LqiU3l0hAM",
    "office" : "dublin"
}
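
The ticket does not say how the test data was generated. A rough sketch of one way to build a comparable collection follows. Everything in it beyond the field names is an assumption: only "dublin" appears in the ticket, so the other three office names are placeholders, and the age range, name alphabet, and batch size are invented for illustration. The insert loop only runs where a mongo shell `db` object exists.

```javascript
// Placeholder office list: only "dublin" is taken from the ticket.
var OFFICES = ["dublin", "office-b", "office-c", "office-d"];
var ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Random alphanumeric string, similar in shape to "LqiU3l0hAM".
function randomName(length) {
    var s = "";
    for (var i = 0; i < length; i++) {
        s += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
    }
    return s;
}

// One document with the ticket's fields; _id is left for the server to assign.
function makeDoc() {
    return {
        age: 18 + Math.floor(Math.random() * 50),  // assumed range 18..67
        name: randomName(10),
        office: OFFICES[Math.floor(Math.random() * OFFICES.length)]
    };
}

// Only in the mongo shell, where `db` is defined: insert 1 million
// documents into test "docs" in batches of 1000.
if (typeof db !== "undefined") {
    var batch = [];
    for (var n = 0; n < 1000000; n++) {
        batch.push(makeDoc());
        if (batch.length === 1000) {
            db.docs.insert(batch);
            batch = [];
        }
    }
}
```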

Map/Reduce functions:

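// Emit one (office, age) pair per document.
var mapFunction = function() {
    emit(this.office, this.age);
};

// Sum all ages emitted for an office.
var reduceFunction = function(office, ages) {
    return Array.sum(ages);
};

// Run the job; the "reduce" output action merges into test.result.
db.runCommand({mapReduce: 'docs', map: mapFunction, reduce: reduceFunction, out: {reduce: "result", db: "test"}});

printjson(db.result.find().toArray());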
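
For readers without a server at hand, the following is a small in-memory stand-in for what the job computes, written in plain JavaScript for Node.js. It is an illustration under assumptions, not how the server executes mapReduce: the global `emit` and the sample documents are made up here, `Array.sum` (a mongo shell helper) is replaced with a plain reduce, and unlike the server this sketch also calls the reduce function for keys with a single value, which gives the same result for a sum.

```javascript
// Collected values per key, filled by emit().
var groups = {};

// Stand-in for the emit() that the server provides to map functions.
function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
}

// Same map function as in the ticket.
var mapFunction = function() {
    emit(this.office, this.age);
};

// Same reduce logic, without the mongo shell's Array.sum helper.
var reduceFunction = function(office, ages) {
    return ages.reduce(function(a, b) { return a + b; }, 0);
};

// Made-up sample documents using the ticket's schema.
var sample = [
    { office: "dublin", age: 24 },
    { office: "dublin", age: 30 },
    { office: "other", age: 41 }
];

// Map phase: call the map function with each document bound to `this`.
sample.forEach(function(doc) { mapFunction.call(doc); });

// Reduce phase: one output document per key, shaped like mapReduce output.
var result = Object.keys(groups).map(function(key) {
    return { _id: key, value: reduceFunction(key, groups[key]) };
});

console.log(JSON.stringify(result));
// prints [{"_id":"dublin","value":54},{"_id":"other","value":41}]
```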

I saved this script as mr.js and started several instances concurrently from the shell with:

mongo mr.js & mongo mr.js & mongo mr.js ...


Generated at Thu Feb 08 03:31:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.