This ticket branches off from SERVER-13442.
We noticed a performance impact with multiple concurrent map/reduce jobs running and outputting into the same collection: the more jobs, the slower they got.
On SERVER-13442, rui.zhang determined that it was mostly a problem on Mac OS X.
A single map/reduce job in my test takes 9 seconds:
Thu Apr 3 11:22:40.114 [conn25] command test.$cmd command: { mapReduce: "docs", map: function () {
    emit(this.office, this.age);
..., reduce: function (office, ages) {
    return Array.sum(a..., out: { replace: "result", db: "test" } } ntoreturn:1 keyUpdates:0 numYields: 10008 locks(micros) W:2742 r:16146697 w:987 reslen:160 9068ms
But when running 4 identical map/reduce jobs concurrently, with the same output collection, each of them takes 90 seconds:
Thu Apr 3 11:21:52.854 [conn22] command test.$cmd command: { mapReduce: "docs", map: function () {
    emit(this.office, this.age);
..., reduce: function (office, ages) {
    return Array.sum(a..., out: { replace: "result", db: "test" } } ntoreturn:1 keyUpdates:0 numYields: 10068 locks(micros) W:958 r:158963049 w:2542 reslen:160 91595ms
My test was on a MacBook Pro with an SSD and 16GB of RAM, using 1 million documents of the following schema spread across 4 different offices (a sketch for generating similar test data follows the example document):
{
    "_id" : ObjectId("533c4da2fe4dce47b0a093e8"),
    "age" : 24,
    "name" : "LqiU3l0hAM",
    "office" : "dublin"
}
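For reproduction, a collection like this can be populated from the mongo shell with a simple loop along the following lines. This is only a sketch: the office names, age values, and random name generator below are my assumptions, not taken from the original test.

var offices = ["dublin", "london", "paris", "newyork"];    // 4 office values (names assumed)
for (var i = 0; i < 1000000; i++) {
    db.docs.insert({
        age: 20 + (i % 50),                                // arbitrary age spread
        name: Math.random().toString(36).slice(2, 12),     // random string stand-in for the name field
        office: offices[i % offices.length]                // spread documents evenly across the 4 offices
    });
}

Running this against the test database (e.g. mongo test populate.js, where populate.js is a hypothetical file name) yields a data set of the same shape as the one described above.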
Map/reduce script (mr.js):
var mapFunction = function() {
    emit(this.office, this.age);
};

var reduceFunction = function(office, ages) {
    return Array.sum(ages);
};

db.runCommand({mapReduce: 'docs', map: mapFunction, reduce: reduceFunction, out: {reduce: "result", db: "test"}})

printjson(db.result.find().toArray());
Then I just started several of these in parallel from the shell, with
mongo mr.js & mongo mr.js & mongo mr.js ...