[SERVER-4382] MR becomes very slow if it keeps reducing a very large object
Created: 29/Nov/11  Updated: 11/Jul/16  Resolved: 01/Dec/11

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 2.0.1
Fix Version/s: 2.1.0

Type: Improvement
Priority: Major - P3
Reporter: Antoine Girbal
Assignee: Antoine Girbal
Resolution: Done
Votes: 1
Labels: rn
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

If many values are emitted for the same key and the reduced object keeps growing, MR does not flush it to disk until it reaches a threshold size.
However, a large object becomes very slow for JS to handle, and it may use much more memory than we account for.
This triggers many reduce steps and potentially GC.
For example:

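// Emit every document's _id under its full_name, so a common name
// collects a very large number of values under a single key.
map = function() {
  emit(this.full_name, this._id);
}

// Build a set of ids, represented as an object whose keys are the ids.
// Values are either raw ids (strings) or sets returned by an earlier reduce.
reduce = function(k, vals) {
  var tmp = {};
  vals.forEach(function(i) {
    if (typeof(i) == 'string') {
      // Raw emitted id.
      tmp[i] = true;
    } else {
      // Partial result from a previous reduce: copy all of its keys.
      for (var z in i) tmp[z] = true;
    }
  });
  return tmp;
}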
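
To illustrate why the growing object gets expensive, here is a minimal sketch in plain JavaScript that mimics re-reducing in batches. The driver function simulateIncrementalReduce, the batch size, and the generated ids are all hypothetical and are not the server's implementation; the sketch only assumes that each reduce step receives the previous partial result together with a batch of newly emitted values.

```javascript
// Same reduce function as in the report.
var reduce = function (k, vals) {
  var tmp = {};
  vals.forEach(function (i) {
    if (typeof i == 'string') {
      tmp[i] = true;
    } else {
      for (var z in i) tmp[z] = true;
    }
  });
  return tmp;
};

// Hypothetical driver (NOT the server's code): feed emitted values in
// batches, passing the previous partial result back in each time.
function simulateIncrementalReduce(key, ids, batchSize) {
  var partial = null;
  var keysCopied = 0; // total keys re-copied out of partial results
  for (var start = 0; start < ids.length; start += batchSize) {
    var batch = ids.slice(start, start + batchSize);
    if (partial !== null) {
      keysCopied += Object.keys(partial).length;
      batch.unshift(partial);
    }
    partial = reduce(key, batch);
  }
  return { result: partial, keysCopied: keysCopied };
}

// Made-up ids, all emitted under one key.
var ids = [];
for (var n = 0; n < 1000; n++) ids.push('id' + n);

var out = simulateIncrementalReduce('Natalya', ids, 100);
console.log(Object.keys(out.result).length, out.keysCopied);
```

Under these assumptions, every reduce step copies the whole accumulated object again, so the total copying work grows roughly quadratically with the number of values emitted for the key.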

Run against a collection of 1 million documents like:

{
        "_id" : {__rand: "str", len: 20},
        "soc_id" : {__rand: "str", len: 10},
        "exp" : {__rand: "int", min: 0, max: 100000000},
        "full_name" : "Natalya",
        "last_entrance" : 1321935873,
        "score" : 5000
}
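
A rough sketch of how one such document could be generated in plain JavaScript. It assumes the __rand entries are placeholders for a random string of the given length and a random integer in the given range; the helpers randomString, randomInt, and makeDoc, as well as the alphabet, are hypothetical and not part of the original report.

```javascript
// Hypothetical helpers; the names and the alphabet are assumptions.
function randomString(len) {
  var alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
  var s = '';
  for (var i = 0; i < len; i++) {
    s += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return s;
}

function randomInt(min, max) {
  // Inclusive on both ends (an assumption about the template's semantics).
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Build one document shaped like the template in the report.
function makeDoc() {
  return {
    _id: randomString(20),
    soc_id: randomString(10),
    exp: randomInt(0, 100000000),
    full_name: 'Natalya', // constant, so every document maps to the same key
    last_entrance: 1321935873,
    score: 5000
  };
}

var sample = makeDoc();
console.log(sample);
```

Because full_name is constant, all documents emit under a single key, which is what makes the reduced object grow so large. Loading the collection would then be a loop that inserts makeDoc() results from the mongo shell.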



 Comments   
Comment by auto [ 30/Nov/11 ]

Author: agirbal (antoine@10gen.com)

Message: SERVER-4382: added timing test for MR
Branch: master
https://github.com/mongodb/mongo/commit/a1657326bf02c3eabd6d21f2a8815499efc6137e

Comment by auto [ 29/Nov/11 ]

Author: agirbal (antoine@10gen.com)

Message: SERVER-4382: MR becomes very slow if it keeps reducing a very large object
Branch: master
https://github.com/mongodb/mongo/commit/ecf21615b55880023fcae1f3cde6059b685899b9
