[SERVER-2333] mapreduce optimization: do not execute reduce on unique keys
Created: 05/Jan/11 | Updated: 12/Jul/16 | Resolved: 25/Jan/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 1.7.5 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Antoine Girbal | Assignee: | Antoine Girbal |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
By accident I had a wrong reduce method:
reduce = function(key, vals) { var sum = 0; for (var val in vals) { sum += val; } return sum; }
The rows in the collection actually have only 1 entry per this.ln. If output goes to a collection, the result is { "_id" : "zzucdarlws", "value" : "00" }. It looks like for inline output the reduce function is never called, whereas it is called once when output goes to a collection. |
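The "00" value is consistent with how for...in behaves on JavaScript arrays: it iterates over the index strings, not the elements, so a single reduce call on a one-element array computes 0 + "0". The sketch below reproduces this outside the server. The emitted value of 1 is an assumption (the ticket does not state it), and fixedReduce is a hypothetical corrected version added for comparison.

```javascript
// The reduce function from the report, unchanged (deliberately incorrect).
var reduce = function (key, vals) {
  var sum = 0;
  for (var val in vals) { // val is the array index as a string: "0", "1", ...
    sum += val;           // number + string concatenates instead of adding
  }
  return sum;
};

// Hypothetical corrected version: iterate over the elements by index.
var fixedReduce = function (key, vals) {
  var sum = 0;
  for (var i = 0; i < vals.length; i++) {
    sum += vals[i];
  }
  return sum;
};

// Assumption: the single emitted value is 1; the buggy result does not depend on it.
var single = reduce("zzucdarlws", [1]);
console.log(JSON.stringify(single));             // "00"
console.log(fixedReduce("zzucdarlws", [1]));     // 1
```

Because the buggy result only appears when reduce is actually invoked, it doubles as a marker for whether the server called reduce on a single-value key.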
| Comments |
| Comment by Brian Johnson [ 10/May/12 ] |
|
I filed https://jira.mongodb.org/browse/SERVER-5818 because I think this "fix" only applies to a very limited set of use cases. For instance, it would be a problem if you were summing values per key. If you only had a single key, you wouldn't get an aggregated count. |
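One way to read this concern: when reduce is skipped for a key with a single value, the emitted value is stored as-is, so a reduce that returns something of a different shape than what map emits gives mixed results across keys. The sketch below illustrates that reading only. Both countReduce and applyReduce are hypothetical names, and applyReduce merely imitates the skip; it is not server code.

```javascript
// Hypothetical reduce that counts occurrences and ignores the emitted values.
function countReduce(key, vals) {
  return vals.length;
}

// Hypothetical helper imitating the behaviour after this change:
// a lone value is returned untouched, without calling reduce.
function applyReduce(reduce, key, vals) {
  return vals.length === 1 ? vals[0] : reduce(key, vals);
}

const once = applyReduce(countReduce, "a", ["x"]);       // raw emitted value
const twice = applyReduce(countReduce, "b", ["x", "y"]); // an actual count
console.log(once, twice);
```

Under this reading, emitting a value of the same shape that reduce returns (for example, emitting 1 and summing) would avoid the mismatch.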
| Comment by auto [ 24/Jan/11 ] |
|
Author: agirbal (antoine@10gen.com) Message: |
| Comment by auto [ 24/Jan/11 ] |
|
Author: agirbal (antoine@10gen.com) Message: Added many comments all around mr.cpp
| Comment by Antoine Girbal [ 24/Jan/11 ] |
|
Fixed this by not applying reduce() when there is only 1 object for a key, even when output goes to a collection.
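The behaviour described here can be sketched as a small in-memory stand-in for the reduce phase. This is an illustration under stated assumptions, not the logic in mr.cpp: reducePhase, the sample pairs, and the call counter are all hypothetical.

```javascript
// Hypothetical in-memory stand-in for the reduce phase; not the code in mr.cpp.
function reducePhase(emitted, reduce) {
  // Group the emitted [key, value] pairs by key.
  const groups = new Map();
  for (const [key, value] of emitted) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  // Call reduce only for keys that have more than one value.
  const out = [];
  for (const [key, vals] of groups) {
    const value = vals.length === 1 ? vals[0] : reduce(key, vals);
    out.push({ _id: key, value: value });
  }
  return out;
}

// Summing reduce that also records how often it is invoked.
let reduceCalls = 0;
const sum = (key, vals) => {
  reduceCalls++;
  return vals.reduce((a, b) => a + b, 0);
};

const result = reducePhase([["a", 1], ["b", 2], ["b", 3]], sum);
console.log(result);      // "a" keeps its single value 1, "b" is reduced to 5
console.log(reduceCalls); // 1
```

Key "a" has a single value, so reduce is skipped for it and runs once in total, for key "b".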