[SERVER-22922] mapReduce return incorrect result Created: 02/Mar/16  Updated: 08/Mar/16  Resolved: 07/Mar/16

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 3.2.1
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Oded Shafran Assignee: Kelsey Schubert
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

I have the following code (in ruby using mongoid)

map = %Q{
  function() {
    emit(1000, {count: this.am});
  }
}

reduce = %Q{
  function(key, values) {
    var count = 0;
    values.forEach(function(value) {
      count++;
    });

    return {count: count};
  }
}

The problem is that count == 2, while I have a couple of hundred documents.



 Comments   
Comment by Oded Shafran [ 08/Mar/16 ]

Got it. thanks

Comment by Asya Kamsky [ 07/Mar/16 ]

I think you misunderstand.

reducedVal.count += countObjVals[idx].count;

works fine because it's incrementing "running count" by new value.

What your code does is

 count++; 

which is equivalent to always incrementing by 1, but the count you get in reduce may or may not be 1.
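
As a rough illustration of the difference, the sketch below runs both styles of reduce through a hypothetical two-pass reduction in plain JavaScript, outside MongoDB. The names reduceByIncrement, reduceBySum and twoPass, the 300 sample values, and the split into two batches are all assumptions made for the example; how the server actually batches values is not specified in this ticket.

```javascript
// Reduce from the report: counts how many values it was handed.
function reduceByIncrement(key, values) {
  var count = 0;
  values.forEach(function (value) {
    count++;
  });
  return { count: count };
}

// Summing reduce: adds up the count carried by each value.
function reduceBySum(key, values) {
  var count = 0;
  values.forEach(function (value) {
    count += value.count;
  });
  return { count: count };
}

// Hypothetical data: 300 mapped values, each standing for one document.
var emitted = [];
for (var i = 0; i < 300; i++) {
  emitted.push({ count: 1 });
}

// Hypothetical two-pass reduction: two batches are reduced separately,
// then the two partial results are reduced again.
function twoPass(reduce, values) {
  var half = Math.floor(values.length / 2);
  var partialA = reduce(1000, values.slice(0, half));
  var partialB = reduce(1000, values.slice(half));
  return reduce(1000, [partialA, partialB]);
}

var incrementResult = twoPass(reduceByIncrement, emitted); // { count: 2 }
var sumResult = twoPass(reduceBySum, emitted);             // { count: 300 }
```

In the final pass the incrementing version sees only two partial results and so reports 2, while the summing version recovers the full total.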

Comment by Oded Shafran [ 07/Mar/16 ]

Thanks. So you are saying that I always have to check whether the "field" from the emitter exists, and use that to tell whether I am being passed the result of a previous call.
Since the behavior was not consistent, I thought this was a bug.
Thanks for clarifying!

Comment by Ramon Fernandez Marina [ 07/Mar/16 ]

odedshafran, please take a look at SERVER-16045 for a similar case, hope that helps.

Comment by Oded Shafran [ 07/Mar/16 ]

Thank you @Thomas.
However, the examples on your site do say that, but when you loop over the VALUES with FOR, you treat each value as a single variable:

var reduceFunction2 = function(keySKU, countObjVals) {
    reducedVal = { count: 0, qty: 0 };

    for (var idx = 0; idx < countObjVals.length; idx++) {
        reducedVal.count += countObjVals[idx].count;
        reducedVal.qty += countObjVals[idx].qty;
    }

    return reducedVal;
};

The elements of countObjVals are never treated as arrays. How does that fit with what you've written?

Thank you

Comment by Kelsey Schubert [ 07/Mar/16 ]

Hi odedshafran,

The reduce function described does not meet the necessary requirements. In particular, please note the following behavior:

MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
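
One way to check a reduce function against this behavior before running it on the server is to feed its own output back into it in plain JavaScript. The sketch below is only an illustration: survivesReReduce is a hypothetical helper, reduceIncrement is a compact stand-in for the count++ loop in the report, and the sample values assume the map function emits {count: 1} per document rather than this.am.

```javascript
// Reduce that sums the count carried by each value, so a partial result
// fed back in is weighted by what it already represents.
function reduceCount(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i].count;
  }
  return { count: total };
}

// Compact stand-in for the count++ loop in the report: it returns the
// number of values received and ignores what each value carries.
function reduceIncrement(key, values) {
  return { count: values.length };
}

// Hypothetical helper: compares a direct reduction of three sample values
// with one where a partial result is fed back in, and with one where the
// whole result is reduced a second time.
function survivesReReduce(reduce, key, a, b, c) {
  var direct = reduce(key, [c, a, b]);
  var nested = reduce(key, [c, reduce(key, [a, b])]);
  var wrapped = reduce(key, [reduce(key, [c, a, b])]);
  return direct.count === nested.count && direct.count === wrapped.count;
}

var one = { count: 1 }; // assumed shape of each emitted value
var sumOk = survivesReReduce(reduceCount, 1000, one, one, one);           // true
var incrementOk = survivesReReduce(reduceIncrement, 1000, one, one, one); // false
```

A reduce function that passes this kind of check returns the same total no matter how its inputs are grouped; the incrementing version fails as soon as one of its inputs is a partial result.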

If you continue to run into issues with your reduce function, I would recommend posting on the mongodb-users group or Stack Overflow with the mongodb tag. A question like this involving more discussion would be best posted on the mongodb-users group.

Thank you,
Thomas

Comment by Oded Shafran [ 02/Mar/16 ]

This collection is sharded and in a replica set, if that makes any difference. On my local machine the problem is not reproducible.
