[SERVER-28321] In mapReduce map function emitting a document with 'undefined' or 'null' key in sharded collection fails
Created: 15/Mar/17  Updated: 17/Apr/17  Resolved: 24/Mar/17

Status: Closed
Project: Core Server
Component/s: MapReduce, Querying
Affects Version/s: 3.5.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Eddie Louie Assignee: Tess Avitabile (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-14324 MapReduce does not respect existing s... Closed
Related
is related to SERVER-26315 sharded_collections_jscore_passthroug... Closed
Operating System: ALL
Steps To Reproduce:

This test is modified from the mr_undef.js file.

t = db.mr_undef;
t.drop();
 
outname = "mr_undef_out";
out = db[outname];
out.drop();
 
t.insert({x: 0});
 
var m = function() {
    emit(undefined, this.x);
};
var r = function(k, v) {
    // Declare locals with var to avoid leaking implicit globals.
    var total = 0;
    for (var i in v) {
        total += v[i];
    }
    return total;
};
 
res = t.mapReduce(m, r,  {out: outname});
printjson(res);
printjson(out.find().toArray());
assert.eq(1, out.find({_id: {$type: 10}}).itcount(), "A2");


 Description   

This test fails with the code modifications to implicitly_shard_accessed_collections.js introduced by SERVER-26315. That code overrides the DBCollection.prototype.drop() function to re-shard collections after they are dropped in sharded cluster environments.
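
The override pattern described above can be sketched roughly as follows. This is not the code from implicitly_shard_accessed_collections.js, which is not shown in this ticket; wrapDrop and the reshard callback are hypothetical names, and the collection class is passed in as a parameter so the sketch does not depend on the shell. In the mongo shell, the class would presumably be DBCollection.

```javascript
// Hypothetical sketch: wrap a collection class's drop() so that a
// caller-supplied hook runs after every drop. The real override may differ.
function wrapDrop(CollectionClass, reshard) {
    var originalDrop = CollectionClass.prototype.drop;
    CollectionClass.prototype.drop = function() {
        // Run the original drop first and keep its return value.
        var result = originalDrop.apply(this, arguments);
        // Then let the hook re-shard the dropped collection's namespace.
        reshard(this);
        return result;
    };
}
```

Because the wrapper returns the original result, callers such as t.drop() in the test should see the same return value as before; only the side effect (the re-shard hook) is new.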

Note: This does not fail in non-sharded collections.

t = db.mr_undef;
t.drop();
 
outname = "mr_undef_out";
out = db[outname];
out.drop();
 
t.insert({x: 0});
 
var m = function() {
    emit(undefined, this.x);  // the key can be 'null' also
};
var r = function(k, v) {
    // Declare locals with var to avoid leaking implicit globals.
    var total = 0;
    for (var i in v) {
        total += v[i];
    }
    return total;
};
 
res = t.mapReduce(m, r,  {out: outname});
printjson(res);
printjson(out.find().toArray());
assert.eq(1, out.find({_id: {$type: 10}}).itcount(), "A2");

The assertion fails because the out collection contains no documents, even though shardCounts in the res object reports one output document.

[js_test:mr_undef] 2017-03-15T03:59:11.699-0400 {       
[js_test:mr_undef] 2017-03-15T03:59:11.699-0400         "result" : "mr_undef_out",
[js_test:mr_undef] 2017-03-15T03:59:11.699-0400         "counts" : {
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "input" : NumberLong(1),
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "emit" : NumberLong(1),
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "reduce" : NumberLong(0),
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "output" : NumberLong(0)
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400         },
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400         "timeMillis" : 51,
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400         "timing" : {
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "shardProcessing" : 24,
[js_test:mr_undef] 2017-03-15T03:59:11.700-0400                 "postProcessing" : 26
[js_test:mr_undef] 2017-03-15T03:59:11.701-0400         },
[js_test:mr_undef] 2017-03-15T03:59:11.701-0400         "shardCounts" : {   
[js_test:mr_undef] 2017-03-15T03:59:11.701-0400                 "huracan:20003" : {
[js_test:mr_undef] 2017-03-15T03:59:11.702-0400                         "input" : 1, 
[js_test:mr_undef] 2017-03-15T03:59:11.702-0400                         "emit" : 1,
[js_test:mr_undef] 2017-03-15T03:59:11.702-0400                         "reduce" : 0,
[js_test:mr_undef] 2017-03-15T03:59:11.702-0400                         "output" : 1     <---------------this suggests there was an output document.
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400                 }
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400         },
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400         "postProcessCounts" : { 
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400                 "huracan:20003" : {
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400                         "input" : NumberLong(0),
[js_test:mr_undef] 2017-03-15T03:59:11.703-0400                         "reduce" : NumberLong(0),
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400                         "output" : NumberLong(0)
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400                 }
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400         },
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400         "ok" : 1
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400 }
[js_test:mr_undef] 2017-03-15T03:59:11.704-0400 [ ]    <------- this is from the "printjson(out.find().toArray());" which shows no output document.
....
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400 assert: [1] != [0] are not equal : A2
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400 doassert@src/mongo/shell/assert.js:18:14
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400 assert.eq@src/mongo/shell/assert.js:54:5
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400 @jstests/core/mr_undef.js:26:1
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400
[js_test:mr_undef] 2017-03-15T03:59:11.707-0400 2017-03-15T03:59:11.707-0400 E QUERY    [thread1] Error: [1] != [0] are not equal : A2 :
[js_test:mr_undef] 2017-03-15T03:59:11.708-0400 doassert@src/mongo/shell/assert.js:18:14
[js_test:mr_undef] 2017-03-15T03:59:11.708-0400 assert.eq@src/mongo/shell/assert.js:54:5
[js_test:mr_undef] 2017-03-15T03:59:11.708-0400 @jstests/core/mr_undef.js:26:1
[js_test:mr_undef] 2017-03-15T03:59:11.708-0400 failed to load: jstests/core/mr_undef.js
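
The discrepancy visible in the log above (shardCounts reports an output document, while counts.output is 0) could be detected programmatically with a small helper such as the one below. This is only a sketch: outputCountMismatch and toNum are hypothetical names, and the code assumes that shell NumberLong values expose a toNumber() method and that res has the shape printed above.

```javascript
// Convert a count to a plain number. Shell NumberLong values are assumed to
// expose toNumber(); anything else goes through Number().
function toNum(x) {
    if (x !== null && typeof x === "object" && typeof x.toNumber === "function") {
        return x.toNumber();
    }
    return Number(x);
}

// Hypothetical helper: compare the summed per-shard "output" counts with the
// top-level counts.output of a mapReduce result document.
function outputCountMismatch(res) {
    var shardTotal = 0;
    Object.keys(res.shardCounts || {}).forEach(function(shard) {
        shardTotal += toNum(res.shardCounts[shard].output);
    });
    var finalTotal = toNum(res.counts.output);
    return {
        shardTotal: shardTotal,
        finalTotal: finalTotal,
        mismatch: shardTotal !== finalTotal
    };
}
```

For the result printed above, the helper would report shardTotal 1, finalTotal 0, and mismatch true. A mismatch is only a hint that post-processing lost documents; the two totals can legitimately differ when a final reduce merges keys emitted on several shards.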



 Comments   
Comment by Eddie Louie [ 24/Mar/17 ]

Done. Comments changed.

Comment by Tess Avitabile (Inactive) [ 24/Mar/17 ]

I would recommend changing the comment, however.

Comment by Eddie Louie [ 24/Mar/17 ]

Actually, I forgot I've already blacklisted this file from the sharded_collections_jscore_passthrough suite. So no work needs to be done.

Comment by Tess Avitabile (Inactive) [ 24/Mar/17 ]

Great, I'll close this as a duplicate of SERVER-14324.

Comment by Eddie Louie [ 24/Mar/17 ]

Thanks tess.avitabile. I've opened TIG-514 to blacklist the mr_undef.js test. I'll mark that ticket as related on SERVER-14324.

Comment by Tess Avitabile (Inactive) [ 24/Mar/17 ]

It looks like mr_undef.js fails under sharded_collections_jscore_passthrough because this passthrough shards the out collection by {_id: "hashed"}. mapReduce is known to silently behave incorrectly when the output collection is sharded on a key other than {_id: 1} (SERVER-14324). This occurs even when the _id of emitted documents is not undefined or null. I think it probably makes sense for this test to remain blacklisted under this suite, since we don't expect it to behave correctly.
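
As an illustration of the condition described in the comment above, a test harness could guard against unsupported output shard keys with a check along these lines. isSupportedOutShardKey is a hypothetical helper, not part of the server or the test suites; it simply encodes the statement that only {_id: 1} is expected to work (see SERVER-14324 for the authoritative details).

```javascript
// Hypothetical guard: true only for the exact shard key {_id: 1}.
// Hashed keys such as {_id: "hashed"} and compound keys are rejected.
function isSupportedOutShardKey(key) {
    var fields = Object.keys(key);
    return fields.length === 1 && fields[0] === "_id" && key._id === 1;
}
```

Under this check, the passthrough's {_id: "hashed"} key would be flagged as unsupported, which matches the decision to keep the test blacklisted in that suite.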
