[SERVER-16713] mapReduce verify() failure with multiple null keys: all.size() == 1 Created: 04/Jan/15  Updated: 06/Dec/22  Resolved: 09/Mar/20

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 2.2.7, 2.8.0-rc4
Fix Version/s: 4.4.0

Type: Bug
Priority: Major - P3
Reporter: Kamran K.
Assignee: Backlog - Query Team (Inactive)
Resolution: Done
Votes: 0
Labels: 28qa, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-2861 MapReduce returns undefined _id Closed
Assigned Teams:
Query
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

var t = db.foo;
t.drop();
 
t.insert({a: null});
t.insert({b: 1});
t.insert({b: 1}); // doesn't repro without this extra insert
 
var mapper = function() {
    emit(this.a, 1);
};
 
var reducer = function(k, v) {
    return Array.sum(v);
};
 
t.mapReduce(mapper, reducer, {out: {inline: true}});

 Description   

I was able to reproduce this bug with master and 2.2.7 (but didn't test further back than that).

Assertion failure all.size() == 1 src/mongo/db/commands/mr.cpp 512
 
#0  0x00007ffff7bcc20b in raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00000000018a6003 in mongo::mongo_breakpoint () at src/mongo/util/debug_util.cpp:58
#2  0x000000000189c7f8 in mongo::breakpoint () at src/mongo/util/debug_util.h:73
#3  0x000000000189b454 in mongo::verifyFailed (expr=0x1f69337 "all.size() == 1", file=0x1f68f8a "src/mongo/db/commands/mr.cpp", line=512) at src/mongo/util/assert_util.cpp:134
#4  0x00000000012bcf83 in mongo::mr::State::appendResults (this=0x7ffff7fca1c0, final=...) at src/mongo/db/commands/mr.cpp:512
#5  0x00000000012c3910 in mongo::mr::MapReduceCommand::run (this=0x28396c0 <mongo::mr::mapReduceCommand>, txn=0x7ffff7fcb7b0, dbname=..., cmd=..., errmsg=..., result=..., fromRepl=false)
    at src/mongo/db/commands/mr.cpp:1465
#6  0x0000000001323415 in mongo::_execCommand (txn=0x7ffff7fcb7b0, c=0x28396c0 <mongo::mr::mapReduceCommand>, dbname=..., cmdObj=..., queryOptions=0, errmsg=..., result=..., fromRepl=false)
    at src/mongo/db/dbcommands.cpp:1256
#7  0x0000000001324392 in mongo::Command::execCommand (txn=0x7ffff7fcb7b0, c=0x28396c0 <mongo::mr::mapReduceCommand>, queryOptions=0, cmdns=0x33b6414 "test.$cmd", cmdObj=..., result=..., fromRepl=false)
    at src/mongo/db/dbcommands.cpp:1472
#8  0x0000000001324c74 in mongo::_runCommands (txn=0x7ffff7fcb7b0, ns=0x33b6414 "test.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1547
#9  0x00000000015272b4 in mongo::runCommands (txn=0x7ffff7fcb7b0, ns=0x33b6414 "test.$cmd", jsobj=..., curop=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/query/find.cpp:131
#10 0x0000000001529040 in mongo::runQuery (txn=0x7ffff7fcb7b0, m=..., q=..., curop=..., result=..., fromDBDirectClient=false) at src/mongo/db/query/find.cpp:565
#11 0x000000000142f0ed in mongo::receivedQuery (txn=0x7ffff7fcb7b0, c=..., dbresponse=..., m=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:224
#12 0x00000000014301ff in mongo::assembleResponse (txn=0x7ffff7fcb7b0, m=..., dbresponse=..., remote=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:394
#13 0x000000000113310a in mongo::MyMessageHandler::process (this=0x307c1e8, m=..., port=0x30ab5c0, le=0x33c01e0) at src/mongo/db/db.cpp:195
#14 0x00000000018c3949 in mongo::PortMessageServer::handleIncomingMsg (arg=0x30942b0) at src/mongo/util/net/message_server_port.cpp:234
#15 0x00007ffff7bc4182 in start_thread (arg=0x7ffff7fcc700) at pthread_create.c:312
#16 0x00007ffff6cc4efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111



 Comments   
Comment by Charlie Swanson [ 09/Mar/20 ]

I have verified this is fixed on master after our recent project to reimplement mapReduce on top of the aggregation framework.

Comment by Charlie Swanson [ 06/Jan/15 ]

So I've tracked down this issue. It's because of the different ways we treat null and undefined. Here's what happens:
1. During the map phase, the {a: null} doc emits null as the key and 1 as the value, which is internally represented as a BSONObj {"0": null, "1": 1.0}. The {b: 1} docs emit undefined as the key, which results in the BSONObj {"": null, "1": 1.0} (note "" instead of "0" for the first key). This behavior seems to have been purposefully introduced in SERVER-2861.
2. During the reduce phase, we combine the two {"": null, "1": 1.0} objects into one {"0": null, "1": 2.0} object. Note the 0 is back (see code for how here). Then both this doc and the {"0": null, "1": 1.0} from the {a: null} doc are added to a map, resulting in {"0": null} mapping to [ {"0": null, "1": 1.0}, {"0": null, "1": 2.0}].
3. When the results are converted for inline output (since they are not going into another collection), there is an assert that each key maps to only one value, since no keys other than undefined ones would change during the reduce stage. If writing to a collection, it looks like there is an extra reduce step during the write, which resolves this discrepancy.

It looks like if we change this line to append the null with a "0" instead of a "", the early distinction of null and undefined will go away. I don't know enough about the map reduce code/desired behavior to know if this is a good idea though. Anyone care to weigh in?

Edit: I changed the above-mentioned line to use "0" instead of the empty string and ran the tests (unit, db, and jstest core), and everything still works. I manually confirmed that it fixes this case, and I am going to work on adding a regression test for it.
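
The three steps above, and the effect of the proposed one-line change, can be modelled roughly in plain JavaScript outside the server. This is only a sketch: the tuple layout ({keyName, key, value}), the grouping on field name plus value, and the names emitTuple and largestFinalGroup are assumptions made for illustration and are not taken from mr.cpp. It also assumes that reduce is skipped for a key with a single value, which would be consistent with the case not reproducing without the extra insert.

```javascript
// Simplified model of the behavior described in the steps above.
// NOT the server code: all names and the tuple layout are hypothetical.

// Model one emitted pair. keyName stands in for the BSON field name of the
// key ("0" normally; undefinedKeyName when the emitted key is undefined).
function emitTuple(key, value, undefinedKeyName) {
    if (key === undefined) {
        // undefined is stored as null, but under a different field name
        return {keyName: undefinedKeyName, key: null, value: value};
    }
    return {keyName: "0", key: key, value: value};
}

// Group tuples by field name plus key value (assumed comparison).
function groupTuples(tuples) {
    const groups = new Map();
    for (const t of tuples) {
        const id = JSON.stringify([t.keyName, t.key]);
        if (!groups.has(id)) groups.set(id, []);
        groups.get(id).push(t);
    }
    return groups;
}

// Returns the largest number of tuples that share one key after the reduce
// phase. The inline output path expects this to be 1 (all.size() == 1).
function largestFinalGroup(docs, undefinedKeyName) {
    // Step 1: map phase, equivalent to emit(this.a, 1)
    const emitted = docs.map(d => emitTuple(d.a, 1, undefinedKeyName));

    // Step 2: reduce phase. Single-value groups pass through untouched
    // (assumption); reduced groups are rebuilt with the key named "0".
    const reduced = [];
    for (const group of groupTuples(emitted).values()) {
        if (group.length === 1) {
            reduced.push(group[0]);
        } else {
            const sum = group.reduce((acc, t) => acc + t.value, 0);
            reduced.push({keyName: "0", key: group[0].key, value: sum});
        }
    }

    // Step 3: regroup for output and report the largest group size.
    let largest = 0;
    for (const group of groupTuples(reduced).values()) {
        largest = Math.max(largest, group.length);
    }
    return largest;
}

const docs = [{a: null}, {b: 1}, {b: 1}];
console.log(largestFinalGroup(docs, ""));   // 2: the verify would fire
console.log(largestFinalGroup(docs, "0"));  // 1: null and undefined merge early
```

In this model, dropping the second {b: 1} document also yields 1 with the "" field name, because nothing is reduced and so no key is renamed.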
