[SERVER-1694] Corruption during mapreduce on documents with arrays of binary data. Created: 27/Aug/10  Updated: 12/Jul/16  Resolved: 27/Aug/10

Status: Closed
Project: Core Server
Component/s: Internal Client, JavaScript, Shell
Affects Version/s: 1.6.1
Fix Version/s: 1.7.0

Type: Bug Priority: Critical - P2
Reporter: gvs Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

cdc$ ./mongo --version
MongoDB shell version: 1.6.1

cdc$ ./mongod --version
db version v1.6.1, pdfile version 4.5
Fri Aug 27 13:26:03 git version: c5f5f9a4f3b515dfd5272d373093fd4fd58c95d9

cdc$ ./mongos --version
Fri Aug 27 13:26:25 ./mongos db version v1.6.1, pdfile version 4.5 starting (--help for usage)
Fri Aug 27 13:26:25 git version: c5f5f9a4f3b515dfd5272d373093fd4fd58c95d9
Fri Aug 27 13:26:25 sys info: Linux cdc 2.6.34-rc6-cdc-00063-gbe1066b #32 SMP Sat May 1 02:07:17 EDT 2010 x86_64 BOOST_LIB_VERSION=1_42

cdc$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.4-7' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.4 (Debian 4.4.4-7)

We're running two shards (without replica sets), a config server and a mongos. The corruption occurs on a non-sharded db.


Attachments: File mr.cpp    
Operating System: Linux
Participants:

 Description   

Running mapreduce on an input collection that only has 2 different keys (0x00010203 and 0xaabbccdd) results in random keys in the output collection:

cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" : { "input" : 20, "emit" : 40, "output" : 5 }, "ok" : 1 }
key: 00102030
key: 16000000
key: 337f0000
key: 80ad142c
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" : { "input" : 20, "emit" : 40, "output" : 6 }, "ok" : 1 }
key: 00000000
key: 00102030
key: 16000000
key: 337f0000
key: 80bc0f2c
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" : { "input" : 20, "emit" : 40, "output" : 6 }, "ok" : 1 }
key: 00102030
key: 16000000
key: 337f0000
key: 7071122c
key: 803c122c
key: aabbccdd
cdc$

Steps to reproduce:

1) insert a few documents with arrays of binary data
2) run mapreduce that emits bindata from said arrays as keys
3) notice more keys in output collection than exist in input collection, some are correct, some are random
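
Step 3 can be checked mechanically. The following is a minimal sketch, not part of mr.cpp: it assumes the output keys have already been read back as hex strings, and the function name unexpected_keys is made up for illustration. It returns every observed key that is not in the expected set.

```cpp
#include <set>
#include <string>
#include <vector>

// Return the observed keys that are missing from the expected set,
// preserving the order in which they were observed.
// (Hypothetical helper for illustration; not taken from mr.cpp.)
std::vector<std::string> unexpected_keys(const std::set<std::string>& expected,
                                         const std::vector<std::string>& observed) {
    std::vector<std::string> extra;
    for (const std::string& key : observed) {
        if (expected.count(key) == 0) {
            extra.push_back(key);
        }
    }
    return extra;
}
```

Fed the keys from the first run above, with 00102030 and aabbccdd as the expected set, it would report 16000000, 337f0000 and 80ad142c.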

I'm quite sure the corruption happens during the mapreduce operation; the bindata arrays are stored correctly in the db itself. However, the mongo js shell has some trouble interpreting them consistently:

> db.in.findOne({i : 0}, {k : 1})
{
"_id" : ObjectId("4c77f6c1ec4e2dee21a74d9a"),
"k" : [
BinData(0,"/38AAA=="),
BinData(0,"YLzjAQ==")
]
}
> db.in.findOne({i : 0}, {k : 1})
{
"_id" : ObjectId("4c77f6c1ec4e2dee21a74d9a"),
"k" : [
BinData(0,"AAAAAA=="),
BinData(0,"awDM3Q==")
]
}

Again, when accessing the k array via the mongoclient library, the data is NOT corrupted, so this bug in the js client could be completely unrelated.
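
The shell prints BinData payloads as base64, which makes them hard to compare with the hex keys printed by ./mr. As a rough aid, here is a small standalone decoder sketch; the function names are made up for illustration and nothing here comes from the mongo sources.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Map one base64 character to its 6-bit value, or -1 if it is not
// part of the standard alphabet (this includes the '=' padding).
static int base64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decode standard base64 text into raw bytes.
// Decoding stops at the first padding or invalid character.
std::vector<std::uint8_t> base64_decode(const std::string& text) {
    std::vector<std::uint8_t> bytes;
    unsigned int buffer = 0;  // holds the bits not yet emitted
    int bits = 0;             // number of valid bits in buffer
    for (char c : text) {
        int value = base64_value(c);
        if (value < 0) break;
        buffer = (buffer << 6) | static_cast<unsigned int>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            bytes.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xFFu));
            buffer &= (1u << bits) - 1u;  // keep only the leftover bits
        }
    }
    return bytes;
}
```

Applied to the shell output above, the first call decodes to ff7f0000 and 60bce301, the second to 00000000 and 6b00ccdd. Neither pair matches the keys 00102030 and aabbccdd that appear in every ./mr run.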

I've attached mr.cpp that reproduces the behaviour.



 Comments   
Comment by auto [ 15/Sep/10 ]

Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: have to copy BinData in sm in case BSONObj is temp SERVER-1694
http://github.com/mongodb/mongo/commit/ba964b908b2180d7a0dae7286ea9a8c6a68b5276

Comment by auto [ 27/Aug/10 ]

Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: have to copy BinData in sm in case BSONObj is temp SERVER-1694
http://github.com/mongodb/mongo/commit/449fbabe26b62250b4974aaec2633cb591e36f28
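
The commit message suggests that the JavaScript layer ("sm", presumably SpiderMonkey) was holding on to BinData bytes owned by a temporary BSONObj. The patch itself is not reproduced here. The sketch below only illustrates the general idea of copying the bytes instead of borrowing a pointer; the class name OwnedBinData is hypothetical and does not exist in the mongo sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for binary data handed to a scripting engine.
// It copies the bytes on construction, so its contents stay valid
// after the source buffer is overwritten or freed.
class OwnedBinData {
public:
    OwnedBinData(const std::uint8_t* data, std::size_t length)
        : bytes_(data, data + length) {}

    // Read-only access to the private copy.
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};
```

A holder that stored only the raw pointer would read whatever happens to occupy that memory later, which would be consistent with the random-looking keys in the report.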

Comment by Eliot Horowitz (Inactive) [ 27/Aug/10 ]

Ok - found the issue.
Thanks for the good report.

Comment by gvs [ 27/Aug/10 ]

It's happening a lot less with the precompiled binaries than with the git sources though. Should we try compiling with a different JS engine (like v8)?

Comment by Eliot Horowitz (Inactive) [ 27/Aug/10 ]

Ok - about 1/25 times I'm getting a weird key.

Comment by gvs [ 27/Aug/10 ]

Have you tried running it multiple times?

I downloaded http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-1.6.1.tgz and used those binaries. I'm still linking with the 1.6.1 client library from git though:

cdc$ ./mr
{ "result" : "out", "timeMillis" : 5, "counts" : { "input" : 20, "emit" : 40, "output" : 3 }, "ok" : 1 }
key: 00102030
key: aabbccdd
key: b97f0000
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 3 }, "ok" : 1 }
key: 00102030
key: aabbccdd
key: b97f0000
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ pwd
/home/gvs/ziac/cluster
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 11, "counts" : { "input" : 20, "emit" : 40, "output" : 3 }, "ok" : 1 }
key: 00102030
key: aabbccdd
key: b97f0000
cdc$ ./mr
{ "result" : "out", "timeMillis" : 12, "counts" : { "input" : 20, "emit" : 40, "output" : 3 }, "ok" : 1 }
key: 00102030
key: aabbccdd
key: b97f0000
cdc$

Comment by Eliot Horowitz (Inactive) [ 27/Aug/10 ]

Running your program I get:

erh@erh-tm1 ~/work/mongo -> g++ -I. mr.cpp -L. -lmongoclient -lpthread -lstdc++ -lboost_system-mt -lboost_thread-mt -lboost_filesystem-mt -lboost_program_options-mt && ./a.out
{ "result" : "out", "timeMillis" : 5, "counts" : { "input" : 20, "emit" : 40, "output" : 2 }, "ok" : 1 }
key: 00102030 190
key: aabbccdd 190

after changing the print line to:

printf("key: %02x%02x%02x%02x %d\n",data[0],data[1],data[2],data[3],o["value"].numberInt());
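
For reference, the hex formatting in that print line can be tried without a running server. This is a sketch under assumptions: the helper name format_key is made up, and it assumes the key is at least four bytes long. Treating the bytes as unsigned char matters on platforms where plain char is signed, since a byte such as 0xaa would otherwise be sign-extended when passed to a printf-style function.

```cpp
#include <cstdio>
#include <string>

// Format the first four bytes of a key as eight lowercase hex digits.
// The caller must pass a buffer of at least four bytes.
// (Hypothetical helper for illustration; not taken from mr.cpp.)
std::string format_key(const unsigned char* data) {
    char text[9];  // 8 hex digits plus the terminating NUL
    std::snprintf(text, sizeof(text), "%02x%02x%02x%02x",
                  static_cast<unsigned int>(data[0]),
                  static_cast<unsigned int>(data[1]),
                  static_cast<unsigned int>(data[2]),
                  static_cast<unsigned int>(data[3]));
    return std::string(text);
}
```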

Can you try this with the official downloads rather than self-compiled binaries? It's likely a spidermonkey compilation issue.
