- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 1.6.1
- Component/s: Internal Client, JavaScript, Shell
- None
- Environment:
cdc$ ./mongo --version
MongoDB shell version: 1.6.1
cdc$ ./mongod --version
db version v1.6.1, pdfile version 4.5
Fri Aug 27 13:26:03 git version: c5f5f9a4f3b515dfd5272d373093fd4fd58c95d9
cdc$ ./mongos --version
Fri Aug 27 13:26:25 ./mongos db version v1.6.1, pdfile version 4.5 starting (--help for usage)
Fri Aug 27 13:26:25 git version: c5f5f9a4f3b515dfd5272d373093fd4fd58c95d9
Fri Aug 27 13:26:25 sys info: Linux cdc 2.6.34-rc6-cdc-00063-gbe1066b #32 SMP Sat May 1 02:07:17 EDT 2010 x86_64 BOOST_LIB_VERSION=1_42
cdc$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.4-7' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.4 (Debian 4.4.4-7)
We're running two shards (without replica sets), a config server and a mongos. The corruption occurs on a non-sharded db.
- Linux
Running mapreduce on an input collection that only has 2 different keys (0x00010203 and 0xaabbccdd) results in random keys in the output collection:
cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" :
, "ok" : 1 }
key: 00102030
key: 16000000
key: 337f0000
key: 80ad142c
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" :
, "ok" : 1 }
key: 00000000
key: 00102030
key: 16000000
key: 337f0000
key: 80bc0f2c
key: aabbccdd
cdc$ ./mr
{ "result" : "out", "timeMillis" : 9, "counts" :
, "ok" : 1 }
key: 00102030
key: 16000000
key: 337f0000
key: 7071122c
key: 803c122c
key: aabbccdd
cdc$
Steps to reproduce:
1) insert a few documents with arrays of binary data
2) run mapreduce that emits bindata from said arrays as keys
3) notice more keys in the output collection than exist in the input collection; some are correct, some are random
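For comparison, the expected result of these steps can be sketched without a server. This is only an in-memory emulation in plain JavaScript (Node.js), not the attached mr.cpp and not MongoDB's mapreduce implementation. The document shape (`i`, `k`) follows the shell output in this ticket, the two key values are the ones named in the description, hex strings stand in for BinData, and `mapReduce` is a hypothetical helper that passes `emit` as a parameter instead of providing it as a global.

```javascript
// Hypothetical in-memory stand-in for the input collection: each document
// carries an array `k` of 4-byte keys, written as hex strings instead of BinData.
const docs = [
  { i: 0, k: ["00010203", "aabbccdd"] },
  { i: 1, k: ["aabbccdd", "00010203"] },
  { i: 2, k: ["00010203"] },
];

// Minimal map/reduce emulation: map emits (key, value) pairs,
// reduce folds all values emitted for the same key.
function mapReduce(input, map, reduce) {
  const groups = new Map();
  for (const doc of input) {
    map.call(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  return [...groups.entries()]
    .map(([key, values]) => ({ _id: key, value: reduce(key, values) }))
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}

// Emit every element of the `k` array as a key, counting occurrences.
function map(emit) {
  for (const key of this.k) emit(key, 1);
}
const reduce = (key, values) => values.reduce((a, b) => a + b, 0);

const out = mapReduce(docs, map, reduce);
for (const row of out) console.log("key: " + row._id);
```

With only two distinct keys in the input, a correct run should list exactly two keys, however many documents or array elements there are.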
I'm quite sure the corruption happens during the mapreduce operation; the bindata arrays are stored correctly in the db itself. However, the mongo js shell has some trouble displaying them consistently:
> db.in.findOne(
{i : 0},
{k : 1})
{
"_id" : ObjectId("4c77f6c1ec4e2dee21a74d9a"),
"k" : [
BinData(0,"/38AAA=="),
BinData(0,"YLzjAQ==")
]
}
> db.in.findOne(
,
{k : 1})
{
"_id" : ObjectId("4c77f6c1ec4e2dee21a74d9a"),
"k" : [
BinData(0,"AAAAAA=="),
BinData(0,"awDM3Q==")
]
}
Again, when accessing the k array via the mongoclient library, the data is NOT corrupted. This bug in the js client could be completely unrelated.
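To compare the shell output with the hex keys printed by ./mr, the base64 payloads of the BinData values can be decoded to hex. A minimal sketch, assuming Node.js and its built-in `Buffer`; `b64ToHex` is a hypothetical helper name, not part of the mongo shell:

```javascript
// Decode a base64 BinData payload (as printed by the shell) into lowercase hex.
function b64ToHex(b64) {
  return Buffer.from(b64, "base64").toString("hex");
}

// Payloads copied from the two findOne results above.
const shown = ["/38AAA==", "YLzjAQ==", "AAAAAA==", "awDM3Q=="];
for (const b64 of shown) {
  console.log(b64 + " -> " + b64ToHex(b64));
}
// /38AAA== -> ff7f0000
// YLzjAQ== -> 60bce301
// AAAAAA== -> 00000000
// awDM3Q== -> 6b00ccdd
```

None of the four decoded values equals either of the two keys named in the description, which fits the observation that the shell shows different contents for the same document on each call.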
I've attached mr.cpp that reproduces the behaviour.