- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 2.6.1
- Component/s: MapReduce
- Operating System: ALL
There is an issue when using mapReduce with sharded output in merge mode.
If the output collection has more than one chunk and some of the map-reduce results have a key that is already stored in that collection, the mapReduce fails, stating:
"exception: insertDocument :: caused by :: 11000 E11000 duplicate key error index"
At first I thought it might be because I was using the same collection as both input and output, but it also happens when using different collections.
This doesn't happen if the output collection is unsharded or if it only has one chunk.
The mapReduce was executed through the mongo shell and also through PyMongo, with the same behavior.
This bug might not appear the first time you execute a mapReduce on a collection that already stores the keys, but after several executions that make the output collection grow and split into more chunks, the bug shows up.
I haven't tried what happens when the input collection is not sharded.
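For reference, a minimal PyMongo sketch of the scenario described above. The database and collection names (test.events, test.mr_out) and the map/reduce functions are hypothetical, and it assumes PyMongo 2.x/3.x (where Collection.map_reduce is still available) connected to a mongos of the sharded cluster:

# Minimal reproduction sketch; names and JS functions are assumptions,
# not taken from the original report.
from bson.code import Code
from bson.son import SON
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # mongos of the sharded cluster
db = client.test

mapper = Code("function () { emit(this.key, this.value); }")
reducer = Code("function (key, values) { return Array.sum(values); }")

# Merge-mode output into a sharded collection that already contains some of
# the emitted keys; with more than one chunk in mr_out this is where the
# "E11000 duplicate key error" was observed.
db.events.map_reduce(
    mapper,
    reducer,
    out=SON([("merge", "mr_out"), ("sharded", True)]),
)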
- duplicates: SERVER-7926 Map Reduce with sharded output can apply reduce on duplicate documents if a migration happened (Closed)
- is related to: SERVER-15024 mapReduce output to sharded collection leaves orphans and then uses them in subsequent map/reduce (Closed)