[SERVER-3529] Sharded map reduce using merge stalls recreating indexes on the output collection. Created: 03/Aug/11  Updated: 12/Jul/16  Resolved: 30/Aug/11

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: 1.8.2
Fix Version/s: 1.8.4

Type: Bug Priority: Major - P3
Reporter: Bernie Hackett Assignee: Antoine Girbal
Resolution: Done Votes: 0
Labels: mapreduce
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Linux x86_64

Attachments: File SERVER-3529.diff     File mr_test.py    
Operating System: Linux
Participants:

 Description   

Steps to reproduce:

  • Create a sharded cluster of two replica sets.
  • Shard a database called 'delicious'.
  • Shard a collection called 'links', using 'author' as the shard key.
  • Load the data from this link into the sharded collection:
    http://www.infochimps.com/link_frame?dataset=13364
  • Run the attached python script twice.
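
The attached mr_test.py is not reproduced in this ticket, so the following is only a minimal sketch of what the steps above might look like from Python. The map and reduce functions (counting links per author), the connection URI, and the helper names are assumptions made for illustration; only the database, collection, shard key, and the 'results' output collection come from the report. It also assumes the server accepts the JavaScript functions as plain strings.

```python
from collections import OrderedDict

# Hypothetical map/reduce functions (count links per author).
# The real functions live in the attached mr_test.py, which is not shown here.
MAP_JS = "function () { emit(this.author, 1); }"
REDUCE_JS = "function (key, values) { return Array.sum(values); }"


def sharding_commands(db_name="delicious", coll_name="links", key="author"):
    """One-time admin commands: enable sharding, then shard the collection."""
    return [
        OrderedDict([("enableSharding", db_name)]),
        OrderedDict([
            ("shardCollection", "%s.%s" % (db_name, coll_name)),
            ("key", {key: 1}),
        ]),
    ]


def merge_mapreduce_command(coll_name="links", out_name="results"):
    """mapReduce command that merges into an existing output collection."""
    return OrderedDict([
        ("mapReduce", coll_name),      # the command name must be the first key
        ("map", MAP_JS),
        ("reduce", REDUCE_JS),
        ("out", {"merge": out_name}),  # merge mode is what the report exercises
    ])


def run_twice(uri="mongodb://localhost:27017"):
    """Send the command twice through a mongos. Needs pymongo and a live cluster."""
    from pymongo import MongoClient  # lazy import: the builders work without pymongo
    db = MongoClient(uri)["delicious"]
    for attempt in (1, 2):  # the stall was reported on the second run
        print(attempt, db.command(merge_mapreduce_command()))
```

The commands from sharding_commands() would be sent once to the admin database through mongos before loading the data; run_twice() is not called here because it needs a running cluster.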

Results:

The script completes on the first run with no problems. On the second run it gets stuck while the server is trying to recreate indexes on the output collection. The python script eventually fails with the following assertion:

failed: final reduce failed:

{ result: "results", assertion: "getMore: cursor didn't exist on server, possible restart or timeout?", assertionCode: 13127, errmsg: "db assertion failure", ok: 0.0 }

Before the script fails, db.currentOp() shows the following operation, with secs_running climbing:

{
    "opid" : "repl0:1216193",
    "active" : true,
    "lockType" : "write",
    "waitingForLock" : false,
    "secs_running" : 336,
    "op" : "query",
    "ns" : "delicious.results",
    "query" : { "$msg" : "query not recording (too large)" },
    "client_s" : "127.0.0.1:36127",
    "desc" : "conn",
    "msg" : "index: (3/3) btree-middle"
},
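
To spot this kind of stall without reading the whole currentOp output by hand, something like the following sketch could filter for long-running operations whose progress message reports an index build. The function name and the 60-second threshold are assumptions; the sample entry is copied from the output above and wrapped in the "inprog" array that currentOp returns.

```python
def stalled_index_builds(current_op, min_secs=60):
    """Return (opid, ns, secs_running, msg) for active ops that report an
    index build in their progress message and have run at least min_secs."""
    stalled = []
    for op in current_op.get("inprog", []):
        msg = op.get("msg", "")
        if (op.get("active")
                and msg.startswith("index:")
                and op.get("secs_running", 0) >= min_secs):
            stalled.append((op.get("opid"), op.get("ns"),
                            op["secs_running"], msg))
    return stalled


# Sample entry taken from the currentOp output in this ticket.
sample = {
    "inprog": [
        {
            "opid": "repl0:1216193",
            "active": True,
            "lockType": "write",
            "waitingForLock": False,
            "secs_running": 336,
            "op": "query",
            "ns": "delicious.results",
            "query": {"$msg": "query not recording (too large)"},
            "client_s": "127.0.0.1:36127",
            "desc": "conn",
            "msg": "index: (3/3) btree-middle",
        }
    ]
}

for entry in stalled_index_builds(sample):
    print(entry)
```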

Here's the sharding info:

> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "repl0", "host" : "repl0/behackett-dt:29017" }
      { "_id" : "repl1", "host" : "repl1/behackett-dt:29020" }
  databases:
      { "_id" : "admin", "partitioned" : false, "primary" : "config" }
      { "_id" : "delicious", "partitioned" : true, "primary" : "repl0" }
          delicious.links chunks:
              repl1    12
              repl0    13
          too many chunks to print, use verbose if you want to force print



 Comments   
Comment by auto [ 30/Aug/11 ]

Author: {u'login': u'agirbal', u'name': u'agirbal', u'email': u'antoine@10gen.com'}

Message: SERVER-3529: Sharded map reduce using merge stalls recreating indexes on the output collection
Branch: v1.8
https://github.com/mongodb/mongo/commit/f3bd113e0df642703fda8cc9fe7f6cdf6503e5e8

Comment by Eliot Horowitz (Inactive) [ 30/Aug/11 ]

OK, so can you apply the patch on 1.8?

Comment by Antoine Girbal [ 30/Aug/11 ]

The issue does not exist in the 2.0 line due to refactoring.
This fix is only for 1.8.

Comment by Eliot Horowitz (Inactive) [ 30/Aug/11 ]

Can you confirm if this is an issue in 2.0 or only 1.8?

Comment by Antoine Girbal [ 17/Aug/11 ]

Attached a diff that fixes the issue in 1.8.

Comment by Antoine Girbal [ 15/Aug/11 ]

I have to confirm, but I think this issue is only in 1.8.
There is a bug in 1.8 where the index created on the output collection is on "0" instead of "_id".
This obviously makes it painful for MERGE or REDUCE, since both match existing output documents by "_id".
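
One way to check an output collection for this symptom might be to inspect its index list, as in the sketch below. It works on a dict shaped like the return value of pymongo's Collection.index_information(); the helper name and both sample dicts are hypothetical, and the exact shape of the bad index on 1.8 (shown here as a key on "0") is an assumption based only on the description above.

```python
def has_index_on(index_info, field):
    """True if any index in index_info has `field` as its leading key.

    index_info is shaped like pymongo's Collection.index_information():
    {index_name: {"key": [(field_name, direction), ...], ...}, ...}
    """
    return any(
        spec.get("key") and spec["key"][0][0] == field
        for spec in index_info.values()
    )


# Hypothetical examples; not captured from a real 1.8 server.
healthy = {"_id_": {"key": [("_id", 1)]}}
suspect = {"0_1": {"key": [("0", 1)]}}

print(has_index_on(healthy, "_id"))
print(has_index_on(suspect, "_id"))
```

Against a live deployment the input would come from something like db["results"].index_information() on each shard.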

Comment by Bernie Hackett [ 03/Aug/11 ]

Probably obvious, but you need pymongo to run the script.

Generated at Thu Feb 08 03:03:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.