  1. Core Server
  2. SERVER-3529

Sharded map reduce using merge stalls recreating indexes on the output collection.

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 1.8.4
    • Affects Version/s: 1.8.2
    • Component/s: MapReduce, Sharding
    • Labels:
    • Environment: Linux x86_64
    • Operating System: Linux

      Steps to reproduce:

      • Create a sharded cluster of two replica sets.
      • Shard a database called 'delicious'.
      • Shard a collection called 'links', using 'author' as the shard key.
      • Load the data from this link into the sharded collection:
        http://www.infochimps.com/link_frame?dataset=13364
      • Run the attached python script twice.
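The sharding steps above can be sketched as admin command documents. This is only an illustration: the helper name `sharding_setup_commands` is hypothetical, the command names use the current `enableSharding` / `shardCollection` spelling (the spelling accepted by a 1.8 server may differ), and an ascending key on 'author' is assumed because the report does not state the key direction.

```python
def sharding_setup_commands(db_name="delicious", coll_name="links", key="author"):
    """Return, in order, the admin commands for the two sharding steps.

    Hypothetical helper: it only builds the command documents, so it can be
    inspected without a running cluster.
    """
    namespace = f"{db_name}.{coll_name}"
    return [
        # Enable sharding on the database.
        {"enableSharding": db_name},
        # Shard the collection; an ascending key is an assumption.
        {"shardCollection": namespace, "key": {key: 1}},
    ]


if __name__ == "__main__":
    for cmd in sharding_setup_commands():
        print(cmd)
```

Against a live mongos, each document could be passed to a driver's admin-command call (for example PyMongo's `client.admin.command(cmd)`); the command name has to remain the first key of the document.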

      Results:

      The script completes on the first run without problems. On the second run it gets stuck while the server is recreating indexes on the output collection. The python script eventually fails with the following assertion:

      failed: final reduce failed: { result: "results", assertion: "getMore: cursor didn't exist on server, possible restart or timeout?", assertionCode: 13127, errmsg: "db assertion failure", ok: 0.0 }
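The attached mr_test.py is not reproduced on this page. As a rough sketch of the kind of job that exercises this path, the command below runs a map reduce over 'links' and merges into the 'results' collection named in the error. The map and reduce bodies (counting links per author) and the helper name `merge_map_reduce_command` are assumptions made for illustration, not the contents of the attachment.

```python
# Hypothetical map/reduce bodies: count links per author.
MAP_JS = "function () { emit(this.author, 1); }"
REDUCE_JS = (
    "function (key, values) {"
    " var total = 0;"
    " values.forEach(function (v) { total += v; });"
    " return total;"
    " }"
)


def merge_map_reduce_command(coll_name="links", out_name="results"):
    """Build a mapReduce command whose output is merged into out_name."""
    return {
        "mapReduce": coll_name,       # command name must be the first key
        "map": MAP_JS,
        "reduce": REDUCE_JS,
        "out": {"merge": out_name},   # merge into the existing collection
    }


if __name__ == "__main__":
    print(merge_map_reduce_command())
```

Sending this document twice to the 'delicious' database would correspond to running the script twice: on the second run the output collection already exists, which is the situation the report describes.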

      Before the script fails, db.currentOp() shows the following operation, with secs_running climbing:

      {
          "opid" : "repl0:1216193",
          "active" : true,
          "lockType" : "write",
          "waitingForLock" : false,
          "secs_running" : 336,
          "op" : "query",
          "ns" : "delicious.results",
          "query" : { "$msg" : "query not recording (too large)" },
          "client_s" : "127.0.0.1:36127",
          "desc" : "conn",
          "msg" : "index: (3/3) btree-middle"
      },
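To spot this condition without reading the whole currentOp output, the operations can be filtered for long-running index builds. The sketch below assumes the output has already been loaded as a Python dict with an `inprog` list, and that index-build progress messages start with "index:" as in the entry above; the function name `stalled_index_builds` and the 60-second threshold are hypothetical.

```python
def stalled_index_builds(current_op, min_secs=60):
    """Return operations that report index-build progress and have been
    running for at least min_secs seconds."""
    stalled = []
    for op in current_op.get("inprog", []):
        msg = op.get("msg", "")
        if msg.startswith("index:") and op.get("secs_running", 0) >= min_secs:
            stalled.append(op)
    return stalled


# Sample input built from the entry shown in this report, plus one
# unrelated short-lived operation (invented for contrast).
SAMPLE = {
    "inprog": [
        {
            "opid": "repl0:1216193",
            "active": True,
            "lockType": "write",
            "waitingForLock": False,
            "secs_running": 336,
            "op": "query",
            "ns": "delicious.results",
            "msg": "index: (3/3) btree-middle",
        },
        {"opid": "repl1:1", "active": True, "secs_running": 2, "op": "query"},
    ]
}

if __name__ == "__main__":
    for op in stalled_index_builds(SAMPLE):
        print(op["opid"], op["ns"], op["secs_running"], op["msg"])
```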

      Here's the sharding info:

      > db.printShardingStatus()
      --- Sharding Status ---
        sharding version: { "_id" : 1, "version" : 3 }
        shards:
            { "_id" : "repl0", "host" : "repl0/behackett-dt:29017" }
            { "_id" : "repl1", "host" : "repl1/behackett-dt:29020" }
        databases:
            { "_id" : "admin", "partitioned" : false, "primary" : "config" }
            { "_id" : "delicious", "partitioned" : true, "primary" : "repl0" }
                delicious.links chunks:
                    repl1    12
                    repl0    13
                too many chunks to print, use verbose if you want to force print

        Attachments:
        1. mr_test.py (1.0 kB)
        2. SERVER-3529.diff (1 kB)

            Assignee: Antoine Girbal (antoine)
            Reporter: Bernie Hackett (bernie@mongodb.com)
            Votes: 0
            Watchers: 2
