Core Server / SERVER-15334

Map/Reduce jobs stall when database is taking writes

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: None
    • Affects Version/s: 2.4.10
    • Component/s: MapReduce
    • Operating System: ALL

      Appears to be:

      1. Start a longish-running indexed map/reduce job (or jobs).
      2. Delete all records while the job is still running.

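      A minimal shell sketch of the repro (the database name is taken from the "doc_metadata.metadata" namespace in the currentOp output below; the index, the field name, and the map/reduce functions are hypothetical stand-ins):

          // Session 1: start a longish-running map/reduce over an indexed query.
          var coll = db.getSiblingDB("doc_metadata").metadata;
          coll.ensureIndex({ field: 1 });                       // the m/r query is indexed
          coll.mapReduce(
              function () { emit(this.field, 1); },             // map
              function (key, vals) { return Array.sum(vals); }, // reduce
              { out: { inline: 1 }, query: { field: "DELETE_ME" } }
          );

          // Session 2, while the job above is still running:
          db.getSiblingDB("doc_metadata").metadata.remove({ field: "DELETE_ME" });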

      Because of a race condition in my code, I can delete a set of docs over which I am in the middle of running a map/reduce job (precise details below; I don't believe they are relevant). The map/reduce query is indexed.

      My expectation is that the map/reduce job would just "ignore" records that were deleted.

      Instead, something odder seems to happen: the jobs last far longer than they should, and we see performance degradation.

      For example, here's a db.currentOp:

        {
            "opid" : "replica_set1:406504931",
            "active" : true,
            "secs_running" : 560,
            "op" : "query",
            "ns" : "doc_metadata.metadata",
            "query" : {
                "$msg" : "query not recording (too large)"
            },
            "client_s" : "10.10.90.42:41453",
            "desc" : "conn881377",
            "threadId" : "0x6790e940",
            "connectionId" : 881377,
            "locks" : {
                "^doc_metadata" : "R"
            },
            "waitingForLock" : true,
            "msg" : "m/r: (1/3) emit phase M/R: (1/3) Emit Progress: 2899/1 289900%",
            "progress" : {
                "done" : 2899,
                "total" : 1
            },
            ...

      Check out the Emit Progress: "done" is 2899 against a "total" of 1, i.e. 289900%.

      There were a few of these; they all ran for 20 minutes or so (the number of docs being deleted was small, in the few-thousand range) before eventually cleaning themselves up.
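
      For anyone chasing this, a hedged way to spot the stuck jobs from the shell (it assumes only the fields shown in the currentOp output above):

          // Print any in-flight m/r op whose emit progress has overshot its total.
          db.currentOp().inprog.forEach(function (op) {
              if (op.msg && op.msg.indexOf("m/r") === 0 &&
                  op.progress && op.progress.done > op.progress.total) {
                  printjson({ opid: op.opid, secs_running: op.secs_running, msg: op.msg });
              }
          });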

      Bonus worry: I have a similar case in which I run a map/reduce over the entire collection (several tens of millions of documents) while documents are continually being added and removed. Should I worry, or is this an edge case that only bites when a high percentage of the query set is removed?

      (Details:
      Thread 1: 1a) update a bunch of docs to have field:DELETE_ME
      Thread 1: 2a) run a map/reduce job to count some of their attributes prior to deletion
      Thread 2: 1b) update a bunch more docs to have field:DELETE_ME
      Thread 2: 2b) run a map/reduce job to count some of their attributes prior to deletion
      Thread 1: 3a) remove all docs with field:DELETE_ME
      Thread 2: 3b) remove all docs with field:DELETE_ME
      Because both threads mark and remove docs using the same shared DELETE_ME flag, step 3a can delete docs that job 2b is still iterating over, and vice versa; a workaround sketch follows.)
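
      One workaround sketch for that race (not from the ticket; the "expired", "deleteBatch", and "attribute" field names are hypothetical): tag each batch with a per-thread id, so a thread only counts and removes the docs it marked itself.

          var coll = db.getSiblingDB("doc_metadata").metadata;
          var batchId = ObjectId();                              // unique per thread/run
          // 1) Mark this thread's batch with its own id instead of a shared flag.
          coll.update({ expired: true },                         // selector for this batch
                      { $set: { deleteBatch: batchId } },
                      { multi: true });
          // 2) Count attributes over just this batch.
          coll.mapReduce(
              function () { emit(this.attribute, 1); },
              function (key, vals) { return Array.sum(vals); },
              { out: { inline: 1 }, query: { deleteBatch: batchId } }
          );
          // 3) Remove only this thread's batch; the other thread's docs are untouched.
          coll.remove({ deleteBatch: batchId });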

            Assignee: Ramon Fernandez Marina (ramon.fernandez@mongodb.com)
            Reporter: Alex Piggott (apiggott@ikanow.com)
            Votes: 0
            Watchers: 2
