Core Server / SERVER-13981

Temporary map/reduce collections are incorrectly replicated to secondaries

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 2.6.2, 2.7.1
    • Affects Version/s: 2.5.5, 2.6.0, 2.6.1
    • Component/s: MapReduce
    • Labels:
      None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      Issue Status as of May 27, 2014

      ISSUE SUMMARY
      With the introduction of 2.6, certain temporary map/reduce collections are incorrectly replicated to secondary nodes. This generates extra traffic between replica set nodes. Additionally, the documents in these collections have no _id value, which causes collection scans during replication on the primary and can impact performance.
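
      As a quick check, listing the collections on a secondary while a large job is in progress will show the replicated temporary collections. A minimal sketch, assuming the job runs in the test database as in the original description below:

      // Run against a secondary while the map/reduce job is still in progress.
      rs.slaveOk()                                        // allow reads on this secondary
      var testDB = db.getSiblingDB("test");               // the "test" database is an assumption
      testDB.getCollectionNames().filter(function (name) {
          return /^tmp\.mr\./.test(name);                 // replicated temporary m/r collections
      })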

      USER IMPACT
      Large map/reduce jobs over millions of documents can noticeably impact server performance, increase oplog churn, and consequently increase network traffic between replica set members.
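
      Comparing the replication window before and after a large job gives a rough measure of the extra oplog churn; a minimal sketch using the standard shell helpers:

      // Oplog size and the time window it currently covers; run before and after the job.
      db.printReplicationInfo()
      // Per-member lag behind the primary, which grows if secondaries fall behind.
      db.printSlaveReplicationInfo()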

      WORKAROUNDS
      There is no workaround to prevent the inserts into the temporary collections from being replicated. If the impact on the server becomes intolerable, the map/reduce job should be moved to a dedicated hidden secondary node to mitigate the issue, as sketched below.
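
      For reference, a minimal sketch of marking one member as hidden so it can be dedicated to the map/reduce workload; the member index used here is an assumption and must be adjusted for the actual deployment:

      cfg = rs.conf()                  // current replica set configuration
      cfg.members[2].priority = 0      // a hidden member must have priority 0
      cfg.members[2].hidden = true     // hide it from normal client traffic
      rs.reconfig(cfg)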

      AFFECTED VERSIONS
      MongoDB 2.6.0 and 2.6.1 are affected by this issue.

      FIX VERSION
      The fix is included in the 2.6.2 production release.

      RESOLUTION DETAILS
      Documents in temporary *_inc collections are explicitly not replicated. This restores the behavior that existed prior to development version 2.5.5.

      Original Description

      Run the map reduce example on a 2.6 replica set.
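
      A minimal job of this kind, matching the test.docs namespaces visible in the oplog output below (the map and reduce functions are illustrative assumptions, not the original example):

      var map = function () { emit(this.status, 1); };                    // grouping field is assumed
      var reduce = function (key, values) { return Array.sum(values); };  // count per key
      // Non-inline output goes through the temporary tmp.mr.* collections.
      db.getSiblingDB("test").docs.mapReduce(map, reduce, { out: "docs_out" });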

      From the oplog, I can see that mapReduce generates temporary collections <database.tmp.mr.collection_x_inc> whose documents have no _id field. This causes a performance issue when these temporary collections are replicated to the secondaries.

      > db.oplog.rs.find({ns:/_inc/})
      { "ts" : Timestamp(1400390715, 1), "h" : NumberLong("9062785211345050513"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 36, "1" : 256193 } }
      { "ts" : Timestamp(1400390715, 2), "h" : NumberLong("-6347065931779322235"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 1, "1" : 237298 } }
      { "ts" : Timestamp(1400390715, 3), "h" : NumberLong("5305159503718125362"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 2, "1" : 247543 } }
      { "ts" : Timestamp(1400390715, 4), "h" : NumberLong("242292647194800186"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 3, "1" : 246875 } }
      { "ts" : Timestamp(1400390715, 5), "h" : NumberLong("-3801567793714329373"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 4, "1" : 250808 } }
      { "ts" : Timestamp(1400390715, 6), "h" : NumberLong("-7467661728084668641"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 5, "1" : 266786 } }
      { "ts" : Timestamp(1400390715, 7), "h" : NumberLong("48771082501147428"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 6, "1" : 239294 } }
      { "ts" : Timestamp(1400390715, 8), "h" : NumberLong("7947765402550217396"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 7, "1" : 246862 } }
      

            Assignee:
            mathias@mongodb.com Mathias Stearn
            Reporter:
            linda.qin@mongodb.com Linda Qin
            Votes:
            0
            Watchers:
            14
