Core Server / SERVER-12615

Hitting Max Document size upon Map/Reduce?

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • 2.6.0-rc2
    • Affects Version/s: 2.5.5
    • Component/s: MapReduce
    • Environment:
      CentOS 6.5 x86_64 Sharded
    • Fully Compatible

      I'm running an incremental map reduce over a large data set in a sharded environment. The output is set to "reduce".
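      For reference, an incremental map/reduce of this kind is usually expressed as a mapReduce command whose "out" option names a "reduce" target, so each batch is folded into the existing results. The following is only a sketch of the command document's shape, not the reporter's actual job: the collection names ("events", "event_totals"), the fields ("userId", "bytes", "batchId"), and the map and reduce bodies are all hypothetical.

```python
import json

# All names below are hypothetical placeholders, not taken from the report.
MAP_JS = "function () { emit(this.userId, this.bytes); }"
REDUCE_JS = "function (key, values) { return Array.sum(values); }"


def incremental_mr_command(batch_id):
    """Build a mapReduce command document for a single batch.

    The "query" restricts the run to one batch; "out: {reduce: ...}"
    asks the server to re-reduce new results into the existing
    output collection instead of replacing it.
    """
    return {
        "mapReduce": "events",             # hypothetical input collection
        "map": MAP_JS,
        "reduce": REDUCE_JS,
        "query": {"batchId": batch_id},    # hypothetical batch selector
        "out": {"reduce": "event_totals"}, # hypothetical output collection
    }


if __name__ == "__main__":
    print(json.dumps(incremental_mr_command(42), indent=2))
```

      The dictionary could be handed to a driver's generic command helper; that step is left out here because it needs a running cluster.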

      This strategy works fine until the batch I'm running the MR on exceeds ~14 million documents, at which point the MR fails with error code 10334, saying:

      "MR Parallel Processing failed. errmsg: 'Exception: BSONObj size 16951756 ... is invalid ..."

      I was under the impression that there are no size concerns when it comes to map/reduce (assuming, of course, that no single document exceeds 16MB; all of the documents I'm dealing with are ~400 bytes).
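      To put the numbers in the error message in perspective, here is a rough arithmetic sketch. It assumes the limit in play is the documented 16 MiB maximum BSON document size (the bound the server checks internally may differ slightly) and uses the approximate 400-byte document size quoted above.

```python
# Assumption: the relevant limit is the documented 16 MiB BSON document size.
MAX_BSON_BYTES = 16 * 1024 * 1024

# Size reported in the error message.
REPORTED_BYTES = 16951756

# Approximate size of each input document, as stated in the report.
APPROX_DOC_BYTES = 400

# How far the offending object is over the assumed limit.
overshoot = REPORTED_BYTES - MAX_BSON_BYTES

# How many ~400-byte documents would fit in one object of maximum size.
docs_that_fit = MAX_BSON_BYTES // APPROX_DOC_BYTES

if __name__ == "__main__":
    print("limit (bytes):     ", MAX_BSON_BYTES)
    print("reported (bytes):  ", REPORTED_BYTES)
    print("overshoot (bytes): ", overshoot)
    print("docs per max object:", docs_that_fit)
```

      Since no input document is anywhere near the limit, the oversized object has to be something assembled during the job, which is consistent with the suspicion that accumulated data is being packed into a single BSON object.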

      It doesn't fail immediately and I suspect it's the cumulative data from one shard or another that is exceeding this size. I wasn't aware that this was an issue with MR in sharded environments.

      Any ideas what's going on?

            Assignee:
            mathias@mongodb.com Mathias Stearn
            Reporter:
            bchivari Brad C.
            Votes:
            0
            Watchers:
            4
