Core Server / SERVER-10736

Modify MapReduce to "map, shuffle, reduce", and always take lists on the reducer input

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: MapReduce
    • Labels: None

    Description

      MongoDB's mapReduce command takes two required functions, "map" and "reduce", and an optional "finalize" function. "reduce" must output data in the same format that "map" emits.
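
The current flow can be sketched with a toy, in-memory imitation in Python. This is only an illustration under assumptions, not MongoDB's implementation: `map_reduce`, `word_map`, and `word_reduce` are hypothetical names, and the map function is modelled as a generator of (key, value) pairs. It shows why "reduce" has to return the same shape that "map" emits: a key with a single emitted value skips "reduce" entirely, so both paths must produce interchangeable results.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn, finalize_fn=None):
    """Toy imitation of a map / reduce / optional finalize flow."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):  # map_fn yields (key, value) pairs
            groups[key].append(value)

    results = {}
    for key, values in groups.items():
        # A key with a single emitted value is passed through without reduce.
        reduced = values[0] if len(values) == 1 else reduce_fn(key, values)
        results[key] = finalize_fn(key, reduced) if finalize_fn else reduced
    return results

def word_map(doc):
    for word in doc["text"].split():
        yield word, {"count": 1}

def word_reduce(key, values):
    # Returns the same shape that word_map emits.
    return {"count": sum(v["count"] for v in values)}

docs = [{"text": "a b a"}, {"text": "b c"}]
counts = map_reduce(docs, word_map, word_reduce)
```

Here the key "c" is emitted only once, so its value reaches the result without ever going through `word_reduce`.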

      In some other frameworks, the functions are "map", "shuffle", and "reduce". There, "shuffle" is the step that must output the same data format as "map", just like MongoDB's "reduce", but "shuffle" is optional, and the required "reduce" behaves more like MongoDB's "finalize". "shuffle" is also known as "local reduce".
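
A minimal Python sketch of the three-step flow described here, using the ticket's terminology. Everything in it is an assumption made for illustration: `map_shuffle_reduce`, `local_sum`, `total`, and the `chunk_size` parameter (which stands in for data being processed in separate batches) are hypothetical and do not correspond to any MongoDB API.

```python
from collections import defaultdict

def map_shuffle_reduce(docs, map_fn, reduce_fn, shuffle_fn=None, chunk_size=2):
    """Toy pipeline: map, optional local reduce ("shuffle"), required reduce."""
    grouped = defaultdict(list)
    for start in range(0, len(docs), chunk_size):
        local = defaultdict(list)
        for doc in docs[start:start + chunk_size]:
            for key, value in map_fn(doc):
                local[key].append(value)
        for key, values in local.items():
            if shuffle_fn is not None:
                # Local reduce: its output has the same shape as a mapped value.
                values = [shuffle_fn(key, values)]
            grouped[key].extend(values)
    # The required final step sees every key and may return any shape.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

def word_map(doc):
    for word in doc["text"].split():
        yield word, 1

def local_sum(key, values):
    return sum(values)

def total(key, values):
    return {"total": sum(values)}

docs = [{"text": "a b a"}, {"text": "b c"}, {"text": "a"}]
with_shuffle = map_shuffle_reduce(docs, word_map, total, shuffle_fn=local_sum)
without_shuffle = map_shuffle_reduce(docs, word_map, total)
```

Because `local_sum` keeps the mapped format, leaving it out changes only how much data reaches the final step, not the result.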

      It would be great if MongoDB could work this way instead, with this nomenclature and these optional parameters, either by changing the mapReduce method or by adding a new method.

      Another useful change would be to always deliver the data to the final step ("finalize"/"reduce") inside a list, even when there is only one item. The final step could then always assume it has a list to process, which makes it simpler to write.
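
The difference can be illustrated with two hypothetical final-step functions in Python. The first assumes, purely for illustration, that a lone item may arrive bare while several items arrive as a list. The second assumes the input is always a list.

```python
def mean_mixed(key, value):
    """Final step when a single item may arrive bare (assumed behaviour)."""
    values = value if isinstance(value, list) else [value]
    return sum(values) / len(values)

def mean_list(key, values):
    """Final step when the input is guaranteed to be a list."""
    return sum(values) / len(values)

single = mean_list("k", [4])
several = mean_list("k", [2, 4])
```

With the list guarantee, the type check in `mean_mixed` is no longer needed.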

      It should also be easy to have an "identity reducer"; it could be the default when no reducer is specified.
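
One possible shape for such a default, as a Python sketch. The names `identity_reducer` and `run_reduce` are hypothetical, and the input is assumed to be values already grouped by key.

```python
def identity_reducer(key, values):
    """Return the grouped values unchanged."""
    return list(values)

def run_reduce(groups, reduce_fn=None):
    """Apply reduce_fn to each group, falling back to the identity reducer."""
    reduce_fn = reduce_fn or identity_reducer
    return {key: reduce_fn(key, values) for key, values in groups.items()}

groups = {"a": [1, 1], "c": [1]}
grouped = run_reduce(groups)  # no reducer given, so values pass through
summed = run_reduce(groups, lambda key, values: sum(values))
```

With this default, omitting the reducer simply yields the mapped values grouped by key.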

            People

              Assignee: Backlog - Query Execution (backlog-query-execution)
              Reporter: Nicolau Leal Werneck (nwerneck)
              Votes: 1
              Watchers: 6
