Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-43860

Pipeline style update in $merge can produce unexpected result

    • Type: Icon: Improvement Improvement
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 4.2.2, 4.3.2
    • Affects Version/s: 4.2.0
    • Component/s: Aggregation Framework
    • Labels:
      None
    • Major Change
    • v4.2
    • Query 2019-11-04, Query 2019-11-18
    • 50

      When a $merge stage with a custom pipeline cannot match a document in the target collection, it will insert a new document created by running the pipeline on an empty document. For example,

      db.monthlytotals.drop()
      db.votes.insertOne(
         { date: new Date("2019-05-07"), "thumbsup" : 14, "thumbsdown" : 10 }
      )
      db.votes.aggregate([
         { $match: { date: { $gte: new Date("2019-05-07"), $lt: new Date("2019-05-08") } } },
         { $project: { _id: { $dateToString: { format: "%Y-%m", date: "$date" } }, thumbsup: 1, thumbsdown: 1 } },
         { $merge: {
               into: "monthlytotals",
               on: "_id",
               whenMatched:  [
                  { $addFields: {
                      thumbsup: { $add:[ "$thumbsup", "$$new.thumbsup" ] },
                      thumbsdown: { $add: [ "$thumbsdown", "$$new.thumbsdown" ] }
                  } } ],
               whenNotMatched: "insert"
         } }
      ])
      printjson(db.monthlytotals.find().toArray())
      [ { "_id" : "2019-05", "thumbsup" : null, "thumbsdown" : null } ]
      

      Here, we execute an upsert with a custom pipeline. For pipeline updates, if we don’t match any documents, we generate a new document to insert by running the pipeline with an empty input document (and, in the case of $merge, the original document as $$new). In the example above, that means we’re doing this:
       
      thumbsup: { $add:[ MISSING, 14 ] }
      thumbsdown: { $add:[ MISSING, 10 ] }
       
      But the semantics of the $add expression are such that anything added to null or missing produces null.
      This could be confusing to the users as one might expect that the inserted document would be the one that it produced by the $project stage, e.g., { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.

      This is also inconsistent with other whenMatched modes. E.g., with 'whenMatched: replace, whenMatched: insert', we'd insert the document { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.

      It may also be confusing that we're executing a pipeline defined in the whenMatched branch, when we fall under the whenNotMatched branch.

      We should consider different options to see if user experience can be improved. This could be a simple solution to update our documentation to clearly describe the existing behaviour, or just the semantics of pipeline style updated with $merge (for example, but inserting the original document accessed via $$new when there is no match).

            Assignee:
            bernard.gorman@mongodb.com Bernard Gorman
            Reporter:
            anton.korshunov@mongodb.com Anton Korshunov
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: