Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-43860

Pipeline style update in $merge can produce unexpected result

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major - P3
    • Resolution: Fixed
    • 4.2.0
    • 4.2.2, 4.3.2
    • Aggregation Framework
    • None
    • Major Change
    • v4.2
    • Query 2019-11-04, Query 2019-11-18
    • 50

    Description

      When a $merge stage with a custom pipeline cannot match a document in the target collection, it will insert a new document created by running the pipeline on an empty document. For example,

      db.monthlytotals.drop()
      db.votes.insertOne(
         { date: new Date("2019-05-07"), "thumbsup" : 14, "thumbsdown" : 10 }
      )
      db.votes.aggregate([
         { $match: { date: { $gte: new Date("2019-05-07"), $lt: new Date("2019-05-08") } } },
         { $project: { _id: { $dateToString: { format: "%Y-%m", date: "$date" } }, thumbsup: 1, thumbsdown: 1 } },
         { $merge: {
               into: "monthlytotals",
               on: "_id",
               whenMatched:  [
                  { $addFields: {
                      thumbsup: { $add:[ "$thumbsup", "$$new.thumbsup" ] },
                      thumbsdown: { $add: [ "$thumbsdown", "$$new.thumbsdown" ] }
                  } } ],
               whenNotMatched: "insert"
         } }
      ])
      printjson(db.monthlytotals.find().toArray())
      [ { "_id" : "2019-05", "thumbsup" : null, "thumbsdown" : null } ]
      

      Here, we execute an upsert with a custom pipeline. For pipeline updates, if we don’t match any documents, we generate a new document to insert by running the pipeline with an empty input document (and, in the case of $merge, the original document as $$new). In the example above, that means we’re doing this:
       
      thumbsup: { $add:[ MISSING, 14 ] }
      thumbsdown: { $add:[ MISSING, 10 ] }
       
      But the semantics of the $add expression are such that anything added to null or missing produces null.
      This could be confusing to the users as one might expect that the inserted document would be the one that it produced by the $project stage, e.g., { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.

      This is also inconsistent with other whenMatched modes. E.g., with 'whenMatched: replace, whenMatched: insert', we'd insert the document { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.

      It may also be confusing that we're executing a pipeline defined in the whenMatched branch, when we fall under the whenNotMatched branch.

      We should consider different options to see if user experience can be improved. This could be a simple solution to update our documentation to clearly describe the existing behaviour, or just the semantics of pipeline style updated with $merge (for example, but inserting the original document accessed via $$new when there is no match).

      Attachments

        Issue Links

          Activity

            People

              bernard.gorman@mongodb.com Bernard Gorman
              anton.korshunov@mongodb.com Anton Korshunov
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: