[DOCS-13226] Investigate changes in SERVER-43860: Pipeline style update in $merge can produce unexpected result Created: 14/Nov/19  Updated: 13/Nov/23  Resolved: 11/Mar/20

Status: Closed
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: 4.2.2, 4.3.2, Server_Docs_20231030, Server_Docs_20231106, Server_Docs_20231105, Server_Docs_20231113

Type: Task Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team Assignee: Jeffrey Allen
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-43860 Pipeline style update in $merge can p... Closed
Participants:
Days since reply: 3 years, 48 weeks ago
Epic Link: DOCS: 4.4 Server Release Work

 Description   

Downstream Change Summary

Previously, a $merge aggregation with parameters {whenMatched: [ pipeline ], whenNotMatched: "insert"} implemented this behaviour by generating a pipeline-style update command with {upsert:true}. However, this did not correctly capture the semantics of the $merge statement. If no document in the target collection matched the document from the source collection - that is, we hit the {whenNotMatched: "insert"} condition - then the source document was discarded instead of being inserted, and instead an entirely new document was generated by running the 'whenMatched' pipeline over an empty input doc. This is logically incorrect, and inconsistent with the behaviour of {whenNotMatched: "insert"} in all other contexts.

This patch fixes the above $merge mode so that its behaviour matches the expectation: if the source document matches a document in the target collection, the 'whenMatched' pipeline is executed; otherwise we hit the 'whenNotMatched' condition and insert the source document into the target collection as-is.

Depending on the exact nature of the source document and 'whenMatched' pipeline, this may result in a significant change in behaviour from that observed by the user in existing 4.2 versions. Since this is a correctness bug, we will be backporting the fix to 4.2.2 (BACKPORT-5471). Finally, as a side-effect of this change, the $$new variable used to refer to the source document in the 'whenMatched' pipeline is now reserved, and cannot be overridden by the user.
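The before-and-after behaviour described in this summary can be modelled in plain JavaScript. This is only a rough sketch, not the server implementation: the Map standing in for the target collection, the function names mergeBeforeFix and mergeAfterFix, and the markUpdated pipeline are all hypothetical and chosen for illustration.

```javascript
// Model only: `target` is a Map keyed by _id standing in for the target
// collection, and `pipeline` is a function (doc, newDoc) => doc standing in
// for the whenMatched pipeline, with newDoc playing the role of $$new.

// Behaviour before the fix: on no match, the pipeline runs over an otherwise
// empty document (only the "on" field is kept) and the source is discarded.
function mergeBeforeFix(target, source, pipeline) {
  const existing = target.get(source._id) ?? { _id: source._id };
  target.set(source._id, pipeline(existing, source));
}

// Behaviour after the fix: on no match, the source document is inserted
// as-is and the pipeline is not run at all.
function mergeAfterFix(target, source, pipeline) {
  const existing = target.get(source._id);
  target.set(source._id, existing ? pipeline(existing, source) : { ...source });
}

// Hypothetical whenMatched pipeline: flag the stored document as updated.
const markUpdated = (doc, newDoc) => ({ ...doc, updated: true });

const source = { _id: "2019-05", thumbsup: 14, thumbsdown: 10 };

const before = new Map();
mergeBeforeFix(before, source, markUpdated);

const after = new Map();
mergeAfterFix(after, source, markUpdated);

console.log(before.get("2019-05")); // the source fields are lost
console.log(after.get("2019-05"));  // the source document is stored unchanged
```

In this model the pre-fix path stores a document that contains only _id and the field set by the pipeline, while the post-fix path stores the source document untouched. The pipeline only runs after the fix when a later source document matches the same _id.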

Description of Linked Ticket

When a $merge stage with a custom pipeline cannot match a document in the target collection, it will insert a new document created by running the pipeline on an empty document. For example,

db.monthlytotals.drop()
db.votes.insertOne(
   { date: new Date("2019-05-07"), "thumbsup" : 14, "thumbsdown" : 10 }
)
db.votes.aggregate([
   { $match: { date: { $gte: new Date("2019-05-07"), $lt: new Date("2019-05-08") } } },
   { $project: { _id: { $dateToString: { format: "%Y-%m", date: "$date" } }, thumbsup: 1, thumbsdown: 1 } },
   { $merge: {
         into: "monthlytotals",
         on: "_id",
         whenMatched:  [
            { $addFields: {
                thumbsup: { $add:[ "$thumbsup", "$$new.thumbsup" ] },
                thumbsdown: { $add: [ "$thumbsdown", "$$new.thumbsdown" ] }
            } } ],
         whenNotMatched: "insert"
   } }
])
printjson(db.monthlytotals.find().toArray())
[ { "_id" : "2019-05", "thumbsup" : null, "thumbsdown" : null } ]

Here, we execute an upsert with a custom pipeline. For pipeline updates, if we don’t match any documents, we generate a new document to insert by running the pipeline with an empty input document (and, in the case of $merge, the original document as $$new). In the example above, that means we’re doing this:
 
thumbsup: { $add:[ MISSING, 14 ] }
thumbsdown: { $add:[ MISSING, 10 ] }
 
But the semantics of the $add expression are such that adding anything to null or a missing value produces null.
This could be confusing to users, who might expect the inserted document to be the one produced by the $project stage, e.g., { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.
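The null result can be reproduced with a small stand-in for $add. This is a simplified sketch for two numeric operands only; the helper name addLikeAggregation is hypothetical, and undefined is used here to represent a missing field.

```javascript
// Simplified model of $add for two numeric operands: if either operand is
// null or missing (undefined here), the result is null.
function addLikeAggregation(a, b) {
  if (a === null || a === undefined || b === null || b === undefined) {
    return null;
  }
  return a + b;
}

const emptyInput = {};                              // the empty input document
const incoming = { thumbsup: 14, thumbsdown: 10 };  // plays the role of $$new

const result = {
  thumbsup: addLikeAggregation(emptyInput.thumbsup, incoming.thumbsup),
  thumbsdown: addLikeAggregation(emptyInput.thumbsdown, incoming.thumbsdown),
};

console.log(result); // both counters come out as null
```

Because the empty input document has neither counter field, both additions see a missing operand, which is why the inserted document carries null rather than 14 and 10.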

This is also inconsistent with other whenMatched modes. E.g., with 'whenMatched: replace, whenNotMatched: insert', we'd insert the document { "_id" : "2019-05", "thumbsup" : 14, "thumbsdown" : 10 }.

It may also be confusing that we're executing a pipeline defined in the whenMatched branch, when we fall under the whenNotMatched branch.

We should consider different options to see whether the user experience can be improved. This could be as simple as updating our documentation to clearly describe the existing behaviour, or it could mean changing the semantics of pipeline-style updates with $merge (for example, by inserting the original document accessed via $$new when there is no match).

Scope of changes

Impact to Other Docs

MVP (Work and Date)

Resources (Scope or Design Docs, Invision, etc.)



 Comments   
Comment by Githook User [ 11/Mar/20 ]

Author:

{'name': 'jeff-allen-mongo', 'username': 'jeff-allen-mongo', 'email': 'jeffrey.allen@10gen.com'}

Message: (DOCS-13226): merge behavior changes for 4.2
Branch: v4.2
https://github.com/mongodb/docs/commit/7d5ce3690d201c0f123467eba67c2a0438bd70f0

Comment by Githook User [ 11/Mar/20 ]

Author:

{'name': 'jeff-allen-mongo', 'username': 'jeff-allen-mongo', 'email': 'jeffrey.allen@10gen.com'}

Message: (DOCS-13226): merge behavior changes for 4.2
Branch: master
https://github.com/mongodb/docs/commit/f7e4d27cc38ba189857374bcbf9219616ca18efa

Generated at Thu Feb 08 08:07:14 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.