[SERVER-42137] Allow aggregation $merge stage to write to a collection that the query also reads from Created: 10/Jul/19 Updated: 29/Oct/23 Resolved: 20/Nov/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.2.0-rc2 |
| Fix Version/s: | 4.3.2 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Clare Scally | Assignee: | Mihai Andrei |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Minor Change |
| Sprint: | Query 2019-11-18, Query 2019-12-02 |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
Enable functionality to permit $out to merge/append to the aggregation source collection. |
| Comments |
| Comment by Githook User [ 20/Nov/19 ] |
|
Author: {'name': 'Mihai Andrei', 'email': 'mihai.andrei@mongodb.com'} Message: |
| Comment by Guy Harrison [ 16/Jul/19 ] |
|
Yes, we duplicate multiple documents, but only for a specific key value. The prototype pipeline we used in 4.1 went like this:
4.1 allowed this behavior with mode:'insertDocuments'. |
| Comment by Asya Kamsky [ 15/Jul/19 ] |
|
guy@southbanksoftware.com I'm trying to understand the use case. You say you are writing the modified documents into the same collection, yet you're using insert mode. When you say you are cloning the document, does this mean that when you start with ten documents, you expect to end up with twenty? How do you generate a new unique "merge on" field? |
| Comment by Guy Harrison [ 14/Jul/19 ] |
|
We have a use case in which we are cloning a set of documents into the same collection (with some modifications). We were enthusiastically awaiting the enhancements to $out that appeared in 4.1, which allowed us to do this cloning on the server side without a network round trip. It was disappointing to see, when 4.2 arrived, that the capability which existed in 4.1 had been removed.

I do understand that in MongoDB there is a chance of an infinite loop if the $merge stage generates documents that might be read by the $match clause, if there is one, or, even worse, if there is no $match at all. In our use case, the $match clause specifically selects documents that cannot match those output by the pipeline: a $project and an $addFields stage modify the output to prevent that from occurring.

The functionality is essentially INSERT INTO t1 SELECT * FROM t1. In a relational database, a query snapshot would prevent the SELECT from reading anything created by the INSERT. So, we would really like to have this capability.

Given what I understand of MongoDB architecture, it would seem that the easiest way to do this would be to write the output to a temporary collection until all earlier stages of the pipeline are complete, and then append that output to the source collection.
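
The buffering idea described above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions, not MongoDB's implementation or driver API: the `source` array stands in for a collection, and `cloneInto`, `match`, and `transform` are hypothetical names invented for this example. The point it shows is that the read phase finishes into a temporary buffer before anything is appended, so the "pipeline" never sees its own output.

```javascript
// In-memory stand-in for a collection (assumption: not a real MongoDB collection).
const source = [
  { _id: 1, tenant: "a", value: 10 },
  { _id: 2, tenant: "a", value: 20 },
  { _id: 3, tenant: "b", value: 30 },
];

// Hypothetical helper: clone matching documents back into the same collection.
function cloneInto(collection, match, transform) {
  // Phase 1: run the whole read side into a temporary buffer.
  // Nothing is written to `collection` while it is being scanned.
  const temp = [];
  for (const doc of collection) {
    if (match(doc)) {
      temp.push(transform(doc));
    }
  }

  // Phase 2: the scan is complete, so appending cannot feed the scan.
  for (const doc of temp) {
    collection.push(doc);
  }
  return temp.length;
}

// Illustrative id generator so that cloned documents get fresh keys.
let nextId = 100;

const inserted = cloneInto(
  source,
  (doc) => doc.tenant === "a",
  (doc) => ({ ...doc, _id: nextId++, tenant: "a-copy" })
);

console.log(inserted, source.length);
```

Pushing directly into `collection` inside the first loop would let the scan reach the newly appended documents, which is the looping hazard mentioned above whenever the match predicate also accepts the output. Buffering first gives the same effect as the relational query snapshot for this simple case.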
|