[SERVER-30320] Feature Request: Merge Created: 25/Jul/17 Updated: 29/Jan/18 Resolved: 26/Jul/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Joshua Austill | Assignee: | Asya Kamsky |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Participants: | |||||||||
| Description |
|
If this feature request already exists I couldn't find it, so just point me to that so I can follow it I had to switch from mongodb to mysql at work this week because I couldn't find a way to merge collections. If this exists, I sure couldn't find it. Basically, the request is to be able to merge one collection into another. With logic for matching documents that looks like $lookup, and different updates for on match, on missing on source, and on missing on target. I imagine the syntax looking something like this
The first paramenter being the $lookup criteria that must result in a 1:1 match or no match. The second being the update if matched. The third being the update if the document is missing on the source, in this case delete it on the target. And lastly the not matched on the target specifies an insert. I'm keeping this short in case this request already exists, if it doesn't I can certainly flush it out |
| Comments |
| Comment by Joshua Austill [ 31/Jul/17 ] | |||||||||||||||||||
|
Hey Asya, After reviewing this I believe you are correct, Also, after going through this exercise with you, it hit me that my biggest challenge is that I was approaching the issue wrong. I came from a very long RDBMS background so I was thinking "merge" and not "aggregate". I think my recommendation would be to add documentation that specifically addresses this topic so other people like myself find it right away. Here's the documentation that people like me would want to be able to find. https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql It would be neat to see a side by side SQL merge looks like this in MongoDB doc page just like you already have for basic crud operations like find, update, delete, etc. Lastly, for documentation sake, the reason for the delete is pretty straight forward. In this case, my MongoDB collection wouldn't have been the SOR. The SOR would have been a VERY old core retail management system that we have read only access to. So whenever they deleted an SKU in that old system, we'd need to remove it from the MongoDB collection. However, the only way to know it was deleted would be to aggregate both sides and see what was missing in this case because the core system provides no updates, they just go missing. Thanks so much and have a great week! | |||||||||||||||||||
| Comment by Asya Kamsky [ 26/Jul/17 ] | |||||||||||||||||||
|
Thanks for clarification. I'm going to mark this ticket as a duplicate of I'm not sure I understand the purpose of removing the record that doesn't have a match, but if you want those semantics you can already achieve that today:
$mergeObjects is new in 3.5 but you can do the same thing in versions 3.4.4 and later using $objectToArray, $concatArrays, and $arrayToObject expressions.
Thanks for your suggestion and please watch and upvote Asya Kamsky | |||||||||||||||||||
| Comment by Joshua Austill [ 25/Jul/17 ] | |||||||||||||||||||
|
Absolutely! In this example, inventory would be the target, and inventory_temp would be the source. Let's say the inventory collection had two documents like { sku: 1234, name: "phone", description: "a cell phone", status: "active" } { sku: 2345, name: "case", status: "active" }Then the inventory_temp table had two documents like { sku: 1234, name: "phone", status: "removed" } { sku: 3456, name: "case", status: "active" }So after the $lookup sku 1234 would match 1234, but 2345 wouldn't match from the source, and 3456 wouldn't match from the target. The 3 statements would say first update 1234 in the target since it matched the source. Then delete 2346 since it didn't match in the source, and insert 3456 because it didn't match on the target. When you were done the inventory collection would look like this { sku: 1234, name: "phone", description: "a cell phone", status: "removed" } { sku: 3456, name: "case", status: "active" }Merging from another source the delete may have been BAD, so $unset: false would be an ideal default for this. The use cases for this are pretty straight forward. In my case this week, I had millions and millions of documents to update and using a bulk operation I ran out of memory on my client. So I either had to break it up into smaller chunks, or bulk load the source data to a temp table and then update the target from that source. In this case I can do this in mysql in about 10 minutes and my boss knows it, so he made me switch instead of spending the time writing code to break this up into smaller chunks | |||||||||||||||||||
| Comment by Asya Kamsky [ 25/Jul/17 ] | |||||||||||||||||||
|
jlaustill@gmail.com you have multiple collections mentioned: "inventory", "inventory_temp" - what are you calling your source and what's your target? Can you give a small example with data? |