[SERVER-745] Embedded Document Expansion (Pseudo-JOINs) Created: 14/Mar/10 Updated: 17/Jul/15 Resolved: 24/Jun/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Usability |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Michael Bleigh | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 18 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Participants: |
| Description |
|
Taking a bit of inspiration from CouchDB (http://blog.couch.io/post/446015664/whats-new-in-apache-couchdb-0-11-part-two-views) it would be nice for MongoDB to support JOIN-esque operations by using a special "document expander" query command.

It is common practice in MongoDB to store partial documents as references, including the DocID and some other attributes useful for displaying that document in the context of the parent document. For instance if I have a document with a "users" key it might contain several users that look like:

    { "_id": DocID("ab2372a4a24"), "name": "User Name", "photo": "http://profile.photo.url" }

Presumably in these cases the embedded document could be considered a "fragment" of the document referenced in the ID. It would, therefore, be extremely useful to be able to automatically "expand" that fragment into the full document when necessary via the query syntax.

Let's say that a "posts" collection has documents with a "users" key containing many fragments as described above. If I perform a query that looks like this:

    db.posts.find().join({"users": "users"})

Then MongoDB would automatically replace each of the embedded documents in the "users" key with the full document specified in the "users" collection (the document's key being the key in the join argument, the collection being the value). Additionally, passing a "1" value instead would denote that the values contained in the _id field of the given key are DB References that should be dereferenced individually.

Note that this should work in the following cases:
|
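To make the requested semantics concrete, here is a minimal client-side sketch of what the proposed join({"users": "users"}) would do. It is an illustration only: expandFragments is a hypothetical helper, not a MongoDB API, collections are modelled as plain in-memory arrays, and _id values are assumed to be plain strings rather than real object IDs. The "1" (DBRef) form of the argument is not covered.

```javascript
// Hypothetical emulation of the proposed join(); not part of any MongoDB driver.
// `db` maps collection names to arrays of documents (an assumption for this sketch).
function expandFragments(docs, db, spec) {
  // Build one _id index per joined key so each fragment costs a single lookup.
  const indexes = {};
  for (const [key, collectionName] of Object.entries(spec)) {
    indexes[key] = new Map(db[collectionName].map((d) => [d._id, d]));
  }

  return docs.map((doc) => {
    const expanded = { ...doc }; // shallow copy; the input document is not mutated
    for (const key of Object.keys(spec)) {
      if (!Array.isArray(doc[key])) continue; // nothing to expand under this key
      // Replace each fragment with the full document, keeping the stored
      // fragment as a fallback when the referenced document is missing.
      expanded[key] = doc[key].map((frag) => indexes[key].get(frag._id) ?? frag);
    }
    return expanded;
  });
}

// Sample data (made up for illustration).
const db = {
  users: [
    {
      _id: "ab2372a4a24",
      name: "User Name",
      photo: "http://profile.photo.url",
      email: "user@example.com",
    },
  ],
  posts: [
    {
      _id: "p1",
      title: "Hello",
      users: [
        { _id: "ab2372a4a24", name: "User Name" },
        { _id: "missing", name: "Gone" },
      ],
    },
  ],
};

// Key "users" in each post is expanded from the "users" collection.
const joined = expandFragments(db.posts, db, { users: "users" });
console.log(JSON.stringify(joined, null, 2));
```

Doing this on the client takes at least one extra query per joined collection, which is the round-trip cost the comments below discuss. A server-side version would avoid that.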
| Comments |
| Comment by Ian Whalen (Inactive) [ 24/Jun/15 ] |
|
Hi Michael, thanks a lot for filing this feature request and my apologies for the time since it was last updated. I'm going to close this as a Duplicate and link it to our upcoming $lookup feature - after careful consideration we’ve decided to provide users with the desired functionality via our aggregation pipeline instead of our regular query system. Please follow along in |
| Comment by Tom Wardrop [ 11/Jun/11 ] |
|
@jp "The point where this might really shine, and I have yet to have a full grasp of the internals so not sure of the possibility, would be if the references were some day in the future resolved on the server side. If they were more sophisticated, not just data used to query but rather pointers to addresses of the actual data, then I could see a huge boost here!"

I believe this is what the reporter was referring to, where MongoDB handles the resolution of DBRefs server-side. I like the idea a lot myself. It's only a matter of time until MongoDB gets some kind of server-side document joining capability; the only question is which method or methods will be used to achieve these joins. This idea is good because it proposes that DBRefs be optionally resolved on the server, while at the same time providing a mechanism to reduce the frequency of the joins by allowing partials to be stored alongside the DBRefs (kind of as a fallback). I'd like to see this idea merged with As long as the performance is decent, I think the merger of those ideas would satisfy the majority of use cases for joins in Mongo. |
| Comment by jp [ 07/May/11 ] |
|
I do see this as a potential nicety, but scary when you consider storing your references in an array and having to retrieve them all. If that bug ( I do say nicety because it makes life easier in some cases, while mainly you can deal with the subsequent query on the client side just as easily. The same index search needs to take place, and the same amount of data crosses the network.

PRO: From a performance standpoint it could reduce the number of round trips, which is always good when you are squeezing out the most you can. The point where this might really shine, and I have yet to have a full grasp of the internals so not sure of the possibility, would be if the references were some day in the future resolved on the server side. If they were more sophisticated, not just data used to query but rather pointers to addresses of the actual data, then I could see a huge boost here! |