[SERVER-745] Embedded Document Expansion (Pseudo-JOINs) Created: 14/Mar/10  Updated: 17/Jul/15  Resolved: 24/Jun/15

Status: Closed
Project: Core Server
Component/s: Usability
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Major - P3
Reporter: Michael Bleigh
Assignee: Unassigned
Resolution: Duplicate
Votes: 18
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-19095 $lookup Closed
is duplicated by SERVER-19476 Referenced Values Closed
Related
related to SERVER-1632 improve find with using DBRef to supp... Closed
is related to SERVER-432 server side expansion of DBRefs Closed
Participants:

 Description   

Taking a bit of inspiration from CouchDB (http://blog.couch.io/post/446015664/whats-new-in-apache-couchdb-0-11-part-two-views), it would be nice for MongoDB to support JOIN-esque operations by using a special "document expander" query command. It is common practice in MongoDB to store partial documents as references, including the DocID and some other attributes useful for displaying that document in the context of the parent document. For instance, if I have a document with a "users" key, it might contain several users that look like:

{ "_id":DocID("ab2372a4a24"), "name":"User Name", "photo":"http://profile.photo.url"}

Presumably in these cases the embedded document could be considered a "fragment" of the document referenced in the ID. It would, therefore, be extremely useful to be able to automatically "expand" that fragment into the full document when necessary via the query syntax. Let's say that a "posts" collection has documents with a "users" key containing many fragments as described above. If I perform a query that looks like this:

db.posts.find().join({"users":"users"})

Then MongoDB would automatically replace each of the embedded documents in the "users" key with the full document specified in the "users" collection (the document's key being the key in the join argument, the collection being the value). Additionally, passing a "1" value instead would denote that the values contained in the _id field of the given key are DB References that should be dereferenced individually.

Note that this should work in the following cases:

  • Specified key is an array of strings (assume each string represents a DocID) or DocIDs
  • Specified key is an array of embedded documents, each with an _id field
  • Specified key is a single string (assume string is a DocID) or DocID
  • Specified key is a DBRef
  • Specified key is an array of embedded documents, each with an _id field that contains a DBRef
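The cases above can be sketched as a client-side emulation in plain JavaScript. This is only an illustration of the proposed semantics, not an existing MongoDB API: the in-memory `db` object and the helpers `findById`, `resolveOne`, and `expand` are hypothetical names, DocIDs are modeled as plain strings, and a DBRef is modeled with the usual `{ $ref, $id }` convention.

```javascript
// Hypothetical in-memory stand-in for the database: collection name -> documents.
const db = {
  users: [
    { _id: "ab2372a4a24", name: "User Name", photo: "http://profile.photo.url", bio: "Full profile" },
    { _id: "cd9981b7f10", name: "Other User", photo: "http://other.photo.url", bio: "Another profile" },
  ],
};

const isObject = (v) => v !== null && typeof v === "object";
// A DBRef is modeled as { $ref: <collection>, $id: <id> }.
const isDBRef = (v) => isObject(v) && "$ref" in v && "$id" in v;
// A fragment is an embedded document that carries its reference in _id.
const isFragment = (v) => isObject(v) && !isDBRef(v) && "_id" in v;

// Look up one document by _id; returns null when nothing matches.
function findById(collection, id) {
  return (db[collection] || []).find((d) => d._id === id) || null;
}

// Resolve a single value (DocID, fragment, or DBRef) to the full document.
// `target` is a collection name, or 1 to dereference DBRefs individually.
function resolveOne(value, target) {
  const ref = isFragment(value) ? value._id : value;
  let full = null;
  if (isDBRef(ref)) {
    full = findById(ref.$ref, ref.$id);
  } else if (target !== 1) {
    full = findById(target, ref);
  }
  return full || value; // dangling reference: keep the original value
}

// Expand every key named in `spec`, handling both arrays and single values.
function expand(doc, spec) {
  const out = { ...doc };
  for (const [key, target] of Object.entries(spec)) {
    if (!(key in out)) continue;
    const value = out[key];
    out[key] = Array.isArray(value)
      ? value.map((v) => resolveOne(v, target))
      : resolveOne(value, target);
  }
  return out;
}

// Usage: one array mixing a fragment and a bare DocID, plus a single DBRef.
const post = {
  _id: "p1",
  users: [{ _id: "ab2372a4a24", name: "User Name" }, "cd9981b7f10"],
  author: { $ref: "users", $id: "ab2372a4a24" },
};
const expanded = expand(post, { users: "users", author: 1 });
```

In this sketch a reference that cannot be resolved is left untouched, so the stored fragment acts as a fallback. A server-side implementation might choose differently.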


 Comments   
Comment by Ian Whalen (Inactive) [ 24/Jun/15 ]

Hi Michael, thanks a lot for filing this feature request and my apologies for the time since it was last updated.

I'm going to close this as a Duplicate and link it to our upcoming $lookup feature - after careful consideration we’ve decided to provide users with the desired functionality via our aggregation pipeline instead of our regular query system. Please follow along in SERVER-19095 for further details of the $lookup implementation.
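For readers following the link, here is a rough sketch of how the "posts"/"users" example from the description might be written as a $lookup aggregation stage. The collection and field names are assumptions carried over from the description, and `users_full` is a hypothetical output field. Depending on the server version, an array-valued `localField` may need a preceding $unwind stage.

```javascript
// Assumed names, following the "posts"/"users" example in the description.
const pipeline = [
  {
    $lookup: {
      from: "users",           // collection holding the full documents
      localField: "users._id", // _id values of the embedded fragments
      foreignField: "_id",     // field matched in the "users" collection
      as: "users_full",        // hypothetical output field for the joined documents
    },
  },
];
// In the mongo shell this would be run as: db.posts.aggregate(pipeline)
```

Unlike the join() proposal, this adds the full documents in a separate array field rather than replacing the fragments in place.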

Comment by Tom Wardrop [ 11/Jun/11 ]

@jp "The point where this might really shine, and I have yet to have a full grasp of the internals so not sure of the possibility, would be if the references were some day in the future resolved on the server side. If they were more sophisticated and not just data used to query, but rather pointers to addresses of the actual data, then I could see a huge boost here!"

I believe this is what the reporter was referring to, where MongoDB handles the resolution of DBRefs server-side. I like the idea a lot myself. It's only a matter of time until MongoDB gets some kind of server-side document joining capability; the only question is which method or methods will be used to achieve these joins.

This idea is good because it proposes that DBRefs be optionally resolved on the server, while at the same time providing a mechanism to reduce the frequency of the joins by allowing partials to be stored alongside the DBRefs (kind of as a fallback). I'd like to see this idea merged with SERVER-1632 to allow querying on DBRefs as well.

As long as the performance is decent, I think the merger of those ideas would satisfy the majority of use cases for joins in Mongo.

Comment by jp [ 07/May/11 ]

I do see this as a potential nicety, but it is scary when you consider storing your references in an array and having to retrieve them all. If those issues (SERVER-142 and SERVER-1797) were resolved, it would be better controlled.

I do say nicety because it makes life easier in some cases, but in most cases you can deal with the subsequent query on the client side just as easily. The same index search still needs to take place, and the same amount of data still has to cross the network.

PRO: From a performance standpoint, it could reduce the number of round trips, which is always good when you are squeezing out as much as you can.

The point where this might really shine, and I have yet to have a full grasp of the internals so not sure of the possibility, would be if the references were some day in the future resolved on the server side. If they were more sophisticated and not just data used to query, but rather pointers to addresses of the actual data, then I could see a huge boost here!

Generated at Thu Feb 08 02:55:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.