[SERVER-28717] $lookup does not expand 'localField' as expected when it is a dotted path representing fields within an array Created: 10/Apr/17  Updated: 29/Jul/19  Resolved: 19/May/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: 3.4.6, 3.5.8

Type: Bug Priority: Major - P3
Reporter: Asya Kamsky Assignee: James Wahlin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Duplicate
is duplicated by SERVER-28896 $lookup should work right with array ... Closed
Related
is related to SERVER-42306 $lookup with an array of sub objects ... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.4
Sprint: Query 2017-05-29
Participants:

 Description   

db.a.find()
{ "_id" : ObjectId("58ebbf82016ce019673b111c"), "a" : [ { "_id" : 1 }, { "_id" : 2 } ] }
{ "_id" : ObjectId("58ebbf8d016ce019673b111d"), "a" : [ { "_id" : 5 }, { "_id" : 12 } ] }
> db.b.find()
{ "_id" : 1 }
{ "_id" : 2 }
{ "_id" : 3 }
{ "_id" : 5 }
{ "_id" : 12 }

I would expect $lookup using "a._id" as localField to work since SERVER-22881 was fixed, however:

db.a.aggregate({$lookup:{from:"b", localField:"a._id", foreignField:"_id", as:"details"}})
{ "_id" : ObjectId("58ebbf82016ce019673b111c"), "a" : [ { "_id" : 1 }, { "_id" : 2 } ], "details" : [ ] }
{ "_id" : ObjectId("58ebbf8d016ce019673b111d"), "a" : [ { "_id" : 5 }, { "_id" : 12 } ], "details" : [ ] }

"details" is properly populated only if I unwind "a" and use "a._id" or if I create another field first in $addFields or $project from "$a._id" (which makes it a simple array) then it works as localField:

db.a.aggregate({$addFields:{a:"$a._id"}},{$lookup:{from:"b", localField:"a", foreignField:"_id", as:"details"}})
{ "_id" : ObjectId("58ebbf82016ce019673b111c"), "a" : [ 1, 2 ], "details" : [ { "_id" : 1 }, { "_id" : 2 } ] }
{ "_id" : ObjectId("58ebbf8d016ce019673b111d"), "a" : [ 5, 12 ], "details" : [ { "_id" : 5 }, { "_id" : 12 } ] }



 Comments   
Comment by David Storch [ 14/Jul/17 ]

Hi neillunn, yep, I totally agree that this would be a good enhancement to the query language, and that there is currently no easy way to express this kind of join operation. I think it's unlikely that we would address your request by altering the semantics of the $lookup stage, though admittedly we have yet to take this through the design approval process so I don't want to try and predict the outcome. By best guess is that instead we would introduce either a new $lookup option or a new aggregation stage which works as you expect with arrays.

So new issue or not, there should at least be a solid note on "this issue" that it does not in fact address the common case of usage, but rather a somewhat "contrived" case that in practicality is not really used by anyone, nor asked for as a solution.

Do you think that there are any extensions or clarifications we could make to the $lookup documentation in order to help clarify the current behavior?

Best,
Dave

Comment by Neil Lunn [ 13/Jul/17 ]

I'm happy to file as a new issue and nearly almost did just that with the basic content here. I raised it on this issue because on three occasions now, MongoDB staff have pointed to this issue "incorrectly" in reference to the behavior I point out in the example as the intended pattern by questions raised on StackOverflow.

So aside from "incorrectly" stating that this issue close "resolves" that behavior, the actual example on this issue seems rather contrived, and not really reflective of what the actual intended usage is with objects in an array, which is of course to "merge" rather than produce a "new" array of output for the joined property.

Of course the "patch" could be used with other operations to achieve such a "merge" ( $map and $indexOfArray come to mind ), but I think the general community expectation is a "merge", just like what would happen with the $unwind form as is demonstrated.

So new issue or not, there should at least be a solid note on "this issue" that it does not in fact address the common case of usage, but rather a somewhat "contrived" case that in practicality is not really used by anyone, nor asked for as a solution.

Neil

Comment by David Storch [ 10/Jul/17 ]

Hi neillunn, your example is working as currently designed, but you make a very interesting point about missing functionality. It would be nice to have a way to easily express a join operation where each subdocument in an array is joined with the matching documents in a foreign collection. I think this should be tracked as a feature request in a new SERVER ticket. Would you like to file one? I'd also be happy to file it on your behalf.

Best,
Dave

Comment by Neil Lunn [ 10/Jul/17 ]

I cannot see how this is actually fully resolved since there are still obvious problems if the output path is given for an existing path within the source array.

db.a.insert([
  { "a": [{ "_id": 1, "prop": 2 },{ "_id": 2, "prop": 3 }] }
])
  
db.b.insert([
   { "_id": 1 },
   { "_id": 2 },
   { "_id": 3 }
])

And then using $lookup with the same "localField" and "as" output paths:

 
db.a.aggregate([
  { "$lookup": {
    "from": "b",
    "localField": "a._id",
    "foreignField": "_id",
    "as": "a._id"  
  }}
])

The output is :

{
    "_id" : ObjectId("5962dd75f5f11516540d118d"),
    "a" : {
        "_id" : [ 
            {
                "_id" : 1.0
            }, 
            {
                "_id" : 2.0
            }
        ]
    }
}

As opposed to treating with $unwind and rebuilding the array with $group:

db.a.aggregate([
  { "$unwind": "$a" },
  { "$lookup": {
    "from": "b",
    "localField": "a._id",
    "foreignField": "_id",
    "as": "a._id"  
  }},
  { "$unwind": "$a._id" },
  { "$group": {
    "_id": "$_id",
    "a": { "$push": "$a" }  
  }}
])

Which seems more representative of the expected "join".

{
    "_id" : ObjectId("5962dd75f5f11516540d118d"),
    "a" : [ 
        {
            "_id" : {
                "_id" : 1.0
            },
            "prop" : 2.0
        }, 
        {
            "_id" : {
                "_id" : 2.0
            },
            "prop" : 3.0
        }
    ]
}

It's better than what it was before, which was no output. But It does not really meet the common case of "joining with the content existing in an array of objects".

New issue perhaps?

Comment by Githook User [ 14/Jun/17 ]

Author:

{u'username': u'jameswahlin', u'name': u'James Wahlin', u'email': u'james.wahlin@10gen.com'}

Message: SERVER-28717 Expand $lookup localField when a sub-obj field in an array
Includes:

Comment by Githook User [ 19/May/17 ]

Author:

{u'username': u'jameswahlin', u'name': u'James Wahlin', u'email': u'james.wahlin@10gen.com'}

Message: SERVER-28717 Expand $lookup localField when a sub-obj field in an array
Branch: master
https://github.com/mongodb/mongo/commit/e56f02ec6de6def5e66d0744a620515e2a82708e

Generated at Thu Feb 08 04:18:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.