[SERVER-43943] $lookup with array local field misbehaves in pipeline Created: 10/Oct/19 Updated: 27/Oct/23 Resolved: 22/Oct/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 4.0.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adam Rackis | Assignee: | David Storch |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: |
Then observe that
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | Query 2019-11-04 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Participants: |
| Comments |
| Comment by Adam Rackis [ 23/Oct/19 ] | ||||||||||
|
Wow - that's a stupendous explanation. Thank you so much! | ||||||||||
| Comment by David Storch [ 23/Oct/19 ] | ||||||||||
|
adamrackis@hotmail.com, there are two different $in expressions which appear in different context in the grammar of the MongoDB Query Language. Although these two expressions have the same spelling, they are conceptually distinct, and are not expected to have the same syntax or semantics. Your example above uses the $in match expression: see https://docs.mongodb.com/manual/reference/operator/query/in/. It's syntax is
Its meaning is that the document matches if "field" is equal to val_1 OR val_2 OR ... OR val_n. The notion of equality here involves the expansion of arrays. So if field contains an array with any of the values 1 through n, then the document also matches. In contrast, the query given in "Steps to Reproduce" uses an $in inside the $expr construct, which means that you're getting the $in agg expression: see https://docs.mongodb.com/manual/reference/operator/aggregation/in/. The syntax and semantics of the $in agg expression are slightly different than that of the match expression. Syntactically, it looks like this:
Here, <exp2> must evaluate to an array. The expression returns a boolean which indicates whether the value returned by <exp1> is an element of the array returned by <exp2>. For example,
You'll notice that in this context, there is no implicit expansion of arrays – implicit array traversal behavior typically applies only to match expressions, not to agg expressions. Does that help to clarify? | ||||||||||
| Comment by Adam Rackis [ 23/Oct/19 ] | ||||||||||
|
@asya - thanks a ton for the info. The thing I'm not quite understanding, though, is why this query
returns this document
It seems when $in is used in this way, the array of values we're checking for is flattened out, and each checked individually. But when I use $in inside of $expr, like in the original code, I see the different behavior you describe. Could I trouble you to help me understand what exactly the difference is?
| ||||||||||
| Comment by David Storch [ 22/Oct/19 ] | ||||||||||
|
adamrackis@hotmail.com, per Asya's comment above, I am closing this ticket as "Works as Designed". Let me know if you have any further questions! | ||||||||||
| Comment by Asya Kamsky [ 21/Oct/19 ] | ||||||||||
|
> It seems arrays are not represented correctly in the pipeline stage of a $lookup adamrackis@hotmail.com, the pipeline is working correctly in this case. Here is what happens: Inside $expr the aggregation semantics apply and the $in expression checks if the first argument ("$$name" which is an array in your first case) is present as an element in the second argument (an array). The expression correctly evaluates to false and therefore you get an empty join set. If you wanted to check if the two arrays have any common elements in aggregation, you can use $setIntersection and compare the result to empty array:
If you were translating these examples from simple $lookup with localField/foreignField, note that regular find/$match semantics are used to compare local field and foreign field, which means local field is either equal to foreign field or is equal to an array element of foreign field. In addition, in this case localField as an array is treated as separate elements for the purpose of checking this equality, rather than as an array literal, which may be adding to some confusion here. | ||||||||||
| Comment by Adam Rackis [ 11/Oct/19 ] | ||||||||||
|
Awesome, thanks! And thanks for cleaning up the formatting on my ticket. Sorry about that! | ||||||||||
| Comment by Carl Champain (Inactive) [ 11/Oct/19 ] | ||||||||||
|
Thanks for the report. Kind regards, | ||||||||||
| Comment by Adam Rackis [ 10/Oct/19 ] | ||||||||||
|
Apologies for the crappy formatting. I can't seem to find an edit button to try to fix :-| |