[SERVER-76470] Classic $lookup produces incorrect results when localField has 0-prefixed numeric component Created: 24/Apr/23 Updated: 29/Nov/23 Resolved: 01/May/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matt Boros | Assignee: | Matt Boros |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Query Execution
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: |
v7.0, v6.3, v6.0, v5.0, v4.4
|
||||||||
| Sprint: | QO 2023-05-15 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 105 | ||||||||
| Description |
|
Repro:
This is likely because SBE treats localField as a regular $match ($match treats "00"-type fields as field names, not indexes). Classic $lookup treats "00"-type fields as both an index and a field name. I believe this is unintentional, and likely should be fixed. |
| Comments |
| Comment by Matt Boros [ 01/May/23 ] | ||||||||||||||||||
|
This will need to be backported to 4.4: https://github.com/10gen/mongo/blob/v4.4/src/mongo/db/pipeline/document_path_support.cpp#L83 | ||||||||||||||||||
| Comment by Githook User [ 01/May/23 ] | ||||||||||||||||||
|
Author: {'name': 'Matt Boros', 'email': 'matt.boros@mongodb.com', 'username': 'mattBoros'}Message: | ||||||||||||||||||
| Comment by Matt Boros [ 25/Apr/23 ] | ||||||||||||||||||
|
Sounds good, that makes sense. This would also be an issue for $graphLookup connectTo and connectFrom fields, since they also use the document_path_support::visitAllValuesAtPath helper. kateryna.kamenieva@mongodb.com what do you think of this solution? I believe this conflicts with our previous conclusion not to change the classic behavior. | ||||||||||||||||||
| Comment by Bernard Gorman [ 25/Apr/23 ] | ||||||||||||||||||
|
matt.boros@mongodb.com: yes, we should fix the behaviour in Classic - specifically, in the document_path_support helper linked above. The ticket should be re-titled something like "Classic $lookup incorrectly interprets numeric field names with leading zeroes as array indices." I presume we'll have to backport this to previous versions as well. Looks like this bug may have been around for some time. | ||||||||||||||||||
| Comment by Matt Boros [ 25/Apr/23 ] | ||||||||||||||||||
|
Bernard, do you suggest changing the classic behavior to match SBE then? If so, I can take this ticket. I have a lot of context on it already (and a potential solution), as well as tests written for this behavior. | ||||||||||||||||||
| Comment by Bernard Gorman [ 25/Apr/23 ] | ||||||||||||||||||
|
I believe this is incorrect behaviour in Classic; the SBE behaviour is correct. A numeric path which is prefixed by a leading 0 is never considered an array index, always a field name:
We have specific code in FieldRef, for instance, to distinguish a numeric path which is 0-prefixed from one which isn’t. But it appears that the localField path extraction in DocumentSourceLookup is disobeying these rules:
... which parses the string "00" into a size_t array index without checking for leading zeros. So this appears to be a bug in the document_path_support library. | ||||||||||||||||||
| Comment by Rushan Chen [ 25/Apr/23 ] | ||||||||||||||||||
|
We are unable to ascertain whether these are valid usecases, or accidental ones: numeric field path or array index. Inclined not to add "consistency" over accidental features if this is the case here. xiaochen.wu@mongodb.com kateryna.kamenieva@mongodb.com please chime in. | ||||||||||||||||||
| Comment by Matt Boros [ 24/Apr/23 ] | ||||||||||||||||||
|
Although these behaviors are different and could affect some customer workloads, this is a good opportunity to make our language more consistent with itself. The $lookup localField-foreignField join should behave like a $match between the two fields, but in classic it doesn't act this way. Instead of making SBE code more confusing and making the language more confusing for customers and future engineers who have to understand the behaviors, we can choose to accept this, or resolve it some other way. |