[SERVER-42738] Slow $lookup on $expr match with null field Created: 09/Aug/19 Updated: 06/Dec/22 Resolved: 19/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Index Maintenance |
| Affects Version/s: | 4.0.12 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ben Rotz | Assignee: | Backlog - Query Team (Inactive) |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query
|
||||||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||||||
| Steps To Reproduce: | Since the problem is with $lookup, and it is difficult (impossible?) to see the performance of $lookup in explain, we need to create enough records in a collection to see the problem firsthand:
Now, let's create a primary collection, where a few records have a reference:
Now, let's run a $lookup:
This lookup appears to run just fine. By way of information, my understanding is that the lookup happening here is the equivalent of db.human.find({indexField: 'name1'});
Now, let's run a $lookup using pipeline:
This lookup runs slow. If you were to put a $match in the first stage and filter records by {refId: {$exists: true}}, it would run fast. Which leads me to believe the problem is with the records for which refId does not exist. By way of information, my understanding is that the lookup happening here is the equivalent of db.human.find({$expr: {$eq: ['$indexField', 'name1']}});
Interestingly enough, the query
Seems to run just fine, so I'm not sure what's happening in the $lookup to make it run so slowly. Slowly here is obviously relative. I'm on a 2017 high-end macbook pro, and slowly means about 1.5 seconds. |
||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||
| Description |
|
When doing a $lookup, if the primary collection has null/non-existing values for localField, and is using a pipeline, the $lookup appears to be doing full table scans for the records with null values.
|
| Comments |
| Comment by Craig Homa [ 19/Aug/19 ] |
|
Resolving this as a duplicate of SERVER-40362. |
| Comment by Danny Hatcher (Inactive) [ 16/Aug/19 ] |
|
Thanks for your report. I'll pass this on to the Query team to determine whether this is expected or not. |