[SERVER-81477] $lookup against sparse index Created: 26/Sep/23  Updated: 08/Nov/23  Resolved: 08/Nov/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ben Rotz
Assignee: Alison Rhea Thorne
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

In MongoDB 7.0, I started seeing extraordinarily long query times for some aggregate queries that use $lookup. Specifically, this happens when the $lookup matches on a field in the foreign collection that has a sparse index:

db.student.createIndex(
  {'teacher_id': 1},
  {partialFilterExpression: {'teacher_id': {$exists: true}}}
);

db.teacher.aggregate([{
  $lookup: {
    from: 'student',
    let: {teacherId: '$_id'},
    pipeline: [{'$match': {'$expr': {$eq: ['$teacher_id', '$$teacherId']}}}],
    as: 'students'
  }
}]);

This is somewhat related to, but different from the following:
https://www.mongodb.com/community/forums/t/conditional-lookup/3833/10
https://jira.mongodb.org/browse/SERVER-44749
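
One way to check whether the index is actually used by the $lookup sub-pipeline is to compare explain output for the pipeline above. The sketch below is not part of the original report. It rebuilds the reported stage as a plain object and adds a hypothetical comparison stage that uses localField/foreignField. Note that the two forms are not strictly equivalent (for example, they treat array-valued fields differently), so the comparison is only a diagnostic aid. The helper names exprLookupStage and fieldLookupStage are made up for illustration.

```javascript
// The $lookup stage from the report, rebuilt as a plain object.
function exprLookupStage() {
  return {
    $lookup: {
      from: 'student',
      let: { teacherId: '$_id' },
      pipeline: [
        { $match: { $expr: { $eq: ['$teacher_id', '$$teacherId'] } } },
      ],
      as: 'students',
    },
  };
}

// Hypothetical comparison form (an assumption, not from the report):
// the same join expressed with localField/foreignField.
function fieldLookupStage() {
  return {
    $lookup: {
      from: 'student',
      localField: '_id',
      foreignField: 'teacher_id',
      as: 'students',
    },
  };
}

// Only runs inside mongosh, where a global `db` exists.
if (typeof db !== 'undefined') {
  for (const stage of [exprLookupStage(), fieldLookupStage()]) {
    // Print the plan and execution statistics for each form of the join.
    const plan = db.teacher.explain('executionStats').aggregate([stage]);
    printjson(plan);
  }
}
```

Comparing the two plans (and the same plans on the pre-upgrade version, if available) should show whether the slowdown corresponds to a collection scan on student for each teacher document.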



 Comments   
Comment by Alison Rhea Thorne [ 08/Nov/23 ]

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Comment by Alison Rhea Thorne [ 23/Oct/23 ]

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the information requested?

Comment by Alison Rhea Thorne [ 06/Oct/23 ]

Hello ben@ethika.com,

Thank you for your report. Just to clarify, it seems your report was filed with regard to sparse indexes; however, the index in your reproduction is a partial index. Regardless, I have a few questions about your report to help with triaging this issue:

  • Does SERVER-40362 also match what you are seeing?
  • Can you clarify how significant the degradation you've observed was in comparison to before the upgrade?
  • How large is the dataset that you are running this against?
  • Can you provide a randomly generated example of said data to better our attempts at replication?
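
A randomly generated dataset of the kind requested above could be produced along the following lines. This is a minimal sketch under stated assumptions, not the reporter's data: the document shape, the integer _id values (standing in for real ObjectIds), the collection sizes, and the scratch database name lookup_repro are all hypothetical. The two option objects show the difference between a sparse index and the partial index used in the reproduction.

```javascript
// Index options: the report's title says "sparse", but the reproduction
// creates a partial index. Both are shown for comparison.
const sparseIndexOptions = { sparse: true };
const partialIndexOptions = {
  partialFilterExpression: { teacher_id: { $exists: true } },
};

// Hypothetical generator: integer ids stand in for real ObjectIds.
// `rng` must return a number in [0, 1); it defaults to Math.random.
function makeSampleData(teacherCount, studentCount, linkedFraction, rng = Math.random) {
  const teachers = [];
  for (let i = 0; i < teacherCount; i++) {
    teachers.push({ _id: i, name: `teacher-${i}` });
  }
  const students = [];
  for (let i = 0; i < studentCount; i++) {
    const student = { _id: i, name: `student-${i}` };
    // Only some students reference a teacher; the rest omit the field
    // entirely, so they fall outside the partial index.
    if (rng() < linkedFraction) {
      student.teacher_id = Math.floor(rng() * teacherCount);
    }
    students.push(student);
  }
  return { teachers, students };
}

// Only runs inside mongosh, where a global `db` exists.
if (typeof db !== 'undefined') {
  const repro = db.getSiblingDB('lookup_repro'); // hypothetical scratch database
  const { teachers, students } = makeSampleData(1000, 100000, 0.5);
  repro.teacher.insertMany(teachers);
  repro.student.insertMany(students);
  // Swap in sparseIndexOptions to compare against a sparse index.
  repro.student.createIndex({ teacher_id: 1 }, partialIndexOptions);
}
```

The sizes passed to makeSampleData are placeholders; they would need to be adjusted to approximate the real dataset before timings are comparable.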

We'll also need FTDC and logs if you can provide them. I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.

For each node in the replica set spanning a time period that includes the incident (before and after your upgrade to 7.0.0), would you please archive (tar or zip) and upload to that link:

  • the mongod logs
  • the $dbpath/diagnostic.data directory (the contents are described here)