-
Type:
Task
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Execution
-
None
-
None
-
None
-
None
-
None
-
None
-
None
In MongoDB, sparse indexes omit documents where the indexed field is missing. This characteristic creates an issue when using them in SBE $lookup queries, because $lookup is designed to match documents where the field is missing with those where the field's value is null.
Consider two collections, a and b, with documents:
- Collection a: [ { _id: 1, a: 'apple' }, { _id: 2 }, { _id: 3, a: null } ]
- Collection b: [ { _id: 10, a: 'apple', details: 'red fruit' }, { _id: 11, a: null, details: 'unknown value' }, { _id: 12, details: \"missing 'a' field entirely\" } ]
The $lookup query is:
db.a.aggregate([{
$lookup:{
from: "b",
localField: "a",
foreignField: "a",
as: "matchedDocs"
}
}])
Expected Result:
[
{ _id: 1, a: 'apple',
matchedDocs: [ { _id: 10, a: 'apple', details: 'red fruit' }] },
{ _id: 2,
matchedDocs: [
{ _id: 11, a: null, details: 'unknown value' },
{ _id: 12, details: \"missing 'a' field entirely\" }
]},
{ _id: 3, a: null,
matchedDocs: [
{ _id: 11, a: null, details: 'unknown value' },
{ _id: 12, details: \"missing 'a' field entirely\" }
]}
]
Note that the documents from a with a missing field (_id: 2) or a null field (_id: 3) correctly match both the null document (_id: 11) and the missing field document (_id: 12) in b.
If collection b has a sparse index on field a and the SBE engine uses it, the result is incorrect:
[
{ _id: 1, a: 'apple',
matchedDocs: [
{ _id: 10, a: 'apple', details: 'red fruit' }
]},
{ _id: 2,
matchedDocs: [
{ _id: 11, a: null, details: 'unknown value' }
]},
{ _id: 3, a: null,
matchedDocs: [
{ _id: 11, a: null, details: 'unknown value' }
]}
]
The document with _id: 12 from collection b is missing from the results. This is because the sparse index does not contain this document, which is problematic since the expected behaviour of $lookup is to match missing fields.
Proposed Solution for SBE:
The Classic engine avoids this issue because it re-plans the lookup for each local field value. The inability of SBE to correctly handle this scenario with sparse indexes represents a regression for MongoDB users (SERVER-64082).
To resolve this and allow SBE to use sparse indexes, we should adopt a strategy similar to how SBE handles index collation incompatibility (SERVER-81555):
- If the local field exists and its value is NOT null: Use the sparse index.
- If the local field value IS null or null-ish (i.e., the field is missing): Perform a collection scan, as the sparse index cannot guarantee completeness for these values.
- is related to
-
SERVER-64082 Consider usage of wildcard index for $lookup in SBE
-
- Closed
-
-
SERVER-81555 $lookup on foreign field uses index in classic but not in SBE when collations do not match
-
- Closed
-
-
SERVER-116154 Do not push $lookup to SBE if there is an eligible sparse index
-
- Backlog
-