-
Type:
Bug
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Execution
-
ALL
-
None
-
None
-
None
-
None
-
None
-
None
-
None
In a standard $lookup operation involving a local and a foreign collection, index usage can be specified for the foreign collection using query settings.
Consider an example with two collections, mainColl and secondaryColl, both having indexes {a:1} and {a:1, b:1}. The following pipeline is applied to mainColl:
[
{
"$match": { "a": 1, "b": 5 }
},
{
"$lookup":
{
"from": "secondaryColl",
localField": "a",
"foreignField": "a",
"as": "output"
}
}
]
Case 1: Specifying an index for the local collection (mainColl)
If the query settings are set as follows:
"settings": { "indexHints": { "ns": { "db": "test", "coll": "mainColl" }, "allowedIndexes": [ { "a": 1, "b": 1 } ] } }
The index {a:1, b:1} on mainColl will be used for the $match filter stage.
Case 2: Specifying an index for the foreign collection (secondaryColl)
If the query settings are set as follows:
"settings": { "indexHints": { "ns": { "db": "test", "coll": "secondaryColl" }, "allowedIndexes": [ { "a": 1, "b": 1 } ] } }
The index {a:1, b:1} on secondaryColl will be used when executing the $lookup stage.
Issue in Self-Lookups
A self-lookup occurs when the foreign collection is the same as the local collection, as shown below:
[
{
"$match": { "a": 1, "b": 5 }
},
{
"$lookup":
{
"from": "mainColl",
localField": "a",
"foreignField": "a",
"as": "output"
}
}
]
If the query settings specify an index for mainColl:
"settings": { "indexHints": { "ns": { "db": "test", "coll": "mainColl" }, "allowedIndexes": [ { "a": 1, "b": 1 } ] } }
One would expect the index {a:1, b:1} to be used for both the $match and the $lookup stages. However, this is not currently the case. While the index {a:1, b:1} is correctly used for the filter stage, the index used for the $lookup stage is the {a:1}.
Solution : To ensure the specified index hint is respected in self-lookups, the query settings index hints for the main collection should be applied when computing index information for the secondary collections.
Reproduction : buildscripts/resmoke.py run --suites=replica_sets_jscore_passthrough --runAllFeatureFlagTests qs-repro.js