[SERVER-30145] compound multikey index doesn't use tight bounds for trailing field in spite of $elemMatch Created: 14/Jul/17 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Asya Kamsky | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 4 |
| Labels: | query-44-grooming, storch | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Query Optimization
|
||||
| Participants: | |||||
| Case: | (copied to CRM) | ||||
| Description |
|
Sample data:
And index on "a1.id":1, "a1.a2.s":1 I would expect the query below to use tight bounds on "a1.id" and on "a1.a2.s" but it uses Minkey,Maxkey on "a1.a2.s":
If the data is structured with a scalar in a2 (or any other way such that the query on the indexed field is of form {{$elemMatch:{$gt: , $lt }}} then the tight bounds are used. |
| Comments |
| Comment by Asya Kamsky [ 08/Jun/18 ] |
|
This issue has not been fixed - once it is scheduled and resolved, only then do we continue if it's feasible to backport the fix to older versions. |
| Comment by Rajasekar Ramamurthy [ 08/Jun/18 ] |
|
We are looking out for a fix in 3.6 release version of this bug |
| Comment by Tess Avitabile (Inactive) [ 28/Jul/17 ] |
|
We should be able to compound these bounds because although a1.id and a1.a2.s share a multikey path prefix, the predicates are joined by an $elemMatch on that path prefix. The problem is indeed that we only keep track of one elemMatchExpr per predicate, that of the innermost $elemMatch containing the predicate. This means that we only allow two predicates that share a multikey path prefix to be compounded if their innermost $elemMatch is the same and is inside the multikey path prefix. Instead, we should allow two predicates that share a multikey path prefix to be compounded if their outermost $elemMatch that is inside the multikey path prefix is the same. Equivalently, if they have any common $elemMatch inside the multikey path prefix. We could fix this by storing every $elemMatch expression containing a predicate. Then for the first predicate we index, we could compute the $elemMatch expressions containing it that are inside of the multikey path prefix here (or just the outermost one). When we attempt to compound with a second predicate, we could compare the stored $elemMatch(s) against the set of $elemMatch expressions containing the second predicate here. |
| Comment by Ian Whalen (Inactive) [ 28/Jul/17 ] |
|
We think this has to do with only keeping track of the innermost elemmatch context for each predicate. Tess to dig further with Asya to figure out what's causing this. |