[SERVER-32540] Make partial index subset analysis consider $elemMatch object a subset of $exists Created: 04/Jan/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alex Hu | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 3 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
mongod --version cat /etc/redhat-release |
||
| Issue Links: |
|
| Assigned Teams: |
Query Optimization
|
| Participants: | |||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||
| Description |
|
The query explainer does not consider the keys in the $elemMatch expression when deciding index bounds. Instead, it scans all entries in the selected index. |
| Comments |
| Comment by David Storch [ 22/Feb/18 ] | ||
|
Hi huyingming,

Thanks for this issue report. The root cause of the behavior you're seeing is that the planner does not consider the query predicate eligible to use the partial index. I can reproduce this behavior as described, but only when the index has a partialFilterExpression.

To provide some background: early in the query planning process, the planner undergoes an index selection phase in which it identifies indexes that are relevant to the query. For partial indexes, this involves proving, without running the query, that the predicate is guaranteed to match a subset of the documents which match the partialFilterExpression. This is necessary for correctness, since the partial index must have keys for all matching documents in order for the query plan to be correct.

The necessary subset analysis is implemented by expression_algo::isSubsetOf. isSubsetOf behaves conservatively: if it cannot prove that a subset relationship exists, it returns false, and the partial index will not be considered relevant to the query. This can happen either because 1) the code to prove the subset relationship for this particular predicate type has not been implemented, or 2) it is impossible to prove that the subset relationship exists without inspecting the data. You have stumbled upon an instance of the former scenario. I'm fairly certain that
is guaranteed to match a subset of the documents matching
However, this optimization has not yet been implemented. This is expected, since not all known optimizations of this variety have been implemented (see, e.g., SERVER-17853). I will leave this ticket open as an improvement request, and retitle it to "Make partial index subset analysis consider $elemMatch object a subset of $exists". I am also going to direct this ticket to the Query Team for triage. It seems likely that we will pursue this work item only when we are making other partial index planning improvements, such as SERVER-17853 or

One more thing: the [MinKey, MaxKey] index bounds you mention in your original report of this issue are a consequence of SERVER-26413. The system currently allows a user to hint a partial index that is not eligible to answer the query. In general, this can cause the planner to produce incorrect plans. In your case, the plan happens to be correct, but is inefficient.

Best, | ||
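The conservative behavior described in the comment above can be illustrated with a small toy model. This is only a sketch in Python, not the server's C++ expression_algo::isSubsetOf. The function name `is_subset_of`, the flag `elem_match_implies_exists`, the field names `a` and `b`, and the specific rules it checks are all hypothetical and chosen for illustration. It returns True only when a rule proves the query implies the filter, and False in every other case.

```python
from typing import Any, Dict

# Toy assumption: a range comparison against a number can only match
# documents in which the field is present.
_IMPLIES_EXISTS = {"$gt", "$gte", "$lt", "$lte"}


def is_subset_of(query: Dict[str, Any], partial_filter: Dict[str, Any],
                 elem_match_implies_exists: bool = False) -> bool:
    """Return True only if every filter clause is provably implied by the query."""
    for path, filter_pred in partial_filter.items():
        query_pred = query.get(path)
        if query_pred is None:
            return False  # nothing provable about this path
        if query_pred == filter_pred:
            continue  # identical clause: trivially implied
        if filter_pred == {"$exists": True} and isinstance(query_pred, dict):
            ops = set(query_pred)
            numeric = all(isinstance(v, (int, float)) for v in query_pred.values())
            if ops and ops <= _IMPLIES_EXISTS and numeric:
                continue  # implemented rule in this toy model
            if elem_match_implies_exists and ops == {"$elemMatch"}:
                continue  # the rule this ticket asks for (hypothetical flag)
        return False  # conservative default: no proof, index not eligible
    return True


# Hypothetical predicates; the ticket's real field names are not shown.
query = {"a": {"$elemMatch": {"b": 1}}}
partial_filter = {"a": {"$exists": True}}
without_rule = is_subset_of(query, partial_filter)
with_rule = is_subset_of(query, partial_filter, elem_match_implies_exists=True)
```

With the flag off, the toy check rejects the $elemMatch query even though the subset relationship holds, which mirrors the "not yet implemented" case. Turning the flag on models the requested improvement.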
| Comment by Alex Hu [ 10/Jan/18 ] | ||
|
One comment: this issue occurs regardless of the partial filter in the index spec. That is, the behavior is the same even if the partialFilterExpression is removed from the index spec. |