[SERVER-32540] Make partial index subset analysis consider $elemMatch object a subset of $exists Created: 04/Jan/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Alex Hu Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

mongod --version
db version v3.4.7
git version: cf38c1b8a0a8dca4a11737581beafef4fe120bcd
OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
allocator: tcmalloc
modules: none
build environment:
distmod: rhel70
distarch: x86_64
target_arch: x86_64

cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)


Issue Links:
Duplicate
is duplicated by SERVER-27522 query optimizer does not choose appro... Closed
Related
is related to SERVER-26413 Hinting an incompatible partial index... Backlog
is related to SERVER-17853 Allow more complex expressions to be ... Backlog
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

The query explainer don't consider the keys in $elemMatch expression when to decide index boundaries. Instead, it scans all entries in the selected index.



 Comments   
Comment by David Storch [ 22/Feb/18 ]

Hi huyingming,

Thanks for this issue report. The root cause of the behavior you're seeing is that the planner does not consider the query predicate eligible to use the partial index. I can reproduce this behavior as described, but only when the index has a partialFilterExpression.

To provide some background: early in the query planning process, the planner undergoes an index selection phase in which it identifies indexes that are relevant to the query. For partial indexes, this involves proving, without running the query, that the predicate is guaranteed to match a subset of the documents which match the partialFilterExpression. This is necessary for correctness, since the partial index must have keys for all matching documents in order for the query plan to be correct. The necessary subset analysis is implemented by expression_algo::isSubsetOf.

isSubsetOf behaves conservatively: if it cannot prove that a subset relationship exists, it returns false, and the partial index will not be considered relevant to the query. This can happen either because 1) the code to prove the subset relationship for this particular predicate type has not been implemented, or 2) it is impossible to prove that the subset relationship exists without inspecting the data. You have stumbled upon an instance of the former scenario. I'm fairly certain that

var query = {a: {$elemMatch: {b: 3}};

is guaranteed to match a subset of the documents matching

var partialFilterExpression = {"a.b": {$exists: true}};

However, this optimization has not yet been implemented. This is expected, since not all known optimizations of this variety have been implemented (see, e.g., SERVER-17853). I will leave this ticket open as an improvement request, and retitle it to "Make partial index subset analysis consider $elemMatch object a subset of $exists". I am also going to direct this ticket to the Query Team for triage. It seems likely that we will pursue this work item only when we are making other partial index planning improvements, such as SERVER-17853 or SERVER-18884.

One more thing: the [MinKey, MaxKey] index bounds you mention in your original report of this issue are a consequence of SERVER-26413. The system currently allows a user to hint a partial index that is not eligible to answer the query. In general, this can cause the planner to produce incorrect plans. In your case, the plan happens to be correct, but is inefficient.

Best,
Dave

Comment by Alex Hu [ 10/Jan/18 ]

One comment: This issue is regardless with the partial filter from the index spec. That means the issue is the same even if remove the partialFilterExpression from the index spec.

Generated at Thu Feb 08 04:30:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.