[SERVER-70697] [CQF] Eliminate PathArr, given Traverse PathArr Created: 19/Oct/22  Updated: 02/Feb/24

Status: In Progress
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Percy Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: Ideas, bonsai-onboarding, optimization
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-75067 [CQF] Support simplifications related... Open
Assigned Teams:
Query Optimization
Sprint: QO 2024-02-05, QO 2024-02-19
Participants:

 Description   

If we have a predicate like (ComposeM PathArr (Traverse PathArr)), we should simplify it to just (Traverse PathArr). Traverse means "is or contains", so if a value "is-or-contains an array" then it must be an array. We can remove PathArr because it's implied by (Traverse PathArr).

This will allow more predicates to be covered by an index.

For example, here is a query that would benefit:

db.c.find({a: {$elemMatch: {$elemMatch: {$eq: 2}}}})

In other words: 'a' is an array, which contains an array, which contains 2.

It becomes this ABT:

Get a
ComposeM
|   PathArr
Traverse
ComposeM
|   PathArr
Traverse
PathCompare Eq 2

After relaxing the predicates we get this Sargable node:

Get a Identity -> [ [], BinData )   # 'a' is an array
Get a Traverse Identity -> [ [], BinData )  # 'a' is-or-contains an array
Get a Traverse Traverse Identity -> [ 2, 2 ]  # 'a' is-or-contains something which is-or-contains 2

A multikey index on {a: 1} can cover the second and third predicates, but not the first. But the second predicate implies the first, so we should just drop the first predicate, or mark it perfOnly.

Index path: Get "a" Traverse id.



 Comments   
Comment by David Percy [ 27/Jun/23 ]

It looks like in simplifyPartialSchemaReqPaths we already check for cases where one predicate subsumes another, so that seems like a good place to do this simplification. We sort the predicates by their path, so predicates with similar paths are next to each other, and we only need to compare each predicate against the previous one.

Comment by David Percy [ 30/May/23 ]

I think we want to do this simplification on the PartialSchemaRequirements representation, not on the ABT / path representation, because PartialSchemaRequirements are conveniently flattened and ordered already.

It may make sense to add this to simplifyPartialSchemaReqPaths because that function already checks whether one PartialSchemaRequirement subsumes another.

Generated at Thu Feb 08 06:16:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.