[SERVER-71871] [CQF] Satisfy 'Traverse ComposeM' predicates with IndexScan + Filter Created: 05/Dec/22  Updated: 16/Oct/23

Status: Investigating
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Percy Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: Ideas, optimization
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Sub-Tasks:
Key
Summary
Type
Status
Assignee
SERVER-77733 [CQF] Simplify PathArr paths based on... Sub-task Closed Matt Olma  
Assigned Teams:
Query Optimization
Sprint: QO 2023-05-15, QO 2023-05-29, QO 2023-06-12, QO 2023-06-26, QO 2023-07-10, QO 2023-07-24, QO 2023-08-07, QO 2023-08-21, QO 2023-09-04, QO 2023-09-18, QO 2023-10-02, QO 2023-10-16, QO 2023-10-30
Participants:

 Description   

If you have an elemMatch with a conjunction:

db.c.find({a: {$elemMatch: {x: 1, y: 1}}})

Then we could generate a plan that scans a multikey index on {a: 1}, and apply the {x: 1, y: 1} predicate to each index key. (And then do the "is array" check after fetching.)

Bonsai does not consider this plan, because we relax the predicate to two separate Traverses:

Filter ... Get [a] Traverse ComposeM (Get [x] ...) (Get [y] ...)
->
Sargable
  Get [a] Traverse Get [x] ...  perfOnly
  Get [a] Traverse Get [y] ...  perfOnly

This means we consider plans such as:

  • Full index scan, filter one predicate, fetch, filter on whole predicate
  • Index intersection of {a: 1} with itself
  • Collection scan

(This is still an improvement over classic, which only considers a collection scan. Even with hint({a: 1}), classic does not consider evaluating either predicate on the index side to reduce fetching.)

Maybe this would break down into sub-features:

  • an exploration rewrite that turns Traverse predicates into Unwind
  • a rewrite that matches Unwind with a multikey index


 Comments   
Comment by Matt Olma [ 16/Oct/23 ]

Some notes from working on this ticket: https://docs.google.com/document/d/18L_TcjDa36TEx46zYbNoBx-u2gtdY4SRTkJXPf-JuIQ/edit

Comment by David Percy [ 13/Jul/23 ]

Removing the 'bonsai-onboarding' tag because I don't think this is a good onboarding ticket after all:

  • It's a bit too open-ended, and may involve expand the scope of access path selection to handle Unwind, which is a bigger design decision.
  • The optimization itself is probably less beneficial than I originally thought, since a full index scan + expression + fetch is likely to be more expensive than a collection scan + expression.
Comment by David Percy [ 09/May/23 ]

Just to explain the idea a little more:

This predicate Get [a] Traverse ComposeM (Get [x] Eq 1) (Get [y] Eq 1) wants to check elements of 'a' separately. So it should be possible to use a physical plan like:

...
Unique [rid]
Filter (p1.x = 1) and (p1.y = 1)
IndexScan {a:1}, bind field 'a' as [p1], bind record id as [rid]

This plan doesn't use index intersection, and doesn't need to examine 'x' or 'y' after fetching.

I don't think this works currently because we don't have a way to represent this in the logical plan / exploration phase. We need use Traverse to visit array elements, and then check the {x:1, y:1} predicate on each one.

I'm imagining something like this:

Filter (p1.x = 1) and (p1.y = 1)
Sargable [Index]
    Get [a] Traverse Id   bind as [p1]

but I don't think this is representable, currently. We don't allow combining Traverse and an output binding this way.

Generated at Thu Feb 08 06:20:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.