[SERVER-46904] Efficiency of one $or branch during planning may not be representative of overall plan efficiency
Created: 16/Mar/20  Updated: 18/May/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Priority: Major - P3
Reporter: Chris Harris
Assignee: Backlog - Query Optimization
Resolution: Unresolved
Votes: 0
Labels: bonsai
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-20616 Plan ranker sampling from the beginni... Backlog
Assigned Teams:
Query Optimization
Participants:
Case:

 Description   

This is related to SERVER-20616, but it is specific to plans that perform more than one index scan, and it does not rely on data skew/affinity.

Consider the following scenario:

  • A contained $or query shape of { <Common Predicates>, $or: [ { <CLAUSE1> }, { <CLAUSE2> } ] }

  • Predicate values for CLAUSE1 such that there are at least 101 matching documents
  • Indexes A and B
  • CLAUSE1 (together with the pushed-down Common Predicates) is perfectly served by index A, meaning the ratio of keys scanned to documents returned is 1. Similarly, CLAUSE2 plus the Common Predicates is perfectly served by index B

The optimizer will produce at least the following four plans (listed as the index used for CLAUSE1, then the index used for CLAUSE2):

  1. index A, index A
  2. index A, index B
  3. index B, index A
  4. index B, index B
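
The four plans above are the Cartesian product of the two indexes over the two $or clauses. A minimal sketch of that enumeration follows; the names CLAUSES, INDEXES, and plans are illustrative only and are not taken from the server code, which may also generate additional plans.

```python
from itertools import product

# Illustrative names only; these are not identifiers from the server.
CLAUSES = ("CLAUSE1", "CLAUSE2")
INDEXES = ("A", "B")

# One index choice per $or clause: 2 indexes ^ 2 clauses = 4 plans.
plans = list(product(INDEXES, repeat=len(CLAUSES)))

for number, plan in enumerate(plans, start=1):
    assignment = ", ".join(
        f"{clause} -> index {index}" for clause, index in zip(CLAUSES, plan)
    )
    print(f"plan {number}: {assignment}")
```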

Given the parameters above, we would expect the trial phase to terminate after 101 works (index keys scanned). Assuming all plans evaluate CLAUSE1 first (SERVER-42090), plans 1 and 2 will score the same, since both scan index A for CLAUSE1 and spend the entire trial phase in that clause. Their overall efficiency will be very different, though, because they use different indexes for CLAUSE2, which the trial phase never reaches.
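
The effect can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's plan ranker: the trial score is simplified to documents returned per work, the ratio of 50 keys per document for a clause served by the "wrong" index is made up, and the CLAUSE2 match count of 10,000 is hypothetical. Only the ratio of 1 for the well-served clauses, the 101 matching documents for CLAUSE1, and the 101-work trial come from the scenario.

```python
from itertools import product

CLAUSES = ("CLAUSE1", "CLAUSE2")

# Keys scanned per document returned for each (clause, index) pair.
# The 1.0 entries come from the scenario; the 50.0 entries are assumed.
RATIO = {
    ("CLAUSE1", "A"): 1.0,
    ("CLAUSE1", "B"): 50.0,
    ("CLAUSE2", "A"): 50.0,
    ("CLAUSE2", "B"): 1.0,
}

# Matching documents per clause. 101 is from the scenario; 10_000 is assumed.
MATCHES = {"CLAUSE1": 101, "CLAUSE2": 10_000}

TRIAL_WORKS = 101  # trial budget, counted in index keys scanned


def trial_productivity(plan):
    """Documents returned per work during the trial (simplified score)."""
    works_left = TRIAL_WORKS
    returned = 0.0
    for clause, index in zip(CLAUSES, plan):
        ratio = RATIO[(clause, index)]
        keys_needed = MATCHES[clause] * ratio  # keys to exhaust this clause
        keys_used = min(works_left, keys_needed)
        returned += keys_used / ratio
        works_left -= keys_used
        if works_left <= 0:
            break  # trial budget spent before reaching later clauses
    return returned / TRIAL_WORKS


def full_keys_examined(plan):
    """Total keys scanned if the plan runs to completion."""
    return sum(MATCHES[c] * RATIO[(c, i)] for c, i in zip(CLAUSES, plan))


for plan in product("AB", repeat=2):
    print(plan, trial_productivity(plan), full_keys_examined(plan))
```

In this model the plans (A, A) and (A, B) receive the same trial score because the 101-work budget is used up inside CLAUSE1, yet running them to completion scans very different numbers of keys.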

