[SERVER-82982] [CQF] jstests/cqf/optimizer/sort_merge.js fails with parameterization enabled Created: 08/Nov/23  Updated: 24/Jan/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ben Shteinfeld Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-83512 Run CQF tests with parameterization Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Sprint: QO 2023-11-13, QO 2023-11-27
Participants:

 Description   

This test fails when parameterization is enabled with a tassert that the optimizer fails to generate a plan.

The query is 

find({a: {$in: [1, 2, 3]}}).sort({_id: 1}).hint({a: 1})

The predicate is translated as

PathGet [a]
PathTraverse [1]
PathCompare [EqMember]
FunctionCall [getParam]
|  Const [15] // type of array
Const [0] // parameter ID

When we try to convert this ABT to PartialSchemaRequirements, we fail because we only support building bounds for EqMember when the operand is a Constant.

This results in us failing to convert this filter to a SargableNode, which is necessary for the optimizer to ever emit an ixscan plan. With the hint, we disable collscan plans, so we fail to generate a plan.



 Comments   
Comment by Ben Shteinfeld [ 10/Nov/23 ]

If we are willing to parameterize each element of the $in operand separately, I believe the existing code would handle this case without changes. The disadvantage of this is potentially a large number of parameters for a query.

Another approach would be to parameterize the entire operand of $in and generate the PSR using the getElement SBE function for each index of the array. For example, {$in: [1, 2]} would be parameterized as a single parameter (getParam(0)). The intervals we could generate are [getElement(getParam(0), 0), getElement(getParam(0), 0)] U [getElement(getParam(0), 1), getElement(getParam(0), 1)].

Comment by Ben Shteinfeld [ 09/Nov/23 ]

We are currently unable to generate an ixscan plan for a $in with a parameterized operand. The example above just shows a test failure where the hint expects us to generate an ixscan plan and we fail to do so.

Looking at how SBE stage builders handle this case. Consider a query find({a: {$in: [0, 10, 20, 30]}}),

$$RESULT=s23 env: { s1 = Nothing (nothing), s3 = Nothing, s7 = {\"a\" : 1}, s11 = [{\"l\" : KS(290104), \"h\" : KS(29FE04)}, {\"l\" : KS(2B140104), \"h\" : KS(2B14FE04)}, {\"l\" : KS(2B280104), \"h\" : KS(2B28FE04)}, {\"l\" : KS(2B3C0104), \"h\" : KS(2B3CFE04)}], s22 = false }
 
[2] nlj inner [] [s17, s18, s19, s20, s21]
    left
        [1] branch {s22} [s17, s18, s19, s20, s21]
        [s2, s4, s5, s6, s7] [1] ixscan_generic s3 s6 s2 s4 s5 lowPriority [] @"3fd854cc-d3b7-4f17-8afa-c17e19cfd488" @"a_1" true
        [s8, s14, s15, s16, s7] [1] nlj inner [] [s9, s10]
            left
                [1] project [s9 = getField(s12, "l"), s10 = getField(s12, "h")]
                [1] unwind s12 s13 s11 false
                [1] limit 1
                [1] coscan
            right
                [1] ixseek s9 s10 s16 s8 s14 s15 [] @"3fd854cc-d3b7-4f17-8afa-c17e19cfd488" @"a_1" true
    right
        [2] limit 1
        [2] seek s17 s23 s24 s18 s19 s20 s21 none none [] @"3fd854cc-d3b7-4f17-8afa-c17e19cfd488" true false 

 The SBE stage builders use IETs to generate a set of index bounds that are bound into s11 in the runtime environment (global slots). These bounds are exposed to the plan using unwind, project on the outer side of an NLJ, where the inner side is an ixscan.

We need to have a larger design discussion about we expect parameterized ixscan plans to work in Bonsai. The decisions we make there will largely inform how we choose to handle this specific case with $in. I don't know whether we will handle this as part of this project, or a follow up which deals specifically with plan cache for M4.

I think in the short term, we should disable $in parameterization or only perform it when there are no indexes (M2 eligible queries).

Generated at Thu Feb 08 06:50:55 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.