[SERVER-56105] [SBE] $split expression succeeds with SBE off but fails with SBE on Created: 14/Apr/21  Updated: 29/Oct/23  Resolved: 16/Apr/21

Status: Closed
Project: Core Server
Component/s: Query Execution
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: Andrii Dobroshynski (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

I've created a short repro script to demonstrate the problem:

(function() {
 
function createCollectionAndRunProblemQuery(conn) {
    const testDb = conn.getDB("testdb");
    const coll = testDb.test_collection;
    coll.drop();
    assert.commandWorked(coll.insert({}));
 
    // Print the query plan for the pipeline.
    printjson(coll.explain("queryPlanner").aggregate([
        {$project: {_id: 0, out: {$split: ["$missing", {$toLower: "$missing"}]}}}
    ]));
    // Run the same pipeline and return its single result document.
    return coll
        .aggregate([{$project: {_id: 0, out: {$split: ["$missing", {$toLower: "$missing"}]}}}])
        .toArray()[0];
}
 
let conn = MongoRunner.runMongod();
assert.neq(null, conn, "failed to start mongod");
const withoutSbe = createCollectionAndRunProblemQuery(conn);
MongoRunner.stopMongod(conn);
 
conn = MongoRunner.runMongod({setParameter: "featureFlagSBE=true"});
assert.neq(null, conn, "failed to start mongod");
const withSbe = createCollectionAndRunProblemQuery(conn);
MongoRunner.stopMongod(conn);
 
assert.eq(withoutSbe, withSbe);
}());

Sprint: Query Execution 2021-05-03
Participants:

 Description   

When the collection contains a single document (which has no field named "missing"), the following query succeeds with SBE off but fails when SBE is on:

coll.aggregate([{$project: {_id: 0, out: {$split: ["$missing", {$toLower: "$missing"}]}}}])

I believe the problem relates to different evaluation orders. The classic engine first checks whether either of $split's arguments is nullish and, if so, returns null. The $split expression therefore returns null because its first argument is nullish, before any validation of the second argument takes place. SBE, on the other hand, uses a plan which appears to validate the second argument before the first. Because the "missing" field does not exist, the $toLower expression returns an empty string, and that fails validation since an empty string is an illegal value for the second parameter to $split. Here is the SBE plan, with a few edits I made to the spacing to make it more legible:

[2] traverse s10 s9 s2 {} {}
from
    [1] scan s2 s3 [] @\"09a5b5d4-ba95-401e-bbd8-076034ee9c93\" true
in
    [2] mkbson s9 s2 [] keep [out = s8] true false
    [2] project [s8 = let [l2.0 = s5, l2.1 = let [l1.0 = s7]
        if (! exists (l1.0) || typeMatch (l1.0, 0x00000440),
            \"\",
            if (typeMatch (l1.0, 0x000F4206),
                toLower (coerceToString (l1.0)),
                fail ( 5066300 ,$toLower input type is not supported)))]
        if (! exists (l2.1) || typeMatch (l2.1, 0x00000440),
            null,
            if (! isString (l2.1),
                fail ( 5155400 ,$split delimiter must be a string),
                if (l2.1 == \"\",
                    fail ( 5155401 ,$split delimiter must not be an empty string),
                    if (! exists (l2.0) || typeMatch (l2.0, 0x00000440),
                        null,
                        if (! isString (l2.0),
                            fail ( 5155402 ,$split string expression must be a string),
                            if (l2.0 == \"\",
                                [\"\"],
                                split (l2.0, l2.1)))))))]
    [2] traverse s7 s7 s6 [s4, s5] {} {}
    from
        [2] project [s6 = getField (s2, \"missing\")]
        [2] traverse s5 s5 s4 {} {}
        from
            [2] project [s4 = getField (s2, \"missing\")]
            [2] limit 1
            [2] coscan
        in
            [2] project [s5 = s4]
            [2] limit 1
            [2] coscan
 
    in
        [2] project [s7 = s6]
        [2] limit 1
        [2] coscan
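
The ordering difference can be illustrated with a small standalone JavaScript model. This is only a sketch, not server code: the function names (isNullish, toLowerModel, splitClassicOrder, splitSbeOrder) are hypothetical, `undefined` stands in for a missing field, and the order of the validation checks inside the classic-style function is an assumption. Only the order in the SBE-style function is taken from the plan above.

```javascript
// Illustrative model only, not server code. `undefined` stands in for a missing field.
const isNullish = (v) => v === undefined || v === null;

// Modeled on the plan's $toLower branch: a missing or null input yields "".
// The plan's type validation for other inputs is omitted here.
function toLowerModel(v) {
    return isNullish(v) ? "" : String(v).toLowerCase();
}

// Order described for the classic engine: the nullish checks on both arguments
// come first, so no validation runs when either argument is nullish.
function splitClassicOrder(input, delimiter) {
    if (isNullish(input) || isNullish(delimiter)) {
        return null;
    }
    // Assumed order for the remaining checks.
    if (typeof input !== "string") {
        throw new Error("$split string expression must be a string");
    }
    if (typeof delimiter !== "string") {
        throw new Error("$split delimiter must be a string");
    }
    if (delimiter === "") {
        throw new Error("$split delimiter must not be an empty string");
    }
    return input.split(delimiter);
}

// Order shown in the SBE plan: the delimiter is fully validated
// before the input string is examined at all.
function splitSbeOrder(input, delimiter) {
    if (isNullish(delimiter)) {
        return null;
    }
    if (typeof delimiter !== "string") {
        throw new Error("$split delimiter must be a string");
    }
    if (delimiter === "") {
        throw new Error("$split delimiter must not be an empty string");
    }
    if (isNullish(input)) {
        return null;
    }
    if (typeof input !== "string") {
        throw new Error("$split string expression must be a string");
    }
    if (input === "") {
        return [""];
    }
    return input.split(delimiter);
}

// The query from this ticket: both arguments derive from a missing field.
const modeledDelimiter = toLowerModel(undefined);
console.log(splitClassicOrder(undefined, modeledDelimiter));
try {
    splitSbeOrder(undefined, modeledDelimiter);
} catch (e) {
    console.log("SBE-style order failed: " + e.message);
}
```

In this model the classic-style order returns null for the ticket's query, while the SBE-style order throws on the empty delimiter before it ever looks at the missing input. Both functions agree whenever the delimiter is a non-empty string.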



 Comments   
Comment by Githook User [ 16/Apr/21 ]

Author:

{'name': 'Andrii Dobroshynski', 'email': 'andrii.dobroshynski@mongodb.com', 'username': 'dobroshynski'}

Message: SERVER-56105 Make $split expression behave the same in SBE and classic engine
Branch: master
https://github.com/mongodb/mongo/commit/76415dd2b639315d80f49e42cf5788ac36b10a32
