[SERVER-56819] $indexOfCP returns incorrect result when searching for empty string inside empty string with non-zero start index (classic execution engine only) Created: 10/May/21  Updated: 29/Oct/23  Resolved: 28/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.0.25, 4.2.15, 4.4.7, 5.0.0-rc1, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: post-rc0, sbe-post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Requested:
v5.0, v4.9, v4.4, v4.2, v4.0
Sprint: Query Execution 2021-05-31, Query Execution 2021-06-14
Participants:

 Description   

This can be reproduced by starting a standalone mongod with featureFlagSBE and then running the following script:

(function() {
"use strict";
 
const coll = db.c;
coll.drop();
 
assert.commandWorked(coll.insert({str: ""}));
 
const pipeline =
    [{"$project": {"a": {"$indexOfCP": [{$literal: ""}, "$str", NumberLong(1)]}, "_id": 0}}];
 
const resultsWithSbe = coll.aggregate(pipeline).toArray();
 
function setForceClassic(val) {
    assert.commandWorked(db.adminCommand({setParameter: 1, internalQueryForceClassicEngine: val}));
}
 
const resultsWithoutSbe = (function() {
    try {
        setForceClassic(true);
        return coll.aggregate(pipeline).toArray();
    } finally {
        setForceClassic(false);
    }
}());
 
assert.eq(resultsWithSbe, resultsWithoutSbe);
}());

This should fail with a message such as the following:

MongoDB server version: 5.0.0-alpha0
uncaught exception: Error: [[ { "a" : -1 }, { "a" : -1 } ]] != [[ { "a" : 0 }, { "a" : 0 } ]] are not equal :
doassert@src/mongo/shell/assert.js:20:14
assert.eq@src/mongo/shell/assert.js:179:9
@repro.js:30:1
@repro.js:1:2
failed to load: repro.js
exiting with code -3

With SBE on, $indexOfCP is returning -1, presumably to indicate that the needle string is not found at any position in the haystack string. With SBE off, the classic engine returns 0. I haven't yet dug into exactly



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 03/Jun/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-56819 Make indexOfCP's classic engine impl always return -1 when end is less than start

(cherry picked from commit 934d6de41bb3adbecee23e3837852e29c7ff4af6)
(cherry picked from commit 2141bb0d88dc9938fe568a7c4f4584d3c191319a)
(cherry picked from commit 55c75b382453b4fe9ae83b1e73437a20a028d9dd)
(cherry picked from commit 823909a49c1cd29218c2a696516060511b52580d)
Branch: v4.0
https://github.com/mongodb/mongo/commit/038c8e529ff493771af752001093fdba8b566268

Comment by Githook User [ 02/Jun/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-56819 Make indexOfCP's classic engine impl always return -1 when end is less than start

(cherry picked from commit 934d6de41bb3adbecee23e3837852e29c7ff4af6)
(cherry picked from commit 2141bb0d88dc9938fe568a7c4f4584d3c191319a)
(cherry picked from commit 55c75b382453b4fe9ae83b1e73437a20a028d9dd)
Branch: v4.2
https://github.com/mongodb/mongo/commit/823909a49c1cd29218c2a696516060511b52580d

Comment by Githook User [ 02/Jun/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-56819 Make indexOfCP's classic engine impl always return -1 when end is less than start

(cherry picked from commit 934d6de41bb3adbecee23e3837852e29c7ff4af6)
(cherry picked from commit 2141bb0d88dc9938fe568a7c4f4584d3c191319a)
Branch: v4.4
https://github.com/mongodb/mongo/commit/55c75b382453b4fe9ae83b1e73437a20a028d9dd

Comment by Githook User [ 28/May/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-56819 Make indexOfCP's classic engine impl always return -1 when end is less than start

(cherry picked from commit 934d6de41bb3adbecee23e3837852e29c7ff4af6)
Branch: v5.0
https://github.com/mongodb/mongo/commit/2141bb0d88dc9938fe568a7c4f4584d3c191319a

Comment by David Storch [ 28/May/21 ]

The semantics of $indexOfCP have changed slightly with this commit in a corner case. Specifically, consider an expression like the following:

{$indexOfCP: ["$emptyStr", "$emptyStr", 1]}

Here, both of the "$emptyStr" field path expressions are assumed to evaluate to the empty string. Prior to this fix, the expression would return 0. After this fix, the expression will return -1 to indicate that the search string is not found.

The reason for this new behavior is that the starting index of 1 is past the end of the empty string, so we should use -1 to indicate that the search string is not found. I think this is more of a bug fix than a true breaking change, but I marked it as a "minor breaking change" for visibility.

Comment by Githook User [ 28/May/21 ]

Author:

{'name': 'David Storch', 'email': 'david.storch@mongodb.com', 'username': 'dstorch'}

Message: SERVER-56819 Make indexOfCP's classic engine impl always return -1 when end is less than start
Branch: master
https://github.com/mongodb/mongo/commit/934d6de41bb3adbecee23e3837852e29c7ff4af6

Comment by David Storch [ 27/May/21 ]

I'm requesting backports all the way back to version 4.0.

Comment by David Storch [ 10/May/21 ]

In the SBE implementation, we correctly identify that the user's given starting index of 1 is past the end of the string, and return -1 as a result:

https://github.com/mongodb/mongo/blob/bdc6725265d3fdcaf5984ced835cb53c3b3946f8/src/mongo/db/exec/sbe/vm/vm.cpp#L2059-L2062

In the classic engine, on the other hand, we end up returning 0 in this branch:

https://github.com/mongodb/mongo/blob/bdc6725265d3fdcaf5984ced835cb53c3b3946f8/src/mongo/db/pipeline/expression.cpp#L3432-L3436

This is a special case which says that if both the needle string and the haystack string are the empty string (""), then return 0. This check is fine on its own, and indeed there is a similar check in the SBE code. The problem is that the classic engine will early return with a value of 0 before ever checking whether the start index is past the end of the string. In other words, the classic engine should notice that the start index is 1, which is past the end of the empty input string, and then return -1. This is a bug in the classic engine.

This seems like a minor problem in the classic engine which will be fixed when we turn on SBE. However, it could also cause some noise in the fuzzers when enabling SBE. I propose that we accept this noise and defer fixing this ticket until next sprint. That way, we will have a bit of extra time to prepare backports to 4.4, 4.2, and 4.0 in order to keep all of the fuzzers happy. Does that sound good to you anton.korshunov and kyle.suarez?

Generated at Thu Feb 08 05:40:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.