[SERVER-84013] Incorrect results for index scan plan on query with duplicate predicates in nested $or Created: 08/Dec/23  Updated: 03/Jan/24  Resolved: 14/Dec/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.3, 6.0.12, 5.0.23, 7.0.4
Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.5, 6.0.13, 5.0.24

Type: Bug Priority: Critical - P2
Reporter: Ben Shteinfeld Assignee: Ben Shteinfeld
Resolution: Fixed Votes: 0
Labels: 7.0-release-blocker
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-83602 $or -> $in MatchExpression rewrite sh... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.2, v7.0, v6.0, v5.0
Sprint: QO 2023-12-11, QO 2023-12-25
Participants:

 Comments   
Comment by Githook User [ 03/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84013 Avoid invoking MatchExpression::optimize() on children of
$or in the subplanner.

The subplanner invokes the query planner for each branch of a rooted
$or. Doing this requires constructing a CanonicalQuery representing a
single branch of the $or. A side effect of constructing a CanonicalQuery
is that the MatchExpression is optimized. The subplanner assumes that
the original branch and its re-optimized copy are identical
MatchExpressions and makes decisions about index bounds based on the
order of predicates in the MatchExpression. There are queries (as
demonstrated in the regression test this patch introduces) for which
this assumption does not hold, which leads to incorrect index bounds
and thus incorrect results.

This patch fixes the bug by ensuring that the subplanner does not
invoke MatchExpression optimization a second time when constructing a
copy of the children of the $or.

(cherry picked from commit ac1c1f25667e0845964b61cb6aadca648485d54c)
Branch: v7.2
https://github.com/mongodb/mongo/commit/420c9fe4741fd54e3d94d805bd65a39460b72d8b

Comment by Githook User [ 15/Dec/23 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84013 Avoid invoking MatchExpression::optimize() on children of
$or in the subplanner.

The subplanner invokes the query planner for each branch of a rooted
$or. Doing this requires constructing a CanonicalQuery representing a
single branch of the $or. A side effect of constructing a CanonicalQuery
is that the MatchExpression is optimized. The subplanner assumes that
the original branch and its re-optimized copy are identical
MatchExpressions and makes decisions about index bounds based on the
order of predicates in the MatchExpression. There are queries (as
demonstrated in the regression test this patch introduces) for which
this assumption does not hold, which leads to incorrect index bounds
and thus incorrect results.

This patch fixes the bug by ensuring that the subplanner does not
invoke MatchExpression optimization a second time when constructing a
copy of the children of the $or.

(cherry picked from commit 854ce65ffc67bbefdcfba7d7286a44e8ea5ad1a6)

GitOrigin-RevId: 1126dea253e78bdd05bee2bc6c51604bf84c2578
Branch: v5.0
https://github.com/mongodb/mongo/commit/bb2eac7ea20d01158147a9187205b1482a8d07d2

Comment by Githook User [ 15/Dec/23 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84013 Avoid invoking MatchExpression::optimize() on children of
$or in the subplanner.

The subplanner invokes the query planner for each branch of a rooted
$or. Doing this requires constructing a CanonicalQuery representing a
single branch of the $or. A side effect of constructing a CanonicalQuery
is that the MatchExpression is optimized. The subplanner assumes that
the original branch and its re-optimized copy are identical
MatchExpressions and makes decisions about index bounds based on the
order of predicates in the MatchExpression. There are queries (as
demonstrated in the regression test this patch introduces) for which
this assumption does not hold, which leads to incorrect index bounds
and thus incorrect results.

This patch fixes the bug by ensuring that the subplanner does not
invoke MatchExpression optimization a second time when constructing a
copy of the children of the $or.

GitOrigin-RevId: 854ce65ffc67bbefdcfba7d7286a44e8ea5ad1a6
Branch: v6.0
https://github.com/mongodb/mongo/commit/dcd8c73a04409a8ac680d59697922818b6a70f7c

Comment by Githook User [ 15/Dec/23 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84013 Avoid invoking MatchExpression::optimize() on children of
$or in the subplanner.

The subplanner invokes the query planner for each branch of a rooted
$or. Doing this requires constructing a CanonicalQuery representing a
single branch of the $or. A side effect of constructing a CanonicalQuery
is that the MatchExpression is optimized. The subplanner assumes that
the original branch and its re-optimized copy are identical
MatchExpressions and makes decisions about index bounds based on the
order of predicates in the MatchExpression. There are queries (as
demonstrated in the regression test this patch introduces) for which
this assumption does not hold, which leads to incorrect index bounds
and thus incorrect results.

This patch fixes the bug by ensuring that the subplanner does not
invoke MatchExpression optimization a second time when constructing a
copy of the children of the $or.

GitOrigin-RevId: 1396b42063d30feff79e1adf4a9774f9601383b6
Branch: v7.0
https://github.com/mongodb/mongo/commit/82e5630496d8cc1fba7705111f1dc68e6aede247

Comment by Githook User [ 14/Dec/23 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84013 Avoid invoking MatchExpression::optimize() on children of
$or in the subplanner.

The subplanner invokes the query planner for each branch of a rooted
$or. Doing this requires constructing a CanonicalQuery representing a
single branch of the $or. A side effect of constructing a CanonicalQuery
is that the MatchExpression is optimized. The subplanner assumes that
the original branch and its re-optimized copy are identical
MatchExpressions and makes decisions about index bounds based on the
order of predicates in the MatchExpression. There are queries (as
demonstrated in the regression test this patch introduces) for which
this assumption does not hold, which leads to incorrect index bounds
and thus incorrect results.

This patch fixes the bug by ensuring that the subplanner does not
invoke MatchExpression optimization a second time when constructing a
copy of the children of the $or.

GitOrigin-RevId: ac1c1f25667e0845964b61cb6aadca648485d54c
Branch: master
https://github.com/mongodb/mongo/commit/e85ad8897c810bd5bda16ebc94c2318a16563da5
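
Illustrative note: the failure mode described in the commit message above can be sketched with a toy model. This is not MongoDB code. The names optimize, plan_branch, transfer_tags_by_position and is_consistent are hypothetical, a predicate is modeled as a plain tuple, and the toy optimize() simply drops duplicates and sorts, which is an assumption made only so that the copy differs from the original. The sketch shows why planning decisions keyed on predicate position break when the copy is optimized a second time, and why skipping that second optimization keeps the two trees aligned.

```python
from typing import List, Tuple

# Toy predicate: (field, operator, value). Purely illustrative.
Predicate = Tuple[str, str, int]


def optimize(branch: List[Predicate]) -> List[Predicate]:
    """Hypothetical stand-in for an optimization pass.

    Assumption: it removes duplicate predicates and sorts the rest,
    so its output can differ in length and order from its input.
    """
    return sorted(set(branch))


def plan_branch(branch: List[Predicate]) -> List[str]:
    """Toy planner: returns one index name per predicate position."""
    return [f"index_on_{field}" for field, _, _ in branch]


def transfer_tags_by_position(
    original: List[Predicate], tags: List[str]
) -> List[Tuple[Predicate, str]]:
    """Pair the original predicates with planner tags purely by position.

    This is the fragile step: it is only correct if the tree that was
    planned has the same predicates in the same order as the original.
    """
    return list(zip(original, tags))


def is_consistent(tagged: List[Tuple[Predicate, str]]) -> bool:
    """True when every predicate carries the tag for its own field."""
    return all(tag == f"index_on_{pred[0]}" for pred, tag in tagged)


# One $or branch containing a duplicate predicate.
original = [("b", "eq", 2), ("a", "eq", 1), ("b", "eq", 2)]

# Buggy flow: the copy is optimized again before planning.
buggy = transfer_tags_by_position(original, plan_branch(optimize(original)))

# Fixed flow: the copy is planned as-is, without a second optimization.
fixed = transfer_tags_by_position(original, plan_branch(list(original)))

print("buggy flow consistent:", is_consistent(buggy))
print("fixed flow consistent:", is_consistent(fixed))
```

In the buggy flow the predicate on b is paired with the tag computed for a, which is the toy analogue of index bounds being built for the wrong predicate.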

Generated at Thu Feb 08 06:53:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.