[SERVER-84338] Top level $or queries may lead to invalid SBE plan cache entry which returns wrong results Created: 20/Dec/23 Updated: 24/Jan/24 Resolved: 18/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 7.1.0, 7.0.4, 7.3.0-rc0, 7.2.0 |
| Fix Version/s: | 7.2.1, 7.3.0-rc0, 7.0.6 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Mika Fischer | Assignee: | Ben Shteinfeld |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | auto-reverted | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Backport Requested: |
v7.2, v7.0
|
||||||||||||||||
| Steps To Reproduce: | // This is the original reproduction steps provided by the customer. import { MongoClient, UUID } from "mongodb"; );}} ------------------------------------------- With MongoDB 7.04 node .\index.mjs ------------------------------------------- With MongoDB 6.0.12 node .\index.mjs |
||||||||||||||||
| Sprint: | QE 2023-12-25, QO 2024-01-08, QO 2024-01-22 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 162 | ||||||||||||||||
| Description |
|
Observable Problem When running a query/aggregation with a $or, the SBE plan cache entry may store an invalid plan. This results in subsequent executions of the query, with different constants, returning incorrect results. In particular, the cached plan may incorrectly filter out documents which should be returned. This problem is specific to SBE and the SBE plan cache. The classic engine is not affected. Under 7.0.4, 7.1,7.2, this bug affects find() commands, since SBE is on by default. In the v7.0 branch and master branch, this bug affects only aggregations, since SBE use is restricted to a smaller set of queries (see Reproduction with resmoke To reproduce the bug, see
The script will run several tests against mongod, reconfiguring its internalQueryFrameworkControl value to trySbeEngine, trySbeRestricted and forceClassicEngine. At the end, it will print out a list of queries and configurations that produced incorrect results. Under default configuration, which is trySbeRestricted, both 7.0 and master can produce incorrect results for an aggregation of this form:
Explanation of the bug The bug has to do with the interaction between top-level $ors, the $or -> $in rewrite, and the plan cache. A query with the filter of the form {$or: {_id: 1, foo: 1}, {_id: 1, foo:999}} comes in, and is optimized/normalized, and parameterized as usual. Optimizing this expression does not change the expression (the _id predicate is not pulled up). Each branch of the OR is then planned separately, using an index scan on _id and a residual filter on foo. The planner then realizes that the scans generated beneath the OR are identical, and collapses them into one in QueryPlannerAccess::collapseEquivalentScans(). When collapsing the scans, we combine the residual filters on foo into
We then optimize this expression here which converts the $or to a $in here. The optimized residual predicate is:
The resulting $in does not preserve the parameter IDs assigned to each predicate on foo. This is the crux of the problem: that parameterization information has effectively been "lost." At this point, the $in right hand side is effectively "baked" into the query plan which is cached. Subsequent executions will use 1, 999 in the residual predicate even if those are not the values provided. The attached Next Steps The remaining tasks here are: (a) Determine the expected/desired interaction between match expression optimization and parameterization. This was probably discussed during the plan cache development. The repro script includes a similar situation where the bug does not reproduce, but it is not clear whether this is intentional, or whether it is just by chance. (b) Fix the bug. (c) Determine a testing strategy so we can catch this type of bug on our own. (d) Investigate whether there are other cases where we can store an invalid plan in the cache. Any time we optimize a match expression after parameterization, we are at risk of this bug occurring, since the optimization can effectively "lose" parameter markings. I did a quick check of the planning code and couldn't find other places where we call MatchExpression::optimize() besides the on linked above when collapsing scans beneath an OR. Original Description Provided by Customer Given a query, `findOne({$or: [query, query]})` does not yield the same results as `findOne(query)`. It works fine in 6.0 Description of Attached Files index.mjs: Original reproduction provided by customer
|
| Comments |
| Comment by Ben Shteinfeld [ 18/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Original fix and backports were reverted due to a test failure in a timeseries passthrough suite which interacted poorly with the planCacheClear command. I've re-submitted the fix into master (7.3), 7.2, and 7.0 branches. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 18/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}Message: GitOrigin-RevId: eebb08ddd00c5b94055e19d30ad22c697d983e1c | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 16/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}Message: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 12/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}Message: GitOrigin-RevId: 31710a31b24c5b2d62dd1363567b2541dbfeee27 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 11/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}Message: Revert " This reverts commit 375c60714f242e8024fb10adb31aab94ac454fa1. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 11/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}Message: Revert " This reverts commit a57bdc3ccc66c1016928c7f7d5292705a32a9181. GitOrigin-RevId: 6c185c0b4028672ec15f476fe1798cc2260d1df0 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 11/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'auto-revert-processor', 'email': 'dev-prod-dag@mongodb.com', 'username': ''}Message: Revert " This reverts commit b96ac705e7c6801d9ecfd7b879dc9819060982fe. GitOrigin-RevId: 0ca548ff9d7b52c4ddbd0de078284fbc765e2e34 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 10/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}Message: (cherry picked from commit b96ac705e7c6801d9ecfd7b879dc9819060982fe) GitOrigin-RevId: a57bdc3ccc66c1016928c7f7d5292705a32a9181 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 10/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}Message: (cherry picked from commit b96ac705e7c6801d9ecfd7b879dc9819060982fe) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 10/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}Message: GitOrigin-RevId: b96ac705e7c6801d9ecfd7b879dc9819060982fe | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ian Boros [ 02/Jan/24 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Investigation is ongoing, but I've narrowed down the bug to a problematic interaction between match expression optimization and auto parameterization. There's another patch that interacts with this: However, agregations with a similar predicate and an SBE-eligible $group or $lookup may still be affected despite the changes from 83685.
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Chris Kelly [ 21/Dec/23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi haser@videmo.de! Thanks for sharing your findings here, and providing a clear reproducer script. We really appreciate it! I was able to replicate this. It appears to happen regardless of the presence of indexes on any of these fields (and regardless of if they're created before or after the query). This also appears to affect some rapid releases like 6.3.0+ and possibly others.
I'll pass this to the relevant team to take a further look. Thank you again! | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yannick Haser [ 21/Dec/23 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Something breaks in the collection here. The first "$or" query is working, but as soon as the first "broken" query occurs every following $or-Query is broken. Even if all documents of the collection are deleted.... It can only be fixed with restarting the mongodb or dropping the collection. Afterwards it again works for 1 request and breaks afterwards. |