[SERVER-84338] Top level $or queries may lead to invalid SBE plan cache entry which returns wrong results Created: 20/Dec/23  Updated: 24/Jan/24  Resolved: 18/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.1.0, 7.0.4, 7.3.0-rc0, 7.2.0
Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.6

Type: Bug Priority: Critical - P2
Reporter: Mika Fischer Assignee: Ben Shteinfeld
Resolution: Fixed Votes: 0
Labels: auto-reverted
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File SERVER-84338-repro-2.js     File SERVER-84338-repro.js     File index.mjs    
Issue Links:
Backports
Problem/Incident
Related
related to SERVER-84728 Allow MatchExpression $or -> $in rewr... Investigating
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.2, v7.0
Steps To Reproduce:

// This is the original reproduction steps provided by the customer.

import { MongoClient, UUID } from "mongodb";
const client = await MongoClient.connect("mongodb://localhost:27017/");
const db = client.db("test");
const coll = db.collection("test");for (let i = 0; i < 10; i++) {
  const _id = new UUID();
  const foo = new UUID();
  await coll.insertOne({ _id, foo });
  const query = { _id, foo };
  const singleResult = await coll.findOne(query);
{{  const orResult = await coll.findOne(

{ $or: [query, query] }

);}}
  const broken = !singleResult || !orResult;
  console.log(
    `${i
      .toFixed()
      .padStart(3)} single=${!Unable to render embedded object: File (singleResult} or=${) not found.!orResult} broken=${broken}`
  );
}
await db.dropDatabase();
await client.close();

-------------------------------------------

With MongoDB 7.04

node .\index.mjs
  0 single=true or=true broken=false
  1 single=true or=false broken=true
  2 single=true or=false broken=true
  3 single=true or=false broken=true
  4 single=true or=false broken=true
  5 single=true or=false broken=true
  6 single=true or=false broken=true
  7 single=true or=false broken=true
  8 single=true or=false broken=true
  9 single=true or=false broken=true

-------------------------------------------

With MongoDB 6.0.12

 node .\index.mjs
  0 single=true or=true broken=false
  1 single=true or=true broken=false
  2 single=true or=true broken=false
  3 single=true or=true broken=false
  4 single=true or=true broken=false
  5 single=true or=true broken=false
  6 single=true or=true broken=false
  7 single=true or=true broken=false
  8 single=true or=true broken=false
  9 single=true or=true broken=false

Sprint: QE 2023-12-25, QO 2024-01-08, QO 2024-01-22
Participants:
Linked BF Score: 162

 Description   

Observable Problem

When running a query/aggregation with a $or, the SBE plan cache entry may store an invalid plan. This results in subsequent executions of the query, with different constants, returning incorrect results. In particular, the cached plan may incorrectly filter out documents which should be returned. This problem is specific to SBE and the SBE plan cache. The classic engine is not affected.

Under 7.0.4, 7.1,7.2, this bug affects find() commands, since SBE is on by default.

In the v7.0 branch and master branch, this bug affects only aggregations, since SBE use is restricted to a smaller set of queries (see SERVER-83685).

Reproduction with resmoke

To reproduce the bug, see SERVER-84338-repro-2.js and run it with resmoke:

python3 buildscripts/resmoke.py run --suites=no_passthrough_with_mongod SERVER-84338-repro-2.js

The script will run several tests against mongod, reconfiguring its internalQueryFrameworkControl value to trySbeEngine, trySbeRestricted and forceClassicEngine. At the end, it will print out a list of queries and configurations that produced incorrect results.

Under default configuration, which is trySbeRestricted, both 7.0 and master can produce incorrect results for an aggregation of this form:

[
    {
        "$match" : {
            "$or" : [
                {
                    "_id" : 2,
                    "foo" : 2
                },
                {
                    "_id" : 2,
                    "foo" : 999
                }
            ]
        }
    },
    {
        "$count" : "count"
    }
]

 

Explanation of the bug

The bug has to do with the interaction between top-level $ors, the $or -> $in rewrite, and the plan cache.

A query with the filter of the form {$or: {_id: 1, foo: 1}, {_id: 1, foo:999}} comes in, and is optimized/normalized, and parameterized as usual. Optimizing this expression does not change the expression (the _id predicate is not pulled up).

Each branch of the OR is then planned separately, using an index scan on _id and a residual filter on foo.

The planner then realizes that the scans generated beneath the OR are identical, and collapses them into one in QueryPlannerAccess::collapseEquivalentScans().

When collapsing the scans, we combine the residual filters on foo into

{ $or: [ { foo: { $eq: 999.0 } }, { foo: { $eq: 1.0 } } ] } 

We then optimize this expression here which converts the $or to a $in here. The optimized residual predicate is:

{foo: {$in: [1, 999]}} 

The resulting $in does not preserve the parameter IDs assigned to each predicate on foo. This is the crux of the problem: that parameterization information has effectively been "lost."

At this point, the $in right hand side is effectively "baked" into the query plan which is cached. Subsequent executions will use 1, 999 in the residual predicate even if those are not the values provided.

The attached SERVER-84338-repro-2.js contains more details, as well as some information about queries which are similar but do not exhibit the bug.

Next Steps

The remaining tasks here are:

(a) Determine the expected/desired interaction between match expression optimization and parameterization. This was probably discussed during the plan cache development. The repro script includes a similar situation where the bug does not reproduce, but it is not clear whether this is intentional, or whether it is just by chance.

(b) Fix the bug.

(c) Determine a testing strategy so we can catch this type of bug on our own.

(d) Investigate whether there are other cases where we can store an invalid plan in the cache. Any time we optimize a match expression after parameterization, we are at risk of this bug occurring, since the optimization can effectively "lose" parameter markings. I did a quick check of the planning code and couldn't find other places where we call MatchExpression::optimize() besides the on linked above when collapsing scans beneath an OR.

Original Description Provided by Customer

Given a query, `findOne({$or: [query, query]})` does not yield the same results as `findOne(query)`.

It works fine in 6.0

Description of Attached Files

index.mjs: Original reproduction provided by customer

SERVER-84338-repro.js: Reproduction script from Kyle Suarez, which runs the server under several configurations.

SERVER-84338-repro-2.js: Final repro script from Ian B, which supersedes the prior two. This file includes detailed notes on the bug, along with some different queries which do not exhibit the bug. This file, along with the resmoke command in the description, should be all that is needed to reproduce and understand the bug.



 Comments   
Comment by Ben Shteinfeld [ 18/Jan/24 ]

Original fix and backports were reverted due to a test failure in a timeseries passthrough suite which interacted poorly with the planCacheClear command. I've re-submitted the fix into master (7.3), 7.2, and 7.0 branches.

Comment by Githook User [ 18/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (2nd attempt) (#17992)

GitOrigin-RevId: eebb08ddd00c5b94055e19d30ad22c697d983e1c
Branch: v7.0
https://github.com/mongodb/mongo/commit/dfad62cd9ec44fd265e256ea199f342f4c5cfa02

Comment by Githook User [ 16/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (2nd attempt) (#17992)
Branch: v7.2
https://github.com/mongodb/mongo/commit/c6e4ede5cc8a57f887c2e3ba149ec6d89a5de5c2

Comment by Githook User [ 12/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17992)

GitOrigin-RevId: 31710a31b24c5b2d62dd1363567b2541dbfeee27
Branch: master
https://github.com/mongodb/mongo/commit/20cf2d127b94e0c7d798690edc9aa5546eacd745

Comment by Githook User [ 11/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: Revert "SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)"

This reverts commit 375c60714f242e8024fb10adb31aab94ac454fa1.
Branch: v7.2
https://github.com/mongodb/mongo/commit/43ad1003471f5887edf1f8adc868260a38ca3149

Comment by Githook User [ 11/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'ben.shteinfeld@mongodb.com', 'username': 'bshteinfeld'}

Message: Revert "SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)"

This reverts commit a57bdc3ccc66c1016928c7f7d5292705a32a9181.

GitOrigin-RevId: 6c185c0b4028672ec15f476fe1798cc2260d1df0
Branch: v7.0
https://github.com/mongodb/mongo/commit/dd436240ed58ea9cdac505061b0852d16875f9ee

Comment by Githook User [ 11/Jan/24 ]

Author:

{'name': 'auto-revert-processor', 'email': 'dev-prod-dag@mongodb.com', 'username': ''}

Message: Revert "SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)"

This reverts commit b96ac705e7c6801d9ecfd7b879dc9819060982fe.

GitOrigin-RevId: 0ca548ff9d7b52c4ddbd0de078284fbc765e2e34
Branch: master
https://github.com/mongodb/mongo/commit/e74b6cdd6cc675c89093df23481f6ca4d8199476

Comment by Githook User [ 10/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)

(cherry picked from commit b96ac705e7c6801d9ecfd7b879dc9819060982fe)

GitOrigin-RevId: a57bdc3ccc66c1016928c7f7d5292705a32a9181
Branch: v7.0
https://github.com/mongodb/mongo/commit/5a10503b0c1ec28f986640748582e95c2725d19c

Comment by Githook User [ 10/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)

(cherry picked from commit b96ac705e7c6801d9ecfd7b879dc9819060982fe)
Branch: v7.2
https://github.com/mongodb/mongo/commit/375c60714f242e8024fb10adb31aab94ac454fa1

Comment by Githook User [ 10/Jan/24 ]

Author:

{'name': 'Ben Shteinfeld', 'email': 'bshteinfeld@users.noreply.github.com', 'username': 'bshteinfeld'}

Message: SERVER-84338 Avoid $or->$in rewrite if operand is a parameter (#17885)

GitOrigin-RevId: b96ac705e7c6801d9ecfd7b879dc9819060982fe
Branch: master
https://github.com/mongodb/mongo/commit/d3f0b18e7e2a3efa4cc2e6ed27934e09096f6574

Comment by Ian Boros [ 02/Jan/24 ]

Investigation is ongoing, but I've narrowed down the bug to a problematic interaction between match expression optimization and auto parameterization.

There's another patch that interacts with this: SERVER-83685, which landed in master and the 7.0 branch, due to be released in 7.0.5. Since 83685 turns off SBE for most queries, the bug should no longer affect the query provided in the ticket, or any find() queries in 7.0.5.

However, agregations with a similar predicate and an SBE-eligible $group or $lookup may still be affected despite the changes from 83685.

 

Comment by Chris Kelly [ 21/Dec/23 ]

Hi haser@videmo.de!

Thanks for sharing your findings here, and providing a clear reproducer script. We really appreciate it!

I was able to replicate this. It appears to happen regardless of the presence of indexes on any of these fields (and regardless of if they're created before or after the query). This also appears to affect some rapid releases like 6.3.0+ and possibly others.

Version Single Query $or Query (first execution) $or Query (subsequent executions) subsequent $or Query with indexes (made before or after the query)
5.0.23  
6.0.0  
6.0.12  
6.1.0  
6.2.0  
6.2.1  
6.3.0 (returns nothing)  (returns nothing) 
6.3.1 (returns nothing) (returns nothing)
7.0.0 (returns nothing) (returns nothing)
7.0.4 (returns nothing) (returns nothing)

I'll pass this to the relevant team to take a further look. Thank you again!

Comment by Yannick Haser [ 21/Dec/23 ]

Something breaks in the collection here. The first "$or" query is working, but as soon as the first "broken" query occurs every following $or-Query is broken. Even if all documents of the collection are deleted.... It can only be fixed with restarting the mongodb or dropping the collection. Afterwards it again works for 1 request and breaks afterwards.

Generated at Thu Feb 08 06:54:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.