[SERVER-37132] Negation of $in with regex can incorrectly plan from the cache, leading to missing query results
Created: 13/Sep/18  Updated: 29/Oct/23  Resolved: 03/Oct/18

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.2.21, 3.4.17, 3.6.7, 4.0.2, 4.1.2
Fix Version/s: 3.4.19, 3.6.9, 4.0.4, 4.1.4

Type: Bug
Priority: Critical - P2
Reporter: David Storch
Assignee: Bernard Gorman
Resolution: Fixed
Votes: 0
Labels: afz
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6, v3.4, v3.2
Steps To Reproduce:

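(function() {
    "use strict";

    const coll = db.c;
    coll.drop();
    // Two indexes on 'a' give the planner more than one candidate plan, so the
    // winning plan is eligible for the plan cache.
    assert.commandWorked(coll.createIndex({a: 1}));
    assert.commandWorked(coll.createIndex({a: 1, b: 1}));
    assert.commandWorked(coll.insert({a: "foo"}));

    // Run the regex-free $not-$in twice to populate the plan cache for this shape.
    assert.eq(1, coll.find({a: {$not: {$in: [32, 33]}}}).itcount());
    assert.eq(1, coll.find({a: {$not: {$in: [32, 33]}}}).itcount());
    // Same shape, but the $in now contains a regex. {a: "foo"} still matches, so the
    // count should be 1; on affected versions the cached plan is reused and this fails.
    assert.eq(1, coll.find({a: {$not: {$in: [34, /bar/]}}}).itcount());
})();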

Sprint: Query 2018-10-08
Participants:
Linked BF Score: 11

 Description   

A negated $in whose array contains a regex cannot be answered using an index. This is enforced by the query planner's index selection phase here:

https://github.com/mongodb/mongo/blob/bd38c69f5e6dc3136d20505d49f034c0927bf3e2/src/mongo/db/query/planner_ixselect.cpp#L467-L473
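As a rough illustration of that rule only: the sketch below restates it in plain JavaScript. The function name negatedInIsIndexable is hypothetical and does not appear in the server; the linked C++ source remains the authoritative check.

```javascript
// Hypothetical restatement of the index selection rule described above.
// A negated $in is treated as indexable only if none of its values is a regex.
function negatedInIsIndexable(inValues) {
    return !inValues.some((v) => v instanceof RegExp);
}

console.log(negatedInIsIndexable([32, 33]));     // true
console.log(negatedInIsIndexable([34, /bar/]));  // false
```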

However, a negated $in with a regex and one without a regex are considered to have the same query shape. This can be seen by verifying that they have the same queryHash, which was added to explain output in SERVER-36527:

MongoDB Enterprise > var hash1 = db.c.find({a: {$not: {$in: [32, 33]}}}).explain().queryPlanner.queryHash
MongoDB Enterprise > var hash2 = db.c.find({a: {$not: {$in: [34, /bar/]}}}).explain().queryPlanner.queryHash
MongoDB Enterprise > assert.eq(hash1, hash2)

As a result, the latter query can incorrectly use a plan cache entry created by the former query. The resulting plan has incorrect index bounds, which can erroneously exclude matching documents. A likely fix would be to add a discriminator to the plan cache key so that $not-$in predicates with regexes are encoded differently from $not-$in predicates without regexes.
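The effect of such a discriminator can be sketched with a toy model of a cache key. Everything here is an assumption made for illustration: encodeShape, its key format, and the "<regex>" / "<no-regex>" markers are hypothetical and are not the server's actual plan cache key encoding.

```javascript
// Toy model of a plan cache key: keeps field names and operator structure,
// drops constant values. Not the server's real encoding.
function encodeShape(filter, discriminateRegex) {
    return Object.keys(filter).sort().map((field) => {
        const pred = filter[field];
        const isNotIn = pred !== null && typeof pred === "object" &&
            pred.$not !== undefined && Array.isArray(pred.$not.$in);
        if (!isNotIn) {
            return field + ":eq";  // every other predicate is lumped together in this sketch
        }
        let key = field + ":not-in";
        if (discriminateRegex) {
            // Discriminator: record whether the $in array contains any regex.
            const hasRegex = pred.$not.$in.some((v) => v instanceof RegExp);
            key += hasRegex ? "<regex>" : "<no-regex>";
        }
        return key;
    }).join("|");
}

const withoutRegex = {a: {$not: {$in: [32, 33]}}};
const withRegex = {a: {$not: {$in: [34, /bar/]}}};

// Without the discriminator the two queries collide on one key.
console.log(encodeShape(withoutRegex, false) === encodeShape(withRegex, false));  // true
// With the discriminator they get distinct keys, so neither reuses the other's plan.
console.log(encodeShape(withoutRegex, true) === encodeShape(withRegex, true));    // false
```

In this model, two regex-free $not-$in queries still share a key, so caching keeps working for the indexable case.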



 Comments   
Comment by Githook User [ 06/Nov/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37132 Negation of $in with regex can incorrectly plan from the cache, leading to missing query results

(cherry picked from commit e786e3a313b75a1fe8aa233ed09da2d2efbaf613)
Branch: v3.4
https://github.com/mongodb/mongo/commit/91c7456f24342e2b11b9cd486f7e8bc5d8b1f90e

Comment by Githook User [ 23/Oct/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37132 Negation of $in with regex can incorrectly plan from the cache, leading to missing query results
Branch: v3.6
https://github.com/mongodb/mongo/commit/d128ad0adefca668d37d6e65bab57c3dc88ca6d0

Comment by Githook User [ 08/Oct/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37132 Negation of $in with regex can incorrectly plan from the cache, leading to missing query results

(cherry picked from commit ba38c66d9483d2fb8a644772fa5dd0fff78a3cc9)
Branch: v4.0
https://github.com/mongodb/mongo/commit/1981c02a8bb76d5f6ab30a512c4f894a05452d3f

Comment by Githook User [ 03/Oct/18 ]

Author:

{'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com', 'username': 'gormanb'}

Message: SERVER-37132 Negation of $in with regex can incorrectly plan from the cache, leading to missing query results
Branch: master
https://github.com/mongodb/mongo/commit/ba38c66d9483d2fb8a644772fa5dd0fff78a3cc9

Comment by Ian Whalen (Inactive) [ 20/Sep/18 ]

bernard.gorman assigning into this sprint so it doesn't drag on too long, but you should also spend tomorrow's BF Friday looking at this. Hoping this is easy because it should come down to adding a plan cache discriminator.
