[SERVER-60637] v4.0 appears to accept invalid regexes in filter
Created: 12/Oct/21
Updated: 06/Dec/22
Resolved: 11/Feb/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.27
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ryan Egesdahl (Inactive)
Assignee: Backlog - Query Execution
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
Assigned Teams:
Query Execution
Operating System: ALL
Sprint: QE 2021-11-01, QE 2021-11-15, QE 2021-11-29, QE 2021-12-13, QE 2021-12-27, QE 2022-01-10, QE 2022-01-24
Participants:

 Description   

While working on BACKPORT-10572 to backport SERVER-60299 into the v4.0 branch, I noticed that the reproducer did not produce an exception. I worked with ksuarz@gmail.com, and it appears that the server builds a query execution plan with the $regex filter even though the expression is invalid. For example, the original reproducer returns this plan:

MongoDB Enterprise > const t = db.jstests_regex;
MongoDB Enterprise > assert.writeOK(t.save({a: ["Johns email address is johnny@johnnydoessql.com. Priscilla manages the http://www.johnnydoessql.com site. She also manages the site http://jilldoessql.com and can be reached at 345.678.9999 She can be reached at (123) 456-7890 and her email address is prissy@johnnydoessql.com or prissy@jilldoessql.com."]}));
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > t.find({
...              a: {
...                  $regex:
...                      "(?JJJ)>?W((?<a>!)||(?<a><aR)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<e>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<aR)|(?<a>!:aW ) C(?<a>!)||(?<a>!aR)|(?<a>!?!|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!(?<a>)(?<a><aR)|(?<a>!:aW ) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<aa>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!2]|P(?<a>!)|C(?<a>!()|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) __P(?<a>!-||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<a!) __P(?<a>!|(?<a>)|(?<a>!3]|P(?<a>!)|C(?<a>!)||(?<a>!aR)|(?<a>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>! m);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|C(?<a>!()(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!?!);|(?<a>W!) !)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<__P(?<a>!||(?<a>)|(?<a>||(?<a>!?<",
...                  $options: "i"
...              }
...          }).explain();
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.jstests_regex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "a" : {
                                "$regex" : "(?JJJ)>?W((?<a>!)||(?<a><aR)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<e>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<aR)|(?<a>!:aW ) C(?<a>!)||(?<a>!aR)|(?<a>!?!|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!(?<a>)(?<a><aR)|(?<a>!:aW ) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<aa>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!2]|P(?<a>!)|C(?<a>!()|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) __P(?<a>!-||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<a!) __P(?<a>!|(?<a>)|(?<a>!3]|P(?<a>!)|C(?<a>!)||(?<a>!aR)|(?<a>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>! m);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|C(?<a>!()(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!?!);|(?<a>W!) !)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<__P(?<a>!||(?<a>)|(?<a>||(?<a>!?<",
                                "$options" : "i"
                        }
                },
                "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "a" : {
                                        "$regex" : "(?JJJ)>?W((?<a>!)||(?<a><aR)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<e>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<aR)|(?<a>!:aW ) C(?<a>!)||(?<a>!aR)|(?<a>!?!|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!(?<a>)(?<a><aR)|(?<a>!:aW ) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<aa>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!)|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!2]|P(?<a>!)|C(?<a>!()|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) __P(?<a>!-||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<a!) __P(?<a>!|(?<a>)|(?<a>!3]|P(?<a>!)|C(?<a>!)||(?<a>!aR)|(?<a>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)|;|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>! m);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>!?!)||(?<a>)|(?<a>!6]|P(?<a>!)|C(?<a>!()(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!?!);|(?<a>W!) !)||(?<a>)|(?<a>!6]|P(?<a>!)|(?<aa>!?!);|(?<a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR|(?<a>!?!);|(?<a>W!) (?<a>!)||(?<a>)|(?<a>||(?<a><a>W!| P(?<a>!) C(?<a>!)||(?<a>!aR(?<a>!aR)|(?<a>!?!);|(?<a>W(?<a>!) C(?<a>!)||(?<a>!aR)|(?<a>|1|t(?<a>)(?<a>!??<__P(?<a>!||(?<a>)|(?<a>||(?<a>!?<",
                                        "$options" : "i"
                                }
                        },
                        "direction" : "forward"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "ip-10-122-0-119",
                "port" : 27017,
                "version" : "4.0.27-12-g6e9844c",
                "gitVersion" : "6e9844c2afb662d9ef1f0aa50f2b4bf32864f097"
        },
        "ok" : 1
}

I tested with a much simpler invalid regex as well:

MongoDB Enterprise > t.find({a: {$regex: "(a)("}}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "test.jstests_regex",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "a" : {
                                "$regex" : "(a)("
                        }
                },
                "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "filter" : {
                                "a" : {
                                        "$regex" : "(a)("
                                }
                        },
                        "direction" : "forward"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "ip-10-122-0-119",
                "port" : 27017,
                "version" : "4.0.27-12-g6e9844c",
                "gitVersion" : "6e9844c2afb662d9ef1f0aa50f2b4bf32864f097"
        },
        "ok" : 1
}

This happens regardless of whether there is a document in the collection to match against (I tested both ways).
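
For anyone re-checking this on another build, the following is a minimal sketch of how the "did it throw?" observation could be scripted. Both function names (regexQueryOutcome, isValidJsRegex) are hypothetical helpers, not part of the server, the shell, or the original reproducer. The second helper compiles the pattern with the JavaScript regex engine, which is not the PCRE engine the server uses, so its answer is only a rough indication and can disagree with the server for PCRE-specific syntax such as the (?J) option in the long pattern above.

```javascript
// Hypothetical helper (not from the ticket): runs a callback that issues a
// query and reports whether it threw, along with the error message if so.
function regexQueryOutcome(runQuery) {
    try {
        runQuery();
        return {threw: false, message: null};
    } catch (e) {
        return {threw: true, message: String(e && e.message ? e.message : e)};
    }
}

// Hypothetical client-side sanity check. ASSUMPTION: this uses the
// JavaScript regex engine, not PCRE, so it only approximates what the
// server would accept or reject.
function isValidJsRegex(pattern, options) {
    try {
        new RegExp(pattern, options || "");
        return true;
    } catch (e) {
        return false;
    }
}

// Intended use in the mongo shell (commented out because it needs a live
// connection and the collection `t` from the transcript above):
//
//   const outcome = regexQueryOutcome(
//       () => t.find({a: {$regex: "(a)("}}).explain());
//   printjson(outcome);
```

On the 4.0.27 build shown above, the explain() call returned a plan rather than raising an error, so the first helper would be expected to report threw: false there. The JavaScript engine does reject the simple pattern "(a)(" as an unterminated group, which is consistent with treating it as invalid.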



 Comments   
Comment by Joe Kanaan [ 11/Feb/22 ]

4.0 is soon to be EOL.

Generated at Thu Feb 08 05:50:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.