[SERVER-9035] Regex "s" option causes mongo to use slow query plan Created: 20/Mar/13  Updated: 18/Sep/15  Resolved: 20/Feb/15

Status: Closed
Project: Core Server
Component/s: JavaScript, Querying
Affects Version/s: 2.2.3, 3.0.0-rc6
Fix Version/s: 3.1.0

Type: Bug Priority: Minor - P4
Reporter: David Burke Assignee: David Storch
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Amazon linux ami


Issue Links:
Depends
Duplicate
is duplicated by SERVER-16884 Regex/index optimization should allow... Closed
Related
related to CSHARP-862 Regex Queries /s suffix Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

When a $ regex query is issued with $options: "s" mongo does not choose the most appropriate query plan. For example the first query below returns in 15000 ms with nscanned: 932956 the same query without $options: "s" return in 1ms with "nscanned" : 61. Finally I ran the first query again with a hint to use the correct index but it gives a warning: unknown regex flag:s and uses the same index as the first query.

mongos> db.CA.find({ "a": "acme.com", "att": { $elemMatch: { "e.a": {$regex:"^x", $options: "s"} }, start: { $gte: new Date(1363737600000) }}}).explain()
{
        "cursor" : "BtreeCursor a_1_uId_1",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 932956,
        "nscanned" : 932956,
        "nscannedObjectsAllPlans" : 2798869,
        "nscannedAllPlans" : 2798869,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 729,
        "nChunkSkips" : 0,
        "millis" : 14817,
        "indexBounds" : {
                "a" : [
                        [
                                "acme.com",
                                "acme.com"
                        ]
                ],
                "uId" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        },
        "server" : "rs11",
        "millis" : 14817
}
mongos> db.CA.find({ "a": "acme.com", "att": { $elemMatch: { "e.a": {$regex:"^x"} }, start: { $gte: new Date(1363737600000) }}}).explain()
{
        "cursor" : "BtreeCursor a_1_att.e.a_1 multi",
        "isMultiKey" : true,
        "n" : 0,
        "nscannedObjects" : 60,
        "nscanned" : 61,
        "nscannedObjectsAllPlans" : 180,
        "nscannedAllPlans" : 181,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 1,
        "indexBounds" : {
                "a" : [
                        [
                                "acme.com",
                                "acme.com"
                        ]
                ],
                "att.e.a" : [
                        [
                                "x",
                                "y"
                        ],
                        [
                                /^x/,
                                /^x/
                        ]
                ]
        },
        "server" : "rs11",
        "millis" : 1
}
mongos> db.CA.find({ "a": "acme.com", "att": { $elemMatch: { "e.a": {$regex:"^x", $options: "s"} }, start: { $gte: new Date(1363737600000) }}}).hint("a_1_att.e.a_1").explain()
Wed Mar 20 18:09:53 warning: unknown regex flag:s
Wed Mar 20 18:09:53 warning: unknown regex flag:s
{
        "cursor" : "BtreeCursor a_1_att.e.a_1 multi",
        "isMultiKey" : true,
        "n" : 0,
        "nscannedObjects" : 1103778,
        "nscanned" : 1103778,
        "nscannedObjectsAllPlans" : 1103778,
        "nscannedAllPlans" : 1103778,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 8163,
        "nChunkSkips" : 0,
        "millis" : 5359,
        "indexBounds" : {
                "a" : [
                        [
                                "acme.com",
                                "acme.com"
                        ]
                ],
                "att.e.a" : [
                        [
                                "",
                                {
 
                                }
                        ],
                        [
                                /^x/,
                                /^x/
                        ]
                ]
        },
        "server" : "rs11",
        "millis" : 5359
}
mongos>



 Comments   
Comment by Githook User [ 20/Feb/15 ]

Author:

{u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}

Message: SERVER-9035 allow left-anchored regular expressions with the 's' option to use tighter index bounds
Branch: master
https://github.com/mongodb/mongo/commit/e93d4be469d665428d2e9bebd41c429732d3e8b4

Comment by Andre Spiegel [ 29/Jan/15 ]

So to reiterate, the option "s" causes mongo to not perform an optimization that would be perfectly legitimate even when the option "s" is present. The change should be approximately four lines of code.

Comment by David Storch [ 29/Jan/15 ]

Quoting from SERVER-16884, a more recent report of this issue:

The optimization to determine index bounds from a regular expression, which works by checking whether the regex is left-anchored and contains a constant prefix, is not applied when the regex has any option other than "m" or "x" (see index_bounds_builder.cpp).

There is no reason why the optimization shouldn't work for the option "s", which only changes the behavior of the . operator, and . can never be part of the constant prefix anyway.

So the suggestion is to perform the optimization even when "s" is present, just as with "m" and "x".

Comment by Thomas Rueckstiess [ 24/Dec/13 ]

See Craig Willson's suggestion of using Regex.IsMatch instead in this comment.

Comment by Søren Bramer Schmidt [ 03/Aug/13 ]

This issue is amplified by the C# drivers LINQ provider always using the s option with no way to turn it off. Ref https://groups.google.com/forum/#!topic/mongodb-user/IR-aWNgbudY

Generated at Thu Feb 08 03:19:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.