[SERVER-12639] Performance regression with RegEx Created: 06/Feb/14  Updated: 11/Jul/16  Resolved: 19/Feb/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.5.5
Fix Version/s: 2.6.0-rc0

Type: Bug Priority: Major - P3
Reporter: rgpublic Assignee: David Storch
Resolution: Done Votes: 0
Labels: 26qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-12673 index bounds incorrect on compound in... Closed
Related
Operating System: ALL
Participants:

 Description   

I recently tried to upgrade from 2.4.9 to 2.5.5 and I'm experiencing catastrophic regex query times with 2.5.5 so I had to downgrade again. My query simply is sth. like this:
{
"folder":/^weblogic/,
"type":"file"
}

2.4.9 gives me this query plan:
"cursor": "BtreeCursor type_folder multi",
"nscanned": NumberInt(2583),

2.5.5 gives this query plan:
"cursor": "Complex Plan",
"nscanned": NumberInt(8470527),

This plan is then running for literally ages
I wonder why 2.5.5 isn't even considering my "type_folder" index for this simple query.

Unfortunately I cannot provide test data because I could only reproduce the problem with our live database which data is quite large and non-disclosable. Nevertheless here is the full explain plan for 2.5.5 in case you can discern anything:

{
"cursor": "Complex Plan",
"n": NumberInt(2582),
"nscannedObjects": NumberInt(0),
"nscanned": NumberInt(8470527),
"nscannedObjectsAllPlans": NumberInt(0),
"nscannedAllPlans": NumberInt(8470527),
"nYields": NumberInt(66196),
"nChunkSkips": NumberInt(0),
"millis": NumberInt(4006),
"allPlans": [

{ "cursor": "Complex Plan", "n": NumberInt(2582), "nscannedObjects": NumberInt(0), "nscanned": NumberInt(8470527), "nChunkSkips": NumberInt(0) }

],
"server": "edison:27017",
"stats": {
"type": "SORT",
"works": NumberInt(8473112),
"yields": NumberInt(66196),
"unyields": NumberInt(66196),
"invalidates": NumberInt(0),
"advanced": NumberInt(2582),
"needTime": NumberInt(8470528),
"needFetch": NumberInt(0),
"isEOF": NumberInt(1),
"forcedFetches": NumberInt(0),
"children": [
{
"type": "FETCH",
"works": NumberInt(8470528),
"yields": NumberInt(66196),
"unyields": NumberInt(66196),
"invalidates": NumberInt(0),
"advanced": NumberInt(2582),
"needTime": NumberInt(8467945),
"needFetch": NumberInt(0),
"isEOF": NumberInt(1),
"alreadyHasObj": NumberInt(0),
"forcedFetches": NumberInt(0),
"matchTested": NumberInt(0),
"children": [
{
"type": "AND_HASH",
"works": NumberInt(8470527),
"yields": NumberInt(66196),
"unyields": NumberInt(66196),
"invalidates": NumberInt(0),
"advanced": NumberInt(2582),
"needTime": NumberInt(8467945),
"needFetch": NumberInt(0),
"isEOF": NumberInt(1),
"flaggedButPassed": NumberInt(0),
"flaggedInProgress": NumberInt(0),
"mapAfterChild_0": NumberInt(3459),
"children": [
{
"type": "IXSCAN",
"works": NumberInt(3460),
"yields": NumberInt(66196),
"unyields": NumberInt(66196),
"invalidates": NumberInt(0),
"advanced": NumberInt(3459),
"needTime": NumberInt(0),
"needFetch": NumberInt(0),
"isEOF": NumberInt(1),
"keyPattern": "

{ folder: 1 }

",
"bounds":

{ "folder": [ [ "weblogic", "weblogid" ], [ "/^weblogic/", "/^weblogic/" ] ] }

,
"isMultiKey": NumberInt(0),
"yieldMovedCursor": NumberInt(0),
"dupsTested": NumberInt(0),
"dupsDropped": NumberInt(0),
"seenInvalidated": NumberInt(0),
"matchTested": NumberInt(0),
"keysExamined": NumberInt(3460),
"children": [

]
},
{
"type": "IXSCAN",
"works": NumberInt(8467067),
"yields": NumberInt(66196),
"unyields": NumberInt(66196),
"invalidates": NumberInt(0),
"advanced": NumberInt(8467067),
"needTime": NumberInt(0),
"needFetch": NumberInt(0),
"isEOF": NumberInt(1),
"keyPattern": "

{ type: 1 }

",
"bounds":

{ "type": [ [ "file", "file" ] ] }

,
"isMultiKey": NumberInt(0),
"yieldMovedCursor": NumberInt(0),
"dupsTested": NumberInt(0),
"dupsDropped": NumberInt(0),
"seenInvalidated": NumberInt(0),
"matchTested": NumberInt(0),
"keysExamined": NumberInt(8467067),
"children": [

]
}
]
}
]
}
]
}
}



 Comments   
Comment by David Storch [ 19/Feb/14 ]

Hi rgpublic,

I believe the problem you ran into was fixed as a side-effect of commit d7d110bfa62c51cf0d50ca248cd5dbe05ea34208. In this commit, we re-ordered candidate query plans so that bad index intersection plans will never be chosen over faster single-index plans. We are continuing to work on this area of the query optimizer in order to ensure that index intersection plans are only used when they offer a performance advantage.

I'm resolving this ticket as Fixed per my comment above. Thanks for the detailed bug report!

Thanks,
Dave

Comment by rgpublic [ 10/Feb/14 ]

Thanks for investigating this. I cannot easily check the nightly at the moment, but I very much suspect mine to be the same bug. I've got (among other, unrelated ones) these 3 indices: folder, type, type_folder. In my case, it obviously isn't even necessary to do a regex query on both columns. One regular query and one regex query already trigger the bug. Glad to hear that it's fixed, because it made 2.5.5 completely unusable for us which was a bit disappointing because I was looking forward to the new intersection feature :-/ Finally, I wonder which change exactly (JIRA ticket?) fixed this so we might possibly close this one as DUPLICATE.

Comment by Davide Italiano [ 10/Feb/14 ]

I was able to reproduced the following problem with enron mail dataset. http://mongodb-enron-email.s3-website-us-east-1.amazonaws.com

This used to happen in 2.5.5 if you have 3 indexes on the fields "a" and "b":

{a: 1}
{b: 1}
{a: 1, b: 1}

Steps to reproduce on 2.5.5 after you restored the enron_mail dataset:

use enron_mail
db.messages.dropIndexes()
db.messages.ensureIndex({"headers.From": 1, "headers.Date": 1})
db.messages.ensureIndex({"headers.Date": 1})
db.messages.ensureIndex({"headers.From": 1})

then run the following query:

db.messages.find({"headers.From": /^[a-z]/, "headers.Date": "/^Fri"})

cursor.explain() on 2.5.5

{
        "cursor" : "Complex Plan",
        "n" : 0,
        "nscannedObjects" : 0,
        "nscanned" : 501513,
        "nscannedObjectsAllPlans" : 0,
        "nscannedAllPlans" : 501513,
        "nYields" : 3919,
        "nChunkSkips" : 0,
        "millis" : 1190,
        "server" : "ip-10-16-132-110:27017"
}

cursor.explain on HEAD (githash 57c49a91767327fedd3b22a1363480f1a7d9ff20)

{
        "cursor" : "BtreeCursor headers.From_1_headers.Date_1",
        "isMultiKey" : false,
        "n" : 0,
        "nscannedObjects" : 0,
        "nscanned" : 19966,
        "nscannedObjectsAllPlans" : 0,
        "nscannedAllPlans" : 19966,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 55,
        "indexBounds" : {
                "headers.From" : [
                        [
                                "",
                                {
 
                                }
                        ],
                        [
                                /^[a-z]/,
                                /^[a-z]/
                        ]
                ],
                "headers.Date" : [
                        [
                                "/^Fri",
                                "/^Fri"
                        ]
                ]
        },
        "server" : "ip-10-16-132-110:27017"
}

which is more or less what you would expect (55 ms vs 1032 ms). You can try grab the latest nightly build and see if the problem is fixed for you as well.

Generated at Thu Feb 08 03:29:07 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.