[SERVER-22890] 3.2.3 performance regression Created: 29/Feb/16  Updated: 14/Apr/16  Resolved: 29/Feb/16

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.2.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: ITWEBTF SAXOBANK Assignee: David Storch
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Mongo Performance.docx (Microsoft Word)
Issue Links:
Duplicate
duplicates SERVER-16622 RegEx query predicates using the | (v... Backlog
Related
is related to SERVER-15235 Regex query returns incorrect results... Closed
is related to SERVER-22872 Order by is not working in 3.2.3 Closed
Operating System: ALL
Steps To Reproduce:

See the description above

Participants:

 Description   

We have optimized our core queries to run with a low nscanned count and a low execution time (millis). This optimization works in 2.6.11 and 2.4.9.

However, in 3.2.3 performance regressed badly.

In 2.6.11 we have,
"nscanned" : NumberInt(11),
"nscannedObjects" : NumberInt(10),
"keyUpdates" : NumberInt(0),
"numYield" : NumberInt(0),
"millis" : NumberInt(0)

In 3.2.3 we have,
"keysExamined" : NumberInt(10762703),
"docsExamined" : NumberInt(2208753),
"numYield" : NumberInt(84084),
"millis" : NumberInt(33272),

See the query, indexes, collection statistics and full profiler output in the attached document.

Note that the documents in the PostKey2 collection are similar to the documents described in SERVER-22872.



 Comments   
Comment by ITWEBTF SAXOBANK [ 29/Mar/16 ]

Hi David,

I have to disagree. Recognizing an escaped '|' as a non-escaped '|' is a bug, so fixing that particular issue is not a hack - rather it is a step in the right direction.

I can see from the duplicates that other users have this problem, and I can partly understand why you would like to do a really good and maintainable bug fix for all these issues.

However, SERVER-16622 was reported in December '14 - more than a year ago - so it is about time to change the strategy on this.

I strongly suggest that you make the analysis of a character string constituting the regex less "simple", so that this (SERVER-22890) issue is fixed. It may not be the grand solution that fixes all "duplicates" but it is a step in the right direction.

Please re-open.

Regards,
Brian

Comment by David Storch [ 22/Mar/16 ]

Hi itwebtf@saxobank.com,

We could consider hacking a fix in which we recognize escaped "|" characters. Our engineering team has considered this proposal before. However, our view was that instead we should implement full regular expression parsing. Right now, we do some simple analysis of the character string constituting the regex rather than doing real regex parsing. A more comprehensive fix would also allow us to tighten the bounds in cases where the "|" is escaped using the \Q...\E style, among other useful cases. Therefore, our preference is to continue to treat this as a duplicate of SERVER-16622. Also, take a look at SERVER-20432, which I believe is the same request for handling escaped "|".

Our apologies for the difficulty that this has caused, but I hope you understand our desire to fix this in a manner that we consider correct and maintainable going forward.

Best,
Dave
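
Editorial illustration (not part of the original comment, and not MongoDB's actual code): the difference between recognizing only `\|` and also handling the \Q...\E style can be sketched as a small scanner that decides whether a "|" acts as the alternation operator. The function name `has_alternation` is hypothetical, and the handling of character classes is deliberately simplified; where the scanner is unsure it errs toward reporting an alternation, which corresponds to keeping the loose bounds.

```python
def has_alternation(pattern: str) -> bool:
    """Return True if the pattern contains a '|' that acts as alternation.

    Hypothetical sketch under PCRE-style escaping rules. A '|' is treated
    as a literal when it is backslash-escaped, inside \\Q...\\E, or inside
    a [...] character class.
    """
    i, n = 0, len(pattern)
    in_class = False  # True while scanning inside [...]
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            if pattern.startswith("\\Q", i):
                end = pattern.find("\\E", i + 2)
                if end == -1:
                    # An unterminated \Q quotes the rest of the pattern
                    return False
                i = end + 2
            else:
                # Skip the backslash and the character it escapes
                i += 2
            continue
        if in_class:
            # Simplification: the first ']' closes the class
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "|":
            return True
        i += 1
    return False


if __name__ == "__main__":
    for p in [r"a|b", r"a\|b", r"\Qa|b\E", r"\Qa|b\Ec|d", r"[|]x", r"a\\|b"]:
        print(p, has_alternation(p))
```

Note that Python's own `re` module does not support \Q...\E; the sketch only scans the pattern text and never compiles it.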

Comment by ITWEBTF SAXOBANK [ 01/Mar/16 ]

Hi Dave,

I see your point. However, after reading the "duplicate", it seems that these could be handled as two very different bugs, hence this is not a duplicate.

In my case, the pipe symbol is escaped, as you mentioned, which means that it should be looked up literally. In the "duplicate", the pipe symbol is supposed to be interpreted as a regex special symbol.

I guess that handling the escaped pipe symbols could be a simple fix, thus you can reopen this incident.

Comment by David Storch [ 29/Feb/16 ]

Hi itwebtf@saxobank.com,

As I described in my comment on SERVER-22872, any regular expression with the "|" character will have loose index bounds in 3.0.x or 3.2.x versions of MongoDB. These loose index bounds are currently required for correctness in order to fix SERVER-15235.

SERVER-16622 is an open feature request to allow tighter index bounds in the cases where it is correct to do so. In your case, it should be correct to use the tighter bounds, since your regular expression escapes the "|" character. Therefore, I am closing this ticket as a duplicate of SERVER-16622. Please watch and vote for SERVER-16622.

Best,
Dave
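
Editorial illustration (not part of the original comment): the idea of tight versus loose index bounds can be sketched as follows. For an anchored prefix regex, a string index only needs to be scanned over the range of keys that start with the literal prefix; when an unescaped "|" is present, the prefix no longer applies to every branch, so no such range can be derived. The function names `literal_prefix` and `index_bounds`, and the example patterns, are hypothetical. This is a conservative sketch and not the server's actual analysis.

```python
from typing import Optional, Tuple

# Characters with special meaning in a regex (simplified set)
METACHARS = set(".^$*+?()[]{}|\\")


def literal_prefix(pattern: str) -> Optional[str]:
    """Return the literal prefix of an anchored regex, or None if unsafe."""
    if not pattern.startswith("^"):
        return None
    prefix = []
    i, n = 1, len(pattern)
    # Collect leading literals; accept backslash-escaped punctuation such as \|
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n and not pattern[i + 1].isalnum():
            prefix.append(pattern[i + 1])
            i += 2
        elif ch not in METACHARS:
            prefix.append(ch)
            i += 1
        else:
            break
    # A following '*', '?' or '{' makes the last literal optional
    if i < n and pattern[i] in "*?{" and prefix:
        prefix.pop()
    # Conservative: any unescaped '|' in the remainder disables tight bounds
    while i < n:
        if pattern[i] == "\\":
            i += 2
            continue
        if pattern[i] == "|":
            return None
        i += 1
    return "".join(prefix) or None


def index_bounds(pattern: str) -> Optional[Tuple[str, str]]:
    """Return a half-open [low, high) key range, or None for loose bounds."""
    prefix = literal_prefix(pattern)
    if prefix is None:
        return None
    # Upper bound: the prefix with its last character incremented
    high = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return (prefix, high)


if __name__ == "__main__":
    for p in [r"^abc", r"^a\|b", r"^a|b", r"abc"]:
        print(p, index_bounds(p))
```

In this sketch, `^a\|b` yields the range from "a|b" up to (but excluding) "a|c", while `^a|b` yields no range at all, which mirrors the escaped versus unescaped distinction discussed in this ticket.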
