[SERVER-15235] Regex query returns incorrect results when index is present Created: 12/Sep/14 Updated: 11/Jul/16 Resolved: 12/Nov/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.4.11, 2.6.3, 2.7.6 |
| Fix Version/s: | 2.8.0-rc1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Victor Hooi | Assignee: | David Storch |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||
| Steps To Reproduce: | See regex_different_results_with_index.js |
||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||
| Description |
|
We've noticed a strange issue where a regex query on a field returns different results, depending on whether there is an index defined on that field or not. I've included a JS test file (regex_different_results_with_index.js) which reproduces the issue. The regex is left-anchored. Also, an equavilent regex expression (included in JS) works in both cases:
Looking at the regex expression, the result without the index (both "aa" and "bb" match) should be the correct one. When the index is present, only "aa" is returned, which is incorrect. I have tested this on 2.4.11 and 2.6.3. |
| Comments |
| Comment by Ramon Fernandez Marina [ 03/Mar/16 ] |
|
Thanks for opening a new ticket itwebtf@saxobank.com. The 2.8 release was renamed 3.0, so anything with a 2.8.0-rc0 fixVersion is included in all 3.x versions. |
| Comment by ITWEBTF SAXOBANK [ 03/Mar/16 ] |
|
Hi Ramon, This ticket is market with 2.8.0-rc0, but I know it is in 3.x releases.Could you inform me about what exact 3.x release this bug fix went into? I have created a new ticket SERVER-22939. Regards, |
| Comment by Ramon Fernandez Marina [ 03/Mar/16 ] |
|
itwebtf@saxobank.com, this fix was published in a release, so in order to avoid confusion we need a new ticket if you think this behavior needs to change. If you feel none of the ones you've opened already are good candidates please feel free to open a new one. Thanks, |
| Comment by ITWEBTF SAXOBANK [ 02/Mar/16 ] |
|
This must be reopened. The problem with this fix is that, even when I have a regular expression that has a left-anchor and an escaped '|' character, that query is recognized as being "non-simple". This is wrong, and it causes unnecessarily that backwards compatibility is broken and performance is significantly degraded. See SERVER-22872 and SERVER-22890. |
| Comment by Githook User [ 12/Nov/14 ] |
|
Author: {u'username': u'dstorch', u'name': u'David Storch', u'email': u'david.storch@10gen.com'}Message: |
| Comment by J Rassi [ 03/Nov/14 ] |
|
No. This issues exposes a bug in the code that parses regular expression query predicates to generate index bounds, for regular expressions that have multiple left-anchor subexpressions joined by an OR. In the example expression /^a(a|$)|^b/, the index bounds ["a", "b") are generated, which causes the query to miss all documents that start with 'b'. The problem is that the parser looks at the the first two characters ('^' and then 'a'), and then incorrectly stops at the third character (the left paren) and reports that the regular expression is a left-anchored expression with prefix "a". |
| Comment by Tom Robinson [ 03/Nov/14 ] |
|
Could this issue be related: http://stackoverflow.com/questions/26631352/can-i-do-a-mongodb-starts-with-query-on-an-indexed-subdocument-field/26635039?noredirect=1#comment41905713_26635039 |