[SERVER-13199] regex uses too strict index bounds when | is after a metacharacter Created: 14/Mar/14  Updated: 29/Jan/15  Resolved: 29/Jan/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.4.9, 2.6.0-rc1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Glasser Assignee: David Storch
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-15235 Regex query returns incorrect results... Closed
Operating System: ALL
Participants:

 Description   

This occurs in 2.4.9 and 2.6.0-rc1.

> db.a.insert({a: "foo"})
> db.a.find({a: /^abc+|foo/})
{ "_id" : ObjectId("53228d743704a19f9cb03ea6"), "a" : "foo" }
> db.a.ensureIndex({a: 1})
> db.a.find({a: /^abc+|foo/})

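Once the index exists, the same query returns no documents. The problem is that simpleRegex returns success as soon as it hits most metacharacters, and doesn't continue on to find the |, so the index bounds cover only the prefix of the first alternative.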

I suggest that you check the entire regex first for | and immediately bail out with an empty string if one is found. (This is not 100% optimal because the | could have been escaped, but it's certainly correct.)
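
To illustrate the suggestion, here is a rough JavaScript sketch. It is not the server's C++ code: naivePrefix and safePrefix are hypothetical names, and the metacharacter handling is a simplified model of what simpleRegex is described as doing above. The naive version stops at the first metacharacter and keeps the prefix it has seen so far. The safe version first checks the whole pattern for | and gives up on a prefix if it finds one.

```javascript
// Simplified, hypothetical model of prefix extraction for index bounds.
// This is an illustration only, not the server implementation.
const METACHARS = new Set([
  ".", "*", "+", "?", "(", ")", "[", "]", "{", "}", "|", "\\", "^", "$",
]);

// Naive version: stop at the first metacharacter and report the literal
// prefix collected so far. It only notices '|' if that is the first
// metacharacter it meets.
function naivePrefix(pattern) {
  if (!pattern.startsWith("^")) return ""; // unanchored: no usable prefix
  let prefix = "";
  for (const ch of pattern.slice(1)) {
    if (METACHARS.has(ch)) {
      if (ch === "|") return ""; // alternation seen directly: no prefix
      // '*', '?' and '{' can make the preceding character optional.
      if (ch === "*" || ch === "?" || ch === "{") prefix = prefix.slice(0, -1);
      break;
    }
    prefix += ch;
  }
  return prefix;
}

// Suggested conservative version: any '|' anywhere (even an escaped one)
// means no prefix is reported, so the index bounds are never too strict.
function safePrefix(pattern) {
  if (pattern.includes("|")) return "";
  return naivePrefix(pattern);
}

console.log(JSON.stringify(naivePrefix("^abc+|foo"))); // prefix that excludes "foo"
console.log(JSON.stringify(safePrefix("^abc+|foo")));  // empty prefix
```

For the pattern in this report, the naive version yields the prefix "abc", which cannot match the document "foo", while the safe version yields an empty prefix. The cost is that a pattern with an escaped | (for example ^ab\|c) also gets an empty prefix, which is correct but less selective.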



 Comments   
Comment by David Storch [ 29/Jan/15 ]

This was fixed in commit 866d3851fcb in SERVER-15235. The fix will be available in version 3.0.0. Closing as a duplicate.

Comment by David Glasser [ 14/Mar/14 ]

I'm not sure why this got tagged as JavaScript. It's a C++ bug, in simpleRegex (in index_bounds_builder.cpp in 2.6).

Generated at Thu Feb 08 03:30:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.