[SERVER-14196] Left-anchored regular expressions all of whose characters are non-special don't need to run the regex engine Created: 06/Jun/14  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Trivial - P5
Reporter: Richard Kreuter (Inactive) Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Participants:

 Description   

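Left-anchored regular expressions all of whose characters are non-special don't need to run the regex engine at all; they can just do the appropriate range query from the prefix up to (but not including) the prefix plus 1, so to speak. For example, /^abc/ corresponds to the range from "abc" up to, but excluding, "abd".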
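
As a rough illustration of "the prefix plus 1", the sketch below computes a half-open range for a literal prefix by incrementing its last character. The name prefixRange is hypothetical and is not server code. It also assumes JavaScript's UTF-16 code-unit string ordering, whereas the server compares strings differently, and it simply gives up rather than carrying when the last character cannot be incremented.

```javascript
// Hypothetical helper (not part of the server): compute the half-open
// range [lower, upper) that contains every string starting with `prefix`.
function prefixRange(prefix) {
  // An empty prefix matches every string, so there is no useful range.
  if (prefix.length === 0) return null;

  const last = prefix.charCodeAt(prefix.length - 1);
  // Assumption of this sketch: no carry handling when the last code unit
  // is already the maximum value.
  if (last === 0xffff) return null;

  // "Prefix plus 1": same string with its last code unit incremented.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper: upper };
}

// Range membership test: lower <= s < upper.
function inRange(range, s) {
  return s >= range.lower && s < range.upper;
}
```

With this sketch, prefixRange("abc") yields lower "abc" and upper "abd", so "abcz" falls inside the range while "abd" does not.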

IndexBoundsBuilder already includes code to find the longest non-special prefix of an anchored regular expression. It ought to be straightforward to check, somewhere downstream in query, whether the regex string is exactly one character longer than the index bounds string (the extra character being the caret); if so, the prefix range alone answers the query.
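
The length check could look something like the sketch below. It is not IndexBoundsBuilder's actual logic: literalPrefix, isPureLiteralPrefix, and the SPECIAL character set are all assumptions made for illustration, and the sketch is deliberately conservative (any flag, alternation, or backslash escape disqualifies the regex).

```javascript
// Characters treated as "special" in this sketch. This list is an
// assumption for illustration, not the server's own definition.
const SPECIAL = new Set([..."\\^$.|?*+()[]{}"]);

// Hypothetical helper: return the longest literal prefix of a
// left-anchored regex, or null if the regex is not a candidate.
function literalPrefix(re) {
  const src = re.source;
  // Flags (e.g. case-insensitive) and alternation are rejected outright.
  if (re.flags !== "" || !src.startsWith("^") || src.includes("|")) {
    return null;
  }
  let end = 1; // skip the caret
  while (end < src.length && !SPECIAL.has(src[end])) end++;
  // A following *, ? or { quantifier applies to the last literal
  // character, so that character cannot be part of the prefix.
  if (end < src.length && "*?{".includes(src[end]) && end > 1) end--;
  return src.slice(1, end);
}

// The check proposed in the ticket: the regex is nothing but a caret
// followed by the literal prefix, so its length is prefix length + 1.
function isPureLiteralPrefix(re) {
  const prefix = literalPrefix(re);
  return prefix !== null && prefix.length > 0 &&
    re.source.length === prefix.length + 1;
}
```

Under these assumptions, isPureLiteralPrefix(/^abc/) is true, while /^abc[qz]/ still has the prefix "abc" but fails the length check and would keep using the regex engine.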

No idea if this will make any use case appreciably faster, but it's an obvious "missing" optimization that should be easy to implement and maintain.

Presumably one way to test for the existence of (and so regressions in) this improvement would be to compare the explain() plans for these two queries:

db.foo.find({s:/^abc/});
db.foo.find({s:/^abc[qz]/});


Generated at Thu Feb 08 03:34:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.