Type: New Feature
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Text Search
Labels:
None

Assigned Teams:

Query Optimization
Case:
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

Using a regex to emulate a String.contains search is slow, because an unanchored pattern cannot use an index efficiently. Text search matches only on word boundaries, so it returns no results for partial string matches.

If MongoDB were to add an n-gram index (see http://lucene.apache.org/solr/guide/7_1/tokenizers.html), searches using String.contains could be as fast as a "prefix expression", a.k.a. an anchored regex such as String.startsWith(/^/). Of course, people would have to be careful about index size. One option would be to specify a maximum length for the indexed field: if a document insert or update exceeded that length, the write operation would fail and state the reason ("string too long for ngram index with max size n").

Additionally, one would need to specify, when creating the index, whether the field should be automatically converted to lowercase or uppercase.
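
To make the proposal concrete, here is a minimal in-memory sketch of the idea in Python. It is not MongoDB's API and not a proposed implementation: the class name NgramIndex, its parameters (n, max_length, lowercase), and the use of trigrams are all assumptions chosen for illustration. It shows the three pieces described above: n-gram postings used to answer a contains query, a maximum field length that rejects the write with the suggested error message, and case normalization chosen at index creation.

```python
from collections import defaultdict


class NgramIndex:
    """Toy in-memory n-gram index (hypothetical; not a MongoDB API).

    Does not handle updates or deletes of already indexed documents.
    """

    def __init__(self, n=3, max_length=64, lowercase=True):
        self.n = n                        # n-gram size (assumed: trigrams)
        self.max_length = max_length      # longest field value accepted
        self.lowercase = lowercase        # fold case at index and query time
        self.postings = defaultdict(set)  # n-gram -> ids of docs containing it
        self.values = {}                  # doc id -> normalized field value

    def _normalize(self, text):
        return text.lower() if self.lowercase else text

    def _ngrams(self, text):
        # All contiguous substrings of length n; empty if text is shorter than n.
        return {text[i:i + self.n] for i in range(len(text) - self.n + 1)}

    def insert(self, doc_id, value):
        # Reject the write instead of silently growing the index.
        if len(value) > self.max_length:
            raise ValueError(
                f"string too long for ngram index with max size {self.max_length}"
            )
        text = self._normalize(value)
        self.values[doc_id] = text
        for gram in self._ngrams(text):
            self.postings[gram].add(doc_id)

    def contains(self, needle):
        needle = self._normalize(needle)
        grams = self._ngrams(needle)
        if not grams:
            # Needle shorter than n: nothing to look up, so scan every value.
            candidates = set(self.values)
        else:
            # Only documents that hold every n-gram of the needle can match.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Verify candidates: sharing all n-grams does not prove a contiguous match.
        return sorted(d for d in candidates if needle in self.values[d])


idx = NgramIndex(n=3, max_length=20)
idx.insert(1, "MongoDB Server")
idx.insert(2, "Apache Solr")
idx.insert(3, "mongoose")
print(idx.contains("ongo"))  # [1, 3]
print(idx.contains("SOLR"))  # [2]
```

The lookup touches only the posting lists for the needle's n-grams, so the cost depends on how many documents share those n-grams rather than on the collection size. The trade-off is the one noted above: every indexed string contributes up to (length - n + 1) index entries, which is why a length cap matters.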

Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Ronald Feicht
Participants: [DO NOT USE] Backlog - Query Optimization, Ronald Feicht
Votes: 8
Watchers: 13

Created: Dec 04 2017 01:58:49 PM UTC
Updated: Dec 06 2022 03:45:17 AM UTC
