
    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Backlog
    • Component/s: Text Search
    • Labels:
      None
    • Case:

      Description

      Using a regex for a String.contains-style search is slow. Text search matches only whole words (it tokenizes on word boundaries), so it returns no results for partial string matches.

      If MongoDB were to add an n-gram index (see the n-gram tokenizers at http://lucene.apache.org/solr/guide/7_1/tokenizers.html), then String.contains searches could be as fast as a "prefix expression", a.k.a. a regex equivalent to String.startsWith (/^/). Of course, people would have to be careful about index size, but one could specify a maximum length for the indexed field. If that length were exceeded when inserting or updating a document, the write operation would fail and state the reason ("string too long for ngram index with max size n").
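
      To make the proposal concrete, here is a minimal in-memory sketch of the idea in Python. It is not MongoDB's API: the class NgramIndex, the error type NgramIndexError, and the parameters n and max_length are all hypothetical names chosen for illustration. It assumes fixed-size n-grams (trigrams by default) and rejects over-long values with the error message suggested above.

```python
from collections import defaultdict


class NgramIndexError(ValueError):
    """Hypothetical error raised when a value exceeds the index's max length."""


class NgramIndex:
    """Toy in-memory n-gram index (illustration only, not a MongoDB API)."""

    def __init__(self, n=3, max_length=64):
        self.n = n                        # size of each n-gram
        self.max_length = max_length      # longest string the index accepts
        self.postings = defaultdict(set)  # n-gram -> ids of docs containing it
        self.docs = {}                    # doc id -> original string

    def _ngrams(self, text):
        # All contiguous substrings of length n (empty set if text is shorter).
        return {text[i:i + self.n] for i in range(len(text) - self.n + 1)}

    def insert(self, doc_id, text):
        # Fail the write instead of silently building a huge index entry.
        if len(text) > self.max_length:
            raise NgramIndexError(
                f"string too long for ngram index with max size {self.max_length}"
            )
        self.docs[doc_id] = text
        for gram in self._ngrams(text):
            self.postings[gram].add(doc_id)

    def contains(self, needle):
        if len(needle) < self.n:
            # Too short to produce an n-gram: fall back to scanning every doc.
            return {d for d, t in self.docs.items() if needle in t}
        # Candidates must contain every n-gram of the needle.
        candidates = set.intersection(
            *(self.postings.get(g, set()) for g in self._ngrams(needle))
        )
        # Verify, because the n-grams may occur in non-adjacent positions.
        return {d for d in candidates if needle in self.docs[d]}
```

      The lookup intersects the posting sets of the needle's n-grams and then verifies each candidate, so only a small subset of documents is scanned. Updates and deletes are left out of this sketch; a real index would also have to remove stale n-grams when a value changes.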

      Additionally, when creating the index, one would need to specify whether the field should be automatically converted to lowercase or uppercase.
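
      The case option could work roughly as follows. This is only a sketch: the function names normalize and ngrams and the case parameter are hypothetical, not part of any MongoDB API. The point is that the same folding has to be applied to the stored value at index time and to the search term at query time.

```python
def normalize(text, case="lower"):
    """Fold text to a single case; `case` is a hypothetical index option."""
    if case == "lower":
        return text.lower()
    if case == "upper":
        return text.upper()
    if case is None:
        return text  # no folding: the index stays case-sensitive
    raise ValueError(f"unsupported case option: {case!r}")


def ngrams(text, n=3, case="lower"):
    """Return the set of n-grams of text after applying the case option."""
    folded = normalize(text, case)
    return {folded[i:i + n] for i in range(len(folded) - n + 1)}


# With folding, the query's n-grams are found among the stored value's n-grams.
stored = ngrams("MongoDB")
query = ngrams("GoD")
print(query <= stored)

# Without folding, the same query finds nothing to match.
print(ngrams("GoD", case=None) <= ngrams("MongoDB", case=None))
```

      Fixing the case at index creation keeps every entry consistent; changing the option later would presumably require rebuilding the index.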

            People

            • Votes: 5
            • Watchers: 9
