Details

    • Type: New Feature
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Planning Bucket A
    • Component/s: Indexing
    • Labels: None
    • # Replies: 22
    • Last comment by Customer: true
    • Documentation changes needed?: Yes
    • Driver changes needed?: No driver changes needed

      Description

      potential syntax:

      db.foo.ensureIndex( { name : 1 } , { caseInsensitive : true } )
      db.foo.ensureIndex( { name : 1 } , { caseInsensitive : true , locale : "FR" } )
      db.foo.ensureIndex( { name : 1 } , { caseInsensitive : true , localeKey : "user.country" } )
      db.foo.ensureIndex( { name : 1 } , { caseInsensitive : [ "name" ] } )

      reminder, you can always do this for now:
      { name : { real : "Eliot" , sort : "eliot" } }
      ensureIndex( { "name.sort" : 1 } )
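
The workaround above can be sketched in plain JavaScript, with no database involved. The helper names `withSortKey` and `findByName` are hypothetical, and the in-memory array only stands in for a collection indexed on `name.sort`. This is a minimal sketch, assuming a plain `toLowerCase()` key is acceptable for the data.

```javascript
// Hypothetical helper: wrap a display name with a lowercased copy
// that a regular index on "name.sort" could cover.
function withSortKey(real) {
  return { real: real, sort: real.toLowerCase() };
}

// Stand-in for a collection; a real deployment would store these documents.
const docs = ["Eliot", "ELIOT", "Dwight"].map(function (n) {
  return { name: withSortKey(n) };
});

// Hypothetical lookup: lowercase the query the same way before comparing.
function findByName(collection, query) {
  const key = query.toLowerCase();
  return collection.filter(function (d) {
    return d.name.sort === key;
  });
}

const matches = findByName(docs, "eLiOt").map(function (d) {
  return d.name.real;
});
console.log(matches);
```

The original spelling stays available in `name.real`, while every comparison goes through `name.sort`.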

          Activity

          Arkadiy Kukarkin
          added a comment -

          The lowercase index suggestion (as well as the current lowercase field workaround) doesn't really work outside ASCII. Given that

          Reichwaldstraße → reichwaldstrasse
          REICHWALDSTRASSE → reichwaldstrasse
          όσος → ΌΣΟΣ
          ΌΣΟΣ → όσοσ

          should REICHWALDSTRASSE match documents with Reichwaldstraße or reichwaldstrasse? ΌΣΟΣ to όσος or όσοσ? etc. You really need proper Unicode case folding for these.

          That being said, this is a very, very needed feature, maybe even in a degraded ASCII-only version.
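
The mismatch described here can be reproduced with JavaScript's built-in case mappings. The sketch below is only an illustration: `naiveKey` and `roughFold` are hypothetical names, and uppercasing before lowercasing is a rough approximation of Unicode case folding, not a full implementation of it.

```javascript
// Naive key: what a plain lowercase field or index would store.
function naiveKey(s) {
  return s.toLowerCase();
}

// Rough approximation of case folding (an assumption, not full Unicode
// folding): uppercase first so that "ß" expands to "SS", then lowercase.
function roughFold(s) {
  return s.toUpperCase().toLowerCase();
}

const mixed = "Reichwaldstra\u00DFe"; // Reichwaldstraße
const upper = "REICHWALDSTRASSE";

// Greek with a word-final sigma, written with escapes to pin the code points.
const greekLower = "\u03CC\u03C3\u03BF\u03C2"; // όσος
const greekUpper = "\u038C\u03A3\u039F\u03A3"; // ΌΣΟΣ

console.log(naiveKey(mixed) === naiveKey(upper));   // the naive keys differ
console.log(roughFold(mixed) === roughFold(upper)); // the folded keys agree
console.log(roughFold(greekLower) === roughFold(greekUpper));
```

Both sides of a comparison have to pass through the same folding step, whichever folding is chosen; otherwise the stored key and the query key can still disagree.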

          Phil Idem
          added a comment - edited

          If this feature and Full-Text Searching (SERVER-380) were available, then I could see how MongoDB could replace other search technologies such as Solr, Lucene, and Sphinx.

          @Clifford, good suggestion. However, I think the query engine might need a special syntax when using find so that the caller can clearly state that they are doing a case-insensitive search. I doubt you want to prevent case-sensitive searches just because you have a case-insensitive index.

          Postgres has a similar feature:
          http://www.postgresql.org/docs/9.2/static/indexes-expressional.html

          Similar to Postgres, I think it should be possible to use the expression without the index (the index should be used to improve performance).

          Maybe something like this could be used:

          // register a server-side function named "lower" that can be used to convert a field to lower case
          db.system.js.save({
              _id : "lower",
              value : function(fieldName) {
                  return this[fieldName].toLowerCase();
              }});
          
          // index the lower case value of firstName
          db.users.ensureIndex( { "lower('firstName')": 1 } )
          
          // find all users whose lower case first name starts with "john" (would match "Johnny", "John", "john", etc.)
          db.users.find({
              "lower('firstName')" : {
                  $regex : /^john/
              }
          });
          
          Joel Sanderson
          added a comment -

          @Clifford - I also like your concept of using a function to compute the index value. I'd love to see this functionality in MongoDB!

          Tim Hawkins
          added a comment -

          @arkady
          Unicode folding is required anyway for proper multilingual text search.

          Ovidiu Anicai
          added a comment - edited

          You can duplicate the field in all lowercase (maybe also replacing accented or special characters), then search with a lowercase string.
          If you do this, you'll have to keep the two fields consistent on edits.
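
One way to build such a duplicated field is sketched below. The names `searchKey`, `setName`, and the `name_search` field are hypothetical. The approach (decompose with NFD, drop combining marks, lowercase) handles accents on Latin letters but is not a complete solution for every script.

```javascript
// Hypothetical helper: derive a lowercase, accent-free search key.
function searchKey(s) {
  return s
    .normalize("NFD")                 // split letters from combining accents
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase();
}

// Keep the duplicate in sync whenever the original field is written.
function setName(doc, name) {
  doc.name = name;
  doc.name_search = searchKey(name); // assumed field name
  return doc;
}

const doc = setName({}, "Cr\u00E8me Br\u00FBl\u00E9e"); // Crème Brûlée
console.log(doc.name_search);
console.log(doc.name_search === searchKey("CREME BRULEE"));
```

Routing every write through one function like `setName` is what keeps the original and the duplicate consistent on edits.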


            People

            • Votes:
              246
              Watchers:
              164

              Dates

              • Created:
                Updated:
                Days since reply:
                19 weeks, 1 day ago
                Date of 1st Reply: