  1. Core Server
  2. SERVER-29918

stemming behavior for diacritics causes incorrect results


Details

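    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 3.4.4
    • Fix Version/s: None
    • Component/s: Text Search
    • Labels: None
    • Environment: ubuntu 16.04, mongodb 3.4.4
    • Operating System: ALL
    • Steps To Reproduce: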

      > db.test.insertMany([  
         { "_id":1, "name":"iphone" },
         { "_id":2, "name":"iphône" },
         { "_id":3, "name":"iphonë" },
         { "_id":4, "name":"iphônë" }
      ])
       
       
      > db.test.ensureIndex({name: "text"})
       
      > db.test.find({$text: {$search: "iphone"}})
      { "_id" : 1, "name" : "iphone" }
      { "_id" : 2, "name" : "iphône" }
       
      > db.test.find({name: "iphone"}).collation({locale: "en", strength: 1})
      { "_id" : 1, "name" : "iphone" }
      { "_id" : 2, "name" : "iphône" }
      { "_id" : 3, "name" : "iphonë" }
      { "_id" : 4, "name" : "iphônë" }
      
      

    • Sprint: Query 2017-07-31, Query 2017-10-02, Query 2017-10-23, Query 2017-11-13

    Description

      $text search is not diacritic-insensitive when the word contains a dieresis ( ¨ ). The dieresis is classified as a diacritic in the Unicode 8.0 Character Database PropList; see http://www.unicode.org/Public/8.0.0/ucd/PropList.txt
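
The expectation behind this report can be sketched in plain JavaScript (run outside the mongo shell). The snippet below is only an illustration of Unicode-level diacritic folding: `foldDiacritics` is a hypothetical helper, and nothing here reflects how MongoDB's text index tokenizes or stems terms. It decomposes each name and strips every code point that carries the Unicode `Diacritic` property, which covers both the circumflex and the dieresis.

```javascript
// Hypothetical helper for illustration; NOT MongoDB's text search implementation.
// NFD splits "ë" into "e" + U+0308 (combining dieresis), and the regex then
// removes every code point with the Unicode Diacritic property.
function foldDiacritics(s) {
  return s.normalize("NFD").replace(/\p{Diacritic}/gu, "");
}

// The four names from the steps to reproduce.
const names = ["iphone", "iphône", "iphonë", "iphônë"];
const folded = names.map(foldDiacritics);

console.log(folded);
```

Under this folding all four names reduce to the same string, which is why the reporter expected all four documents to match the `$text` query.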

      A search using a collation with strength = 1 returns all four documents, as expected.
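
As a rough analogy for the collation query above, the comparison can be approximated in plain JavaScript with `Intl.Collator`. This is a sketch under an assumption: `sensitivity: "base"` is used here as a stand-in for collation strength 1 (primary strength, base letters only), and the code does not call MongoDB at all.

```javascript
// Sketch only: Intl.Collator stands in for the collation MongoDB applies for
// {locale: "en", strength: 1}; this is not MongoDB code.
const names = ["iphone", "iphône", "iphonë", "iphônë"];

// sensitivity "base" compares base letters only, ignoring accents and case.
const primary = new Intl.Collator("en", { sensitivity: "base" });

// Keep every name that compares equal to "iphone" at primary strength.
const matches = names.filter((name) => primary.compare(name, "iphone") === 0);

console.log(matches);
```

With primary-strength comparison, the position of the diacritic in the word makes no difference, which is consistent with the collation query returning documents 1 through 4.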
      

            People

              Assignee: Kyle Suarez (kyle.suarez@mongodb.com)
              Reporter: adrien petel (felix2626)
              Votes: 1
              Watchers: 11
