Core Server / SERVER-8423

Text search case folding needs utf-8 support

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.1.7
    • Affects Version/s: 2.3.2
    • Component/s: Text Search
    • Labels: None
    • Backwards Compatibility: Major Change
    • Sprint: Platform 6 07/17/15, Platform 8 08/28/15, Platform 7 08/10/15

      For example, in Russian queries "Как" currently lowercases to itself, whereas it should lowercase to "как".

      Correct case folding is needed for stopword removal, term matching, etc.

      > db.foo.insert({content:"Как дела?"})
      > db.foo.ensureIndex({content:"text"},{default_language:"russian"})
      > db.foo.runCommand("text",{search:"\"как дела\""})
      {
      	"queryDebugString" : "дел||||как дела||",
      	"language" : "russian",
      	"results" : [ ],
      	"stats" : {
      		"nscanned" : 0,
      		"nscannedObjects" : 0,
      		"n" : 0,
      		"nfound" : 0,
      		"timeMicros" : 104
      	},
      	"ok" : 1
      }
      > db.foo.runCommand("text",{search:"\"Как дела\""})
      {
      	"queryDebugString" : "Как|дел||||Как дела||",
      	"language" : "russian",
      	"results" : [
      		{
      			"score" : 1,
      			"obj" : {
      				"_id" : ObjectId("510aa82ddb47733460b47eff"),
      				"content" : "Как дела?"
      			}
      		}
      	],
      	"stats" : {
      		"nscanned" : 1,
      		"nscannedObjects" : 0,
      		"n" : 1,
      		"nfound" : 1,
      		"timeMicros" : 118
      	},
      	"ok" : 1
      }
      > 
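
The effect described in this ticket can be mimicked outside the server with a minimal JavaScript sketch. This is not the server's implementation. The helper names (`asciiToLower`, `unicodeToLower`, `removeStopwords`) and the one-word stopword list are assumptions made up for illustration. It only shows why ASCII-only lowercasing lets "Как" slip past a stopword filter that contains "как".

```javascript
// Hypothetical helper: lowercases only ASCII A-Z and leaves every other
// character (including Cyrillic) untouched.
function asciiToLower(s) {
  return s.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 32));
}

// Unicode-aware lowercasing via the built-in String.prototype.toLowerCase.
function unicodeToLower(s) {
  return s.toLowerCase();
}

// Tiny illustrative stopword list; NOT the server's actual Russian list.
const stopwords = new Set(["как"]);

// Split on whitespace, fold each token, then drop stopwords.
function removeStopwords(text, fold) {
  return text
    .split(/\s+/)
    .map(fold)
    .filter((w) => w.length > 0 && !stopwords.has(w));
}

// ASCII-only folding keeps "Как", so the stopword filter misses it.
console.log(removeStopwords("Как дела", asciiToLower));
// Unicode-aware folding turns "Как" into "как", which is then removed.
console.log(removeStopwords("Как дела", unicodeToLower));
```

With ASCII-only folding both tokens survive. With Unicode-aware folding only "дела" remains, which is the behavior this ticket asks for.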
      

            Votes: 18
            Watchers: 24
