[DOCS-9550] Docs for SERVER-8423: Text search case folding needs utf-8 support Created: 05/Dec/16  Updated: 23/Feb/18  Resolved: 23/Feb/18

Status: Closed
Project: Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Emily Hall Assignee: Steve Renaker (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
documents SERVER-8423 Text search case folding needs utf-8 ... Closed
Participants:
Days since reply: 5 years, 50 weeks, 6 days ago
Epic Link: DOCSP-1769

 Description   

Engineering Ticket Description:

For example, in Russian queries, "Как" currently lowercases to itself, whereas it should lowercase to "как".

UTF-8-aware case folding is needed for stop word removal, matching, etc.

> db.foo.insert({content:"Как дела?"})
> db.foo.ensureIndex({content:"text"},{default_language:"russian"})
> db.foo.runCommand("text",{search:"\"как дела\""})
{
	"queryDebugString" : "дел||||как дела||",
	"language" : "russian",
	"results" : [ ],
	"stats" : {
		"nscanned" : 0,
		"nscannedObjects" : 0,
		"n" : 0,
		"nfound" : 0,
		"timeMicros" : 104
	},
	"ok" : 1
}
> db.foo.runCommand("text",{search:"\"Как дела\""})
{
	"queryDebugString" : "Как|дел||||Как дела||",
	"language" : "russian",
	"results" : [
		{
			"score" : 1,
			"obj" : {
				"_id" : ObjectId("510aa82ddb47733460b47eff"),
				"content" : "Как дела?"
			}
		}
	],
	"stats" : {
		"nscanned" : 1,
		"nscannedObjects" : 0,
		"n" : 1,
		"nfound" : 1,
		"timeMicros" : 118
	},
	"ok" : 1
}
> 
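The transcript above is consistent with case folding that only handles ASCII letters. The following is a minimal sketch of that difference in plain JavaScript, not the server's actual implementation: `asciiOnlyLower` and `unicodeLower` are hypothetical helper names introduced here, and modelling the reported behaviour as ASCII-only folding is an assumption.

```javascript
// Hypothetical helper: lowercases only ASCII A-Z and leaves every other
// character untouched. This is an assumed model of the reported behaviour,
// not code taken from the server.
function asciiOnlyLower(s) {
  return s.replace(/[A-Z]/g, (c) => String.fromCharCode(c.charCodeAt(0) + 32));
}

// Unicode-aware lowercasing using the standard String.prototype.toLowerCase.
function unicodeLower(s) {
  return s.toLowerCase();
}

const indexedTerm = "Как"; // term as stored in the document
const queryTerm = "как";   // term as typed in the lowercase query

// ASCII-only folding leaves the Cyrillic capital as-is, so the terms differ.
console.log(asciiOnlyLower(indexedTerm) === queryTerm); // false

// Unicode-aware folding maps "Как" to "как", so the terms match.
console.log(unicodeLower(indexedTerm) === queryTerm); // true
```

Under this model, the lowercase query term never equals the stored term, which would explain why the first query returns no results while the second, with the capitalized "Как", does.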



 Comments   
Comment by Kay Kim (Inactive) [ 23/Feb/18 ]

The work was done before the ticket maker script was in place.

Generated at Thu Feb 08 07:58:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.