[SERVER-14879] Text search alias for Norwegian should be changed from "nb" to "no" Created: 13/Aug/14  Updated: 28/Dec/23

Status: Backlog
Project: Core Server
Component/s: Text Search
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: J Rassi Assignee: Backlog - Query Integration
Resolution: Unresolved Votes: 0
Labels: qi-text-search, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-3880 Norwegian language code is "nb" rathe... Closed
Assigned Teams:
Query Integration
Operating System: ALL
Participants:

Description:

Quoting from comment in DOCS-3880:

I see that the description of the SERVER ticket to implement two-letter language codes (SERVER-9932) did specify "no" as the language alias to be used for Norwegian. However, the implementation registered the language as "nb" (and using "nb" in the server does correctly invoke the Norwegian stemmer and stopword list). paul@10gen.com may have raised the suggestion to use "nb" instead (based on Bokmål's current dominance) but I can't honestly recall off the top of my head. Looking into it now, I do think that "no" would have been more correct: Porter said on gmane.comp.search.snowball back in 2001 that "the simple Norwegian stemmer I've presented works equally on bokmal and nynorsk", and also the Norwegian stopword list packaged with the server has Bokmål and Nynorsk words (compare the annotated list with the list from 2.6.4).

Note that this likely requires bumping the text index version number.
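
One plausible reading of the version-bump note, sketched below in Python: if the set of accepted aliases is tied to the text index version, existing indexes can keep resolving "nb" while newly built indexes accept "no". This is an illustration only. The table, the version labels "current" and "bumped", and the function name are hypothetical and are not taken from the server source.

```python
# Hypothetical sketch: none of these names come from the MongoDB source tree.
# Alias tables are keyed by an illustrative text index version label.
ALIASES_BY_INDEX_VERSION = {
    # Behaviour described in the ticket today: Norwegian is registered as "nb".
    "current": {"norwegian": "norwegian", "nb": "norwegian"},
    # Behaviour proposed in the ticket: Norwegian is registered as "no".
    "bumped": {"norwegian": "norwegian", "no": "norwegian"},
}


def resolve_language(code: str, index_version: str) -> str:
    """Return the stemmer language for `code`, or raise if it is not registered."""
    aliases = ALIASES_BY_INDEX_VERSION[index_version]
    try:
        return aliases[code.lower()]
    except KeyError:
        raise ValueError(
            f"unsupported language {code!r} for index version {index_version!r}"
        ) from None


# Indexes on the old version keep working with "nb";
# only indexes on the bumped version understand "no".
print(resolve_language("nb", "current"))
print(resolve_language("no", "bumped"))
```

Keying the table by version means a language string stored in an existing index or document is never reinterpreted; whether the server would also keep "nb" as an alias in the newer version is left open here.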

