[SERVER-10057] English stop words prevent searching for Italian country code Created: 28/Jun/13  Updated: 16/Oct/21  Resolved: 28/Jun/13

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: 2.4.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Chris Griego Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-10062 Add user configurable stop word lists... Backlog
Operating System: ALL
Participants:

 Description   

The English stop word list includes "it", which prevents anyone from searching for documents by the Italian country code, "IT". The original source for the current stop word list appears to include "us" as a stop word, with a warning that care should be taken because it is also a country code, and "us" is omitted from the MongoDB stop word list. By the same logic, I would expect "it" to be omitted from the MongoDB list as well.
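
To illustrate the behaviour described above, here is a minimal sketch of stop word filtering. It is not MongoDB's implementation: `SAMPLE_ENGLISH_STOP_WORDS` and `search_terms` are hypothetical names, and the list is a tiny stand-in in which only "it" is taken from this report.

```python
# Illustrative stand-in for an English stop word list.
# "it" comes from this report; the other entries are placeholders.
SAMPLE_ENGLISH_STOP_WORDS = {"it", "the", "a", "of"}


def search_terms(query):
    """Lowercase and split the query, then drop any stop words."""
    tokens = query.lower().split()
    return [t for t in tokens if t not in SAMPLE_ENGLISH_STOP_WORDS]


# The country code is lowercased to "it" and filtered out,
# so nothing is left to search for.
print(search_terms("IT"))       # []
print(search_terms("Rome IT"))  # ['rome']
```

Because the query is case-folded before filtering, the upper-case code "IT" cannot be distinguished from the pronoun "it".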



 Comments   
Comment by Eliot Horowitz (Inactive) [ 01/Jul/13 ]

Or you can use "none" for the language for now to avoid any stop words.
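
A rough sketch of that workaround, assuming a Python driver such as PyMongo: the dictionaries below mirror the `default_language` text index option and the `$language` field of the `$text` query operator. The field name `country` and both helper functions are hypothetical. `$text` belongs to server versions later than the 2.4.4 reported here, which exposed text search through the `text` command instead.

```python
def text_index_spec(field):
    """Return (keys, options) for a text index that uses no stop word list."""
    keys = [(field, "text")]
    options = {"default_language": "none"}
    return keys, options


def text_query(search, language="none"):
    """Build a $text filter that overrides the language for this query."""
    return {"$text": {"$search": search, "$language": language}}


keys, options = text_index_spec("country")
query = text_query("IT")

# With PyMongo these would typically be passed as:
#   collection.create_index(keys, **options)
#   collection.find(query)
print(keys, options)
print(query)
```

Note that "none" also turns off stemming, so this trades language-aware matching for the ability to search on short codes.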

Comment by Alex Reeves [ 01/Jul/13 ]

and - Andorra Spain

Comment by Daniel Pasette (Inactive) [ 28/Jun/13 ]

duplicate of SERVER-10062

Comment by Daniel Pasette (Inactive) [ 28/Jun/13 ]

Hi Alex, whenever stop words are used, there is a tradeoff between index size and lookup speed on the one hand and the potential to exclude information from search results on the other. I think the eventual solution to your issue will be the ability to customize the stop word list itself rather than trying to find the perfect universal combination.

Comment by Alex Reeves [ 28/Jun/13 ]

It looks like the following would also be impacted by the stop word list.

as - American Samoa
be - Belgium
in - Indiana
it - Italy
me - Maine
nor - Norway
on - Ontario
or - Oregon

Generated at Thu Feb 08 03:22:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.