[SERVER-10057] English stop words prevent searching for Italian country code Created: 28/Jun/13 Updated: 16/Oct/21 Resolved: 28/Jun/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Text Search |
| Affects Version/s: | 2.4.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Chris Griego | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
The English stop word list includes "it", which prevents someone from searching documents using the Italian country code, "IT". It looks like the original source for the current stop word list includes "us" as a stop word, with a warning that care should be taken because it is also a country code, and "us" is omitted from the MongoDB stop word list. By that same logic, I would think that "it" should be omitted from the MongoDB list as well. |
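To make the reported behaviour concrete, here is a toy sketch of stop-word filtering. It is not MongoDB's tokenizer: the stop word set is a hypothetical handful of entries (with "it" present and "us" absent, as described above), and lowercasing plus whitespace splitting is an assumption made for illustration.

```python
# Hypothetical, tiny stand-in for an English stop word list. Per the report,
# "it" is on the list and "us" is not; the real list is much longer.
TOY_ENGLISH_STOP_WORDS = {"it", "the", "is", "in", "a"}


def index_terms(text, stop_words):
    """Lowercase, split on whitespace, and drop stop words (toy tokenizer)."""
    return [term for term in text.lower().split() if term not in stop_words]


if __name__ == "__main__":
    # "IT" is lowercased to "it" and filtered out, so nothing is left to search for.
    print(index_terms("IT", TOY_ENGLISH_STOP_WORDS))
    # "US" survives because "us" is not in the list.
    print(index_terms("US", TOY_ENGLISH_STOP_WORDS))
    # With no stop words at all, "IT" survives as well.
    print(index_terms("IT", set()))
```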
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 01/Jul/13 ] |
|
Or you can use "none" for the language for now to avoid any stop words. |
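A minimal sketch of this workaround, assuming a PyMongo-style collection object and a server new enough to support the `$text` query operator (the 2.4 series in this report used the `text` command instead). The helper names and the `country` field in the usage note are hypothetical; `default_language` and `$language` are the text index and query options that accept "none".

```python
def text_index_args(field):
    """Key list and options for a text index that applies no stop words."""
    keys = [(field, "text")]
    options = {"default_language": "none"}  # "none": no stop words, no stemming
    return keys, options


def text_query(term):
    """A $text filter that also searches with the "none" language."""
    return {"$text": {"$search": term, "$language": "none"}}


def find_by_code(collection, field, code):
    """Create the index (if needed) and search for a short code such as "IT".

    `collection` is assumed to behave like a pymongo Collection, i.e. to offer
    create_index(keys, **options) and find(filter).
    """
    keys, options = text_index_args(field)
    collection.create_index(keys, **options)
    return list(collection.find(text_query(code)))
```

With a real collection this might be called as `find_by_code(db.places, "country", "IT")`, where `db.places` is a placeholder name. The tradeoff is that "none" also disables stemming, so every word is matched in its literal form.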
| Comment by Alex Reeves [ 01/Jul/13 ] |
|
and - Andorra Spain |
| Comment by Daniel Pasette (Inactive) [ 28/Jun/13 ] |
|
duplicate of SERVER-10062 |
| Comment by Daniel Pasette (Inactive) [ 28/Jun/13 ] |
|
Hi Alex, when using stop words at all, it's always a tradeoff between index size and lookup speed and the potential to exclude information from search results. I think the eventual solution to your issue will be the ability to customize the stop word list itself rather than trying to find the perfect universal combination. |
| Comment by Alex Reeves [ 28/Jun/13 ] |
|
It looks like the following would also be impacted by the stop word list: as - American Samoa