[SERVER-8403] Review Hungarian stop word list Created: 30/Jan/13 Updated: 11/Jul/16 Resolved: 26/Feb/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Text Search |
| Affects Version/s: | 2.3.2 |
| Fix Version/s: | 2.4.0-rc2 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Daniel Pasette (Inactive) | Assignee: | Paul Pedersen |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Participants: |
| Description |
|
https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/stop_words_hungarian.txt |
| Comments |
| Comment by Paul Pedersen [ 11/Feb/13 ] |
|
Snowball includes a third-party Hungarian stemmer of unknown quality. I found three Hungarian stop word lists: (1) Snowball, (2) http://www.ranks.nl/stopwords/hungarian.html, (3) http://members.unine.ch/jacques.savoy/clef/hungarianST.txt. Lists (1) and (2) are identical. List (3) is substantially longer (737 vs. 35 items). Pasting list (3) into Google Translate shows a reasonable-looking list. I think we can use the longer list. |
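
The comparison described in the comment above (checking whether two lists are identical and how much a third one adds) could be reproduced with a small script along these lines. This is only a sketch: the helper names `load_stop_words` and `compare` are hypothetical, the sample words are a tiny illustration rather than the contents of the actual lists, and treating `|` as a comment delimiter is an assumption about the file format.

```python
def load_stop_words(text):
    """Parse stop words from text, one or more per line.

    Assumption: anything after a '|' on a line is a comment; blank lines are skipped.
    """
    words = set()
    for line in text.splitlines():
        content = line.split("|", 1)[0]
        for word in content.split():
            words.add(word.lower())
    return words


def compare(a, b):
    """Report whether two stop word sets match and where they differ."""
    return {
        "equal": a == b,
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "sizes": (len(a), len(b)),
    }


# Illustrative samples only, not the real Snowball, ranks.nl, or Savoy lists.
list1 = load_stop_words("a\naz | article\n\nés\n")
list2 = load_stop_words("és\naz\na\n")
list3 = load_stop_words("a\naz\nés\nhogy\nnem\n")

print(compare(list1, list2))
print(compare(list1, list3))
```

Comparing as sets ignores ordering and duplicates, which is usually what matters for a stop word list; the `sizes` field gives the item counts side by side.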