[SERVER-8334] German stop word list contains non-stopwords Created: 25/Jan/13  Updated: 11/Jul/16  Resolved: 26/Feb/13

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: None
Fix Version/s: 2.4.0-rc2

Type: Bug
Priority: Minor - P4
Reporter: Marian Steinbach
Assignee: Thomas Rueckstiess
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

The current version of the stop words

https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/stop_words_german.txt

contains nouns that shouldn't be considered stop words. Examples:

"nutzung" (usage)
"schreiben" (letter / writing)
"arbeiten" (works)
"mann" (man)
"ehe" (marriage)
"frau" (woman)
"bedarf" (need)
"ende" (end)
"fall" (case)

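A minimal sketch of the kind of cleanup being proposed, assuming the stop word file holds one lowercase term per line. The helper name `remove_terms` and the sample excerpt are hypothetical and only illustrate the idea; they are not taken from the MongoDB source tree or from the pull request.

```python
def remove_terms(stop_words, terms_to_remove):
    """Return stop_words without the entries named in terms_to_remove.

    Comparison ignores case and surrounding whitespace; the order of the
    remaining entries is preserved.
    """
    blocked = {term.strip().lower() for term in terms_to_remove}
    return [word for word in stop_words if word.strip().lower() not in blocked]


# Nouns named in this report as examples of non-stopwords.
NOUNS = [
    "nutzung", "schreiben", "arbeiten", "mann", "ehe",
    "frau", "bedarf", "ende", "fall",
]

# Hypothetical excerpt of a stop word list (assumption: one term per line).
sample = ["aber", "frau", "und", "ende", "der"]

cleaned = remove_terms(sample, NOUNS)
print(cleaned)
```

For a real file, the same function could be applied to the lines read from the file, and the result written back one term per line.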


 Comments   
Comment by auto [ 01/Feb/13 ]

Author: Marian Steinbach <marian@sendung.de>
Date: 2013-01-25T15:04:56Z

Message: First cleanup according to SERVER-8334

Removed terms which shouldn't be in a default stopword list, e.g. because
they are (at least in some cases) nouns

Signed-off-by: Dan Pasette <dan@10gen.com>
Branch: master
https://github.com/mongodb/mongo/commit/31ccc2972f011574181e72d0119873b5f81ebe6e

Comment by Marian Steinbach [ 30/Jan/13 ]

No problem, done.

Comment by Daniel Pasette (Inactive) [ 30/Jan/13 ]

Thanks for your pull request, Marian. We are going to take a closer look at the stop word lists in all languages before 2.4.0. For us to potentially incorporate your pull request, we'll need you to read and sign the Contributor Agreement at http://www.10gen.com/contributor first. Thanks!

Comment by Marian Steinbach [ 25/Jan/13 ]

I added a pull request removing some nouns from the list.

https://github.com/mongodb/mongo/pull/361
