Details
Bug
Resolution: Done
Major - P3
2.4.0-rc1
None
ALL
Description
The stopword list generation process does not perform stemming. However, FTSSpec::_scoreString stems words before checking the stopword list:
--- fts_spec.cpp (lines 215-217) ---
makeLower( &term );
term = tools.stemmer->stem( term );
if ( tools.stopwords->isStopWord( term ) )
As a result, index entries are generated for any stopword for which stem(stopword) != stopword. For example, "any" stems to "ani", which is not in the stopword list, so it gets indexed.
Note that FTSQuery::_addTerm calls isStopWord before calling stem (so you'll never see a stopword in queryDebugString):
--- fts_query.cpp (lines 99-102) ---
string word = tolowerString( term );
if ( sw->isStopWord( word ) )
    return;
word = stemmer.stem( word );
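The ordering mismatch between the two code paths can be illustrated with a small standalone sketch. This is not the server code: toyStem, kStopWords, indexTerm and queryTerm are hypothetical names, and the toy stemmer only rewrites a trailing "y" to "i", which is just enough to mimic "any" -> "ani". An empty return value stands for "term dropped".

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical stand-in for the real stemmer: rewrites a trailing 'y'
// to 'i', which is enough to reproduce "any" -> "ani".
std::string toyStem(const std::string& word) {
    if (word.size() > 2 && word.back() == 'y')
        return word.substr(0, word.size() - 1) + "i";
    return word;
}

std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Assumed stopword list holding unstemmed words, as described above.
const std::set<std::string> kStopWords = {"any", "the", "and"};

bool isStopWord(const std::string& w) {
    return kStopWords.count(w) > 0;
}

// Index path (order used in FTSSpec::_scoreString): stem, then check.
// Returns the term to index, or "" if the term is dropped.
std::string indexTerm(const std::string& raw) {
    std::string term = toyStem(toLower(raw));
    if (isStopWord(term))
        return "";
    return term;
}

// Query path (order used in FTSQuery::_addTerm): check, then stem.
// Returns the term to search for, or "" if the term is dropped.
std::string queryTerm(const std::string& raw) {
    std::string word = toLower(raw);
    if (isStopWord(word))
        return "";
    return toyStem(word);
}
```

Under these assumptions, indexTerm("any") yields "ani" (an index key is created), queryTerm("any") yields "" (the query term is discarded), and queryTerm("ani") yields "ani", which matches the indexed key.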
Reproduce with:
> db.foo.ensureIndex({quote:"text"})
> db.foo.insert({quote:"any"})
> db.foo.validate().keysPerIndex
{ "test.foo.$_id_" : 1, "test.foo.$quote_text" : 1 }
> db.foo.runCommand("text",{search:"any"}).results.length
0
> db.foo.runCommand("text",{search:"ani",language:"none"}).results.length
1
Credit to kay.kim@10gen.com for original repro.