[DOCS-3619] full-text search "none" documentation wrong about tokenization Created: 11/Jun/14  Updated: 25/Jun/15  Resolved: 24/Jun/14

Status: Closed
Project: Documentation
Component/s: manual
Affects Version/s: None
Fix Version/s: v1.3.7

Type: Improvement Priority: Major - P3
Reporter: Pascal S. de Kloe Assignee: Kay Kim (Inactive)
Resolution: Done Votes: 0
Labels: stemming, stopwords, tokenizer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

Stemming and stop-word filtering can distort search results in many cases. Both analyzers appear to be tied to the language setting, and the language value "none" disables text analysis completely. It would be nice to have a "plain" option that applies only a generic tokenizer.



 Comments   
Comment by Githook User [ 24/Jun/14 ]

Author: kay (kay-kim) <kay.kim@10gen.com>

Message: DOCS-3619 fix none description for text language
Branch: master
https://github.com/mongodb/docs/commit/35c5fd0947c8084e37d96a292cf3015c99b13fb1

Comment by Eliot Horowitz (Inactive) [ 18/Jun/14 ]

Docs are wrong, moved this to a docs ticket to be fixed.

Comment by Pascal S. de Kloe [ 18/Jun/14 ]

You are absolutely right, Eliot.

The documentation states that "none" disables the tokenizer. Is it my poor English skills or does the following need adjustment?

"If you specify a language value of "none", then the text search has no list of stop words, and the text search does not stem or tokenize the search terms."
http://docs.mongodb.org/manual/reference/text-search-languages/#text-search-languages

Comment by Eliot Horowitz (Inactive) [ 18/Jun/14 ]

If you use "none", you should get simple tokenization with no stemming or stop words.
Please let us know if for some reason that doesn't do what you want.
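
To make the distinction in this comment concrete, here is a minimal toy sketch in Python of what "simple tokenization with no stemming or stop words" means compared with a language-aware analysis. It is not MongoDB's implementation: the stop-word list, the suffix rule, the lowercasing step, and the function names (`tokenize`, `toy_stem`, `analyze`) are all assumptions made up for illustration.

```python
import re

# Hypothetical stop-word list, for illustration only.
# A real analyzer ships a much longer, language-specific list.
TOY_STOP_WORDS = {"the", "a", "an", "and", "of", "is"}


def tokenize(text):
    """Split on runs of non-word characters and lowercase each piece.

    This stands in for the "simple tokenization" step; lowercasing is an
    assumption of this sketch.
    """
    return [piece.lower() for piece in re.split(r"\W+", text) if piece]


def toy_stem(token):
    """Crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text, language):
    """Return the index terms for `text` under the given language setting.

    "none" means tokenize only; any other value applies the toy stop-word
    filter and the toy stemmer.
    """
    tokens = tokenize(text)
    if language == "none":
        return tokens
    return [toy_stem(t) for t in tokens if t not in TOY_STOP_WORDS]


sample = "The running of the Searches"
print(analyze(sample, "none"))     # every token kept, unstemmed
print(analyze(sample, "english"))  # stop words dropped, suffixes stripped
```

With "none", stop words such as "the" stay searchable and "running" is indexed as typed, which is the behavior the reporter asked for under the name "plain". In MongoDB itself the language is normally chosen through the `default_language` option when the text index is created.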
