[SERVER-29598] Support Korean language in full text search Created: 13/Jun/17 Updated: 27/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Text Search |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | 아나 하리 | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 6 |
| Labels: | qi-text-search |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Query Integration |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
Add Korean to the languages supported in MongoDB full-text search (FTS). Original description: I am not sure whether you are interested in Korean,
but in the current implementation, MongoDB text search only matches words that are exactly equal to the search term. As a result, Korean words do not match, because they carry suffixes ("은", "는", "이", "가", "처럼", ...). If MongoDB supported range search for text search, as in the example below, we Korean users could use text search for the Korean language.
Of course, this feature is not needed for languages that have stemming. I have pushed a pull request for this simple idea to the MongoDB GitHub repository. This feature would help a lot of Korean users. Please seriously consider adding it. Thanks. |
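Editorial illustration (not the reporter's original example, and not MongoDB code): the sketch below mimics, in plain Python, the behavior the description complains about. It assumes whitespace tokenization with exact token comparison, and contrasts that with the prefix-style match the reporter asks for. The function names and the sample sentence are hypothetical.

```python
def tokenize(text):
    """Split on whitespace and lowercase; no stemming, no stop words."""
    return text.lower().split()


def exact_match(term, text):
    """True only if some token is identical to the search term."""
    return term.lower() in tokenize(text)


def prefix_match(term, text):
    """True if some token starts with the search term."""
    needle = term.lower()
    return any(token.startswith(needle) for token in tokenize(text))


# "몽고디비는" is the noun "몽고디비" followed by the particle "는".
doc = "몽고디비는 문서 데이터베이스입니다"

print(exact_match("몽고디비", doc))   # False: the token carries a suffix
print(prefix_match("몽고디비", doc))  # True: the token starts with the term
```

Prefix matching sidesteps the suffix problem without any linguistic knowledge, but it also matches unrelated longer words that happen to share the prefix.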
| Comments |
| Comment by 아나 하리 [ 30/Jun/17 ] |
|
Hi Asya.
>> I'm going to convert this ticket into a new feature request for MongoDB to add proper text search support for Korean language.
Anyway, I hope SERVER-15090 is implemented sooner or later. Thanks. |
| Comment by Asya Kamsky [ 28/Jun/17 ] |
|
You are correct: MongoDB text search does not currently support Korean (you can see the list of currently supported languages here). The best solution would be for us to add support for Korean, including appropriate stemming and stop words. As you found, if a language is not supported, text search uses simple tokenization with no stop-word list and no stemming. Your proposed pull request tries to implement prefix text search, a new feature we are already tracking in SERVER-15090. However, we cannot accept the pull request for several reasons:
Since we already have a JIRA ticket for prefix search, I think the proposed work for that feature should be tracked there. I'm going to convert this ticket into a new feature request for MongoDB to add proper text search support for Korean language. Thanks for your interest in MongoDB. Regards, |
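Editorial illustration: to make the difference between simple tokenization and language-aware stemming concrete, here is a deliberately naive Python sketch that strips the particles listed in the description before indexing and querying. It is an assumption-laden toy, not how MongoDB stems any language; real Korean support would need morphological analysis. `PARTICLES`, `strip_particle`, and `index_terms` are hypothetical names.

```python
# Particles taken from the ticket description, longest first so that
# "처럼" is tried before the single-syllable particles.
PARTICLES = ("처럼", "은", "는", "이", "가")


def strip_particle(token):
    """Remove one trailing particle, if present and not the whole token."""
    for particle in PARTICLES:
        if token.endswith(particle) and len(token) > len(particle):
            return token[: -len(particle)]
    return token


def index_terms(text):
    """Whitespace-tokenize, then stem each token with strip_particle."""
    return {strip_particle(token) for token in text.split()}


def matches(term, text):
    """Stem the query the same way as the document, then compare exactly."""
    return strip_particle(term) in index_terms(text)


print(matches("몽고디비", "몽고디비는 빠르다"))    # True
print(matches("몽고디비", "몽고디비처럼 빠르다"))  # True
print(strip_particle("고양이"))                    # "고양" (over-stemmed)
```

The last line shows why suffix stripping alone is not enough: "고양이" (cat) ends in "이" as part of the word itself, so a blind rule damages it.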
| Comment by Mark Agarunov [ 16/Jun/17 ] |
|
Hello matt.lee, Thank you for providing the detailed example. I've set the fixVersion on this ticket to "Needs Triage" for this new feature to be scheduled against our currently planned work. Updates will be posted on this ticket as they happen. Thanks, |