[SERVER-9537] Full text search in Dutch does incorrect stemming for words that end with "sen"
Created: 02/May/13  Updated: 11/Jan/15  Resolved: 11/Jan/15

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: 2.4.3
Fix Version/s: None

Type: Bug
Priority: Minor - P4
Reporter: Matti Roloux
Assignee: Paul Pedersen
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File stemming.js    
Issue Links:
  • related to SERVER-9953 "Text search: dutch stemmer not working?" (Backlog)
Operating System: ALL
Steps To Reproduce:
  • create a Dutch-language full text index
  • put a document with the word "dansen" in the full text index
  • do a full text search for "dans"
    => no results are returned
Participants:

 Description   

Dutch words that end with "sen" are correctly stemmed by dropping the "en", so "dansen" is indexed as "dans". However, if you then search for "dans", the query term is itself incorrectly stemmed to "dan". Because "dan" does not match the indexed stem "dans", the full text search returns no matches.

A possible solution would be to stem words recursively, that is, repeatedly until the result no longer changes, during both the indexing and the search phase, so that both "dansen" and "dans" would stem to "dan".
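
The mismatch and the proposed fix can be illustrated with a minimal Python sketch. It does not use the server's actual Dutch stemmer: `toy_stem` and `stem_to_fixed_point` are hypothetical helpers, and the two suffix rules are assumptions chosen only to reproduce the "dansen" / "dans" behavior described above.

```python
def toy_stem(word: str) -> str:
    """Apply at most one suffix rule per call (toy rules, NOT the real Dutch stemmer)."""
    # Assumed rule 1: drop a trailing "en" from longer words ("dansen" -> "dans").
    if len(word) > 4 and word.endswith("en"):
        return word[:-2]
    # Assumed rule 2: drop a trailing "s" ("dans" -> "dan").
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def stem_to_fixed_point(word: str) -> str:
    """Re-apply toy_stem until the output stops changing.

    Every successful rule shortens the word, so the loop always terminates.
    """
    while True:
        stemmed = toy_stem(word)
        if stemmed == word:
            return word
        word = stemmed


# Single pass: the indexed term and the query term end up with different stems.
indexed_term = toy_stem("dansen")
query_term = toy_stem("dans")
single_pass_match = indexed_term == query_term

# Fixed point: both the indexed term and the query term reduce to the same stem.
recursive_match = stem_to_fixed_point("dansen") == stem_to_fixed_point("dans")

print("single pass:", indexed_term, "vs", query_term, "match =", single_pass_match)
print("fixed point match =", recursive_match)
```

Note that stemming to a fixed point is more aggressive than a single pass, so it may also conflate words that should stay distinct; whether that trade-off is acceptable would need to be checked against real Dutch vocabulary.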



 Comments   
Comment by Matt Kangas [ 11/Jan/15 ]

Closing as duplicate of SERVER-9953. The discussion on that ticket more meaningfully describes the problem and path to resolution.
