SERVER-9953 (Core Server)

Text search: dutch stemmer not working?

    • Query Integration

      Hi all,

      I'm using MongoDB text search and would like to give some feedback. I'm not sure what the best channel for that is, so I've filed this report. If there's a preferred way, please let me know so I can use it in the future.

      Based on this document: http://docs.mongodb.org/manual/tutorial/create-text-index-on-multi-language-collection/, I've put together a test case, and I don't understand what's happening.

      This is my test data:

      { "_id" : 1, "language" : "portuguese", "quote" : "A sorte protege os audazes" }
      { "_id" : 2, "language" : "spanish", "quote" : "Nada hay más surreal que la realidad." }
      { "_id" : 3, "language" : "english", "quote" : "is this a dagger which I see before me" }
      { "_id" : 4, "language" : "dutch", "quote" : "is dit een dolk die ik voor mij zie" }
      { "_id" : 5, "language" : "dutch", "quote" : "vol verbijstering zaten de dames naar de twee honden te kijken" }
      

      And I'm most interested in finding the Dutch results right now.

      It seems like the stemmer is not working for some words:

      > db.quotes.runCommand( "text", { search: "honden", language:"dutch" } )
      Correct result: 1 (queryDebugString: 'hond')
      > db.quotes.runCommand( "text", { search: "hond", language:"dutch" } )
      Correct result: 1 (queryDebugString: 'hond')
      > db.quotes.runCommand( "text", { search: "dames", language:"dutch" } )
      Correct result: 1 (queryDebugString: 'dames')
      > db.quotes.runCommand( "text", { search: "dame", language:"dutch" } )
      Incorrect result: 0 (queryDebugString: 'dam')
      

      Note that the plural of hond ('dog') is honden ('dogs').
      The plural of dame ('lady') is dames ('ladies').

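      However, MongoDB text search doesn't seem to understand this: according to queryDebugString, 'dame' is stemmed to 'dam' while 'dames' is left as 'dames', so the two never match and the search for 'dame' returns nothing. This looks like a bug to me.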
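
      To make the mismatch concrete, here is a minimal sketch of the matching logic as I understand it. It assumes that a search only matches when the stemmed search term equals a stem stored in the index. The `stems` table simply copies the queryDebugString values from the queries above, and the `stem` and `matches` helpers are hypothetical names for illustration, not MongoDB's actual stemmer or API.

```javascript
// Stems copied from the queryDebugString values reported above.
// This lookup table stands in for the real Dutch stemmer (assumption).
const stems = new Map([
  ["honden", "hond"],
  ["hond", "hond"],
  ["dames", "dames"],
  ["dame", "dam"],
]);

// Hypothetical helper: look up a word's stem, falling back to the word itself.
function stem(word) {
  return stems.has(word) ? stems.get(word) : word;
}

// Stems indexed for document 5, limited to the two words under test.
const indexedTerms = new Set(["honden", "dames"].map(stem));

// Assumed matching rule: the stemmed search term must equal an indexed stem.
function matches(searchTerm) {
  return indexedTerms.has(stem(searchTerm));
}

for (const term of ["honden", "hond", "dames", "dame"]) {
  console.log(term, "->", stem(term), matches(term) ? "match" : "no match");
}
```

      Under these assumptions, 'honden' and 'hond' both reduce to 'hond' and find the document, but 'dame' reduces to 'dam', which never equals the indexed 'dames'.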

            Assignee:
            backlog-query-integration [DO NOT USE] Backlog - Query Integration
            Reporter:
            bodiam Erik Pragt
            Votes:
            0
            Watchers:
            3
