Spanish text search stemmer


    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.4.6
    • Component/s: Text Search
    • Environment: CentOS
    • Linux

      The stemmer apparently removes the 'o' at the end of each word. We have quite a few words that end in 'o', so you can see how problematic this is.

      So if I run this query:

      db.collection.runCommand( "text", { search: "barco", language: "spanish" } )

      I get the following output, and no results even though there's a field containing the word 'barco' (notice how the 'o' has been removed in the queryDebugString field):

      {
      	"queryDebugString" : "barc||||||",
      	"language" : "spanish",
      	"results" : [ ],
      	"stats" : {
      		"nscanned" : 0,
      		"nscannedObjects" : 0,
      		"n" : 0,
      		"nfound" : 0,
      		"timeMicros" : 1208
      	},
      	"ok" : 1
      }
      

      But if I run the same query with English as the language:

      db.collection.runCommand( "text", { search: "barco", language: "english" } )

      I get a result (notice that the 'o' has not been removed this time):

      {
      	"queryDebugString" : "barco||||||",
      	"language" : "english",
      	"results" : [
      		{
      			"score" : 1.1,
      			"obj" : {
      				"_id" : ObjectId("527822523dd360464b4fd1d7"),
      ...
      }
      

      Any idea why the 'o' is being removed in Spanish?

      Many thanks
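
One possible reading of the two outputs above, offered as an assumption rather than a confirmed diagnosis: the query term is stemmed with the language given in the query, while the indexed terms were stemmed with whatever language the index was built with. If the index had been built with English stemming, a Spanish query for 'barco' would look up 'barc' and find nothing. The sketch below is a toy model of that mismatch. toyStem, buildIndex, and search are hypothetical helpers, and the suffix rule is a stand-in, not the real stemmer.

```javascript
// Toy illustration only: these rules are hypothetical stand-ins, NOT the
// stemming algorithms that a real text-search engine would use.
function toyStem(word, language) {
  const w = word.toLowerCase();
  if (language === "spanish") {
    // Hypothetical Spanish rule: drop a final 'o' (barco -> barc).
    return w.endsWith("o") ? w.slice(0, -1) : w;
  }
  // Hypothetical English rule: leave the word unchanged.
  return w;
}

// Build a tiny inverted index: stemmed term -> array of document ids.
function buildIndex(docs, language) {
  const index = new Map();
  for (const doc of docs) {
    for (const word of doc.text.split(/\s+/).filter(Boolean)) {
      const term = toyStem(word, language);
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(doc.id);
    }
  }
  return index;
}

// Stem the query term with the query language, then look it up.
function search(index, word, language) {
  return index.get(toyStem(word, language)) || [];
}

const docs = [{ id: 1, text: "un barco grande" }];

// Assumption: one index built with English stemming, one with Spanish.
const englishIndex = buildIndex(docs, "english");
const spanishIndex = buildIndex(docs, "spanish");

console.log(search(englishIndex, "barco", "spanish")); // looks up "barc"
console.log(search(englishIndex, "barco", "english")); // looks up "barco"
console.log(search(spanishIndex, "barco", "spanish")); // looks up "barc"
```

If this is what is happening here, the thing to check would be which language the text index was created with (MongoDB text indexes accept a default_language option), since index-time and query-time stemming need to agree for 'barc' to match.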

            Assignee: Unassigned
            Reporter: Miguel G
            Votes: 0
            Watchers: 3
