Text search matcher can incorrectly match documents that are multi-language or contain text-indexed nested arrays


    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Affects Version/s: 2.5.4
    • Component/s: Text Search

      Indexes with textIndexVersion: 2 honor language annotations in subdocuments and correctly index nested arrays, provided the arrays are not directly nested. The text search matcher should likewise apply the correct language stemmer to text contained in subdocuments and examine fields inside nested arrays when determining a match, but it currently does neither.

      Reproduce with:

      > db.foo.ensureIndex({"a.b":"text"})
      > db.foo.insert({a:[{b:["example content"]}]}) // note indexed nested arrays
      Insert WriteResult({ "ok" : 1, "n" : 1 })
      > db.foo.find({$text:{$search:"example content"}}) // correct
      { "_id" : ObjectId("52aa57a0ae39c4212eb00625"), "a" : [ { "b" : [ "example content" ] } ] }
      > db.foo.find({$text:{$search:"example -content"}}) // incorrect: should return empty result set
      { "_id" : ObjectId("52aa57a0ae39c4212eb00625"), "a" : [ { "b" : [ "example content" ] } ] }
      > db.foo.find({$text:{$search:"example \"content\""}}) // incorrect: should not return empty result set
      >
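
The expected results for the three queries above can be sketched in plain JavaScript. This is only an illustration of the intended matching rules, not the server's matcher: the function names (collectStrings, parseSearch, matchesText) are hypothetical, stemming and per-subdocument language selection are left out entirely, and negated phrases are not handled.

```javascript
// Hypothetical helper: collect every string reachable at a dotted path,
// descending into arrays at any level along the way.
function collectStrings(value, parts) {
  if (Array.isArray(value)) {
    var out = [];
    value.forEach(function (v) {
      out = out.concat(collectStrings(v, parts));
    });
    return out;
  }
  if (parts.length === 0) {
    return typeof value === "string" ? [value] : [];
  }
  if (value === null || typeof value !== "object") {
    return [];
  }
  return collectStrings(value[parts[0]], parts.slice(1));
}

// Split a $search string into plain terms, negated terms (-term) and
// quoted phrases. Words inside a phrase also count as plain terms.
function parseSearch(search) {
  var phrases = [];
  var rest = search.replace(/"([^"]*)"/g, function (_, p) {
    phrases.push(p.toLowerCase());
    return " " + p + " ";
  });
  var terms = [];
  var negated = [];
  rest.split(/\s+/).filter(Boolean).forEach(function (t) {
    if (t.charAt(0) === "-" && t.length > 1) {
      negated.push(t.slice(1).toLowerCase());
    } else {
      terms.push(t.toLowerCase());
    }
  });
  return { terms: terms, negated: negated, phrases: phrases };
}

// Assumed rules: no negated term may occur, every phrase must occur
// verbatim, and at least one plain term must occur. No stemming is applied.
function matchesText(doc, path, search) {
  var q = parseSearch(search);
  var strings = collectStrings(doc, path.split(".")).map(function (s) {
    return s.toLowerCase();
  });
  var words = [];
  strings.forEach(function (s) {
    words = words.concat(s.split(/[^a-z0-9]+/).filter(Boolean));
  });
  function hasWord(w) { return words.indexOf(w) !== -1; }
  function hasPhrase(p) {
    return strings.some(function (s) { return s.indexOf(p) !== -1; });
  }
  if (q.negated.some(hasWord)) return false;
  if (!q.phrases.every(hasPhrase)) return false;
  return q.terms.some(hasWord);
}

var doc = { a: [ { b: [ "example content" ] } ] };
console.log(matchesText(doc, "a.b", "example content"));      // true
console.log(matchesText(doc, "a.b", "example -content"));     // false
console.log(matchesText(doc, "a.b", "example \"content\""));  // true
```

The key point is that collectStrings walks through both array levels of a.b, so the negation and phrase checks see the same text that the index saw.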
      

            Assignee:
            J Rassi (Inactive)
            Reporter:
            J Rassi (Inactive)
            Votes:
            0
            Watchers:
            3
