Core Server / SERVER-8341

Stemming and stop word deletion in phrases

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • None
    • 2.3.2
    • Text Search
    • None
    • Environment: Mac OS 10.7.5, MongoDB 2.3.2
    • ALL

    Description

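      Having indexed the Enron email dataset with a two-field text index using the default weights, I'm seeing unexpected behaviour when searching for phrases. Terms appear to be stemmed, and stop words removed, even inside a quoted phrase. In both queries below, the first field of queryDebugString contains only the stem "scrimmag", with "the" removed: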

      > db.getCollection("emails").runCommand("text", { "search" : "\"the scrimmage\"", limit : 1 });

      "queryDebugString" : "scrimmag||||the scrimmage||"

      > db.getCollection("emails").runCommand("text", { "search" : '"the scrimmage"', limit : 1 });

      "queryDebugString" : "scrimmag||||the scrimmage||"
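
      To make the debug output above easier to read, here is a minimal sketch that splits a queryDebugString on its "||" delimiter. The helper name parseQueryDebugString and the field names are hypothetical, and the field order (terms, negated terms, phrases, negated phrases) is an assumption inferred from the output shown in this report, not something the report confirms.

```javascript
// Split a queryDebugString into its "||"-delimited fields.
// ASSUMPTION: the fields are, in order, terms, negated terms,
// phrases, and negated phrases. The field names are illustrative only.
function parseQueryDebugString(debug) {
  var parts = debug.split("||");
  return {
    terms: parts[0] || "",
    negatedTerms: parts[1] || "",
    phrases: parts[2] || "",
    negatedPhrases: parts[3] || ""
  };
}

// The string reported for both queries in this issue.
var parsed = parseQueryDebugString("scrimmag||||the scrimmage||");
```

      Under that assumption, parsed.terms holds the stemmed term with the stop word removed, while parsed.phrases still holds the phrase as typed, which helps separate what is used for term matching from what is kept as the phrase.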

      This behaviour was first spotted by a MongoDB user at the FTS Hackathon in London.

          People

            Assignee: Unassigned
            Reporter: Matt Bates (matt.bates@10gen.com)