Stemming and stop word deletion in phrases

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Affects Version/s: 2.3.2
    • Component/s: Text Search
    • Environment:
      Mac OS 10.7.5.
      MongoDB 2.3.2
    • Operating System: ALL

      Having indexed the Enron email dataset with a text index over two fields (default weights), I'm seeing unexpected behaviour when searching for phrases. Terms appear to be stemmed, and stop words removed, inside a quoted phrase:

      > db.getCollection("emails").runCommand("text", { "search" : "\"the scrimmage\"", limit : 1 });

      "queryDebugString" : "scrimmag||||the scrimmage||"

      > db.getCollection("emails").runCommand("text", { "search" : '"the scrimmage"', limit : 1 });

      "queryDebugString" : "scrimmag||||the scrimmage||"
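
      To make the debug output easier to read, here is a minimal sketch of a helper that splits a queryDebugString into named fields. The helper name parseQueryDebugString is hypothetical and not part of MongoDB. The field layout it uses (terms, negated terms, phrases, negated phrases, separated by "||", with items inside a field separated by "|") is an assumption about the debug format, not something confirmed in this report.

```javascript
// Hypothetical helper, not a MongoDB API.
// ASSUMPTION: the debug string has four fields separated by "||", in the order
// terms, negated terms, phrases, negated phrases; items in a field are joined by "|".
function parseQueryDebugString(debug) {
    var fields = debug.split("||");

    // Split one field into its items, dropping empty strings.
    function items(field) {
        return (field || "").split("|").filter(function (s) {
            return s.length > 0;
        });
    }

    return {
        terms: items(fields[0]),
        negatedTerms: items(fields[1]),
        phrases: items(fields[2]),
        negatedPhrases: items(fields[3])
    };
}

// The string reported above.
var parsed = parseQueryDebugString("scrimmag||||the scrimmage||");
```

      If that layout assumption holds, parsed.terms would contain only the stemmed form "scrimmag" (with "the" dropped), while parsed.phrases would still carry "the scrimmage" unmodified. That may help narrow down whether stemming is applied to the phrase itself or only to the terms extracted from it.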

      This behaviour was first spotted by a MongoDB user at the FTS Hackathon in London.

            Assignee: Unassigned
            Reporter: Matt Bates (Inactive)
            Votes: 0
            Watchers: 3
