[SERVER-24683] Text search ignores some phrases Created: 21/Jun/16 Updated: 14/Jul/16 Resolved: 21/Jun/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Text Search |
| Affects Version/s: | 3.0.9, 3.2.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | David Hobbs | Assignee: | Kelsey Schubert |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Operating System: | ALL |
| Steps To Reproduce: |
|
| Participants: |
| Description |
|
When using text search, some phrases appear to be ignored. I am using phrase searches to perform a logical AND on several words, for example:
...to find results containing BOTH the words "cat" AND "publications". However, in this example the phrase "cat" seems to be ignored, and the search seems to return results that match only "publications". See the repro steps for a simple example that demonstrates the issue. |
| Comments |
| Comment by David Hobbs [ 21/Jun/16 ] | ||
|
Thanks for the explanation. I had guessed that it might be something to do with the fact that "cat" is a substring of "publications", but I also knew that the text search is not supposed to match substrings. I have voted for SERVER-20307. Thanks very much for looking into this, it's very helpful for us to know why it's happening. | ||
| Comment by Kelsey Schubert [ 21/Jun/16 ] | ||
|
The work required to modify this behavior is tracked in SERVER-20307. Since this issue manifests slightly differently from the one described there, I will explain a bit more about what is going on. The word "cat" is a substring of "publications", so any document that contains the word "publications" is guaranteed to satisfy the phrase matcher's check that the Words field contains the string "cat". As you have likely observed, searching for "car" and "publications" works as expected, because "car" is not a substring of "publications".
I hope this helps clarify this behavior. Please feel free to vote for SERVER-20307 and watch it for updates. Kind regards, |
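The substring behavior described in the comment above can be illustrated with a small Python sketch. This is a simplified model, not MongoDB's actual implementation: the function names (`phrase_match_substring`, `phrase_match_whole_word`, `search`) and the sample documents are hypothetical, and the whole-word matcher is only one possible stricter check, not necessarily what SERVER-20307 proposes.

```python
import re

def tokens(text):
    """Split text into lowercase word tokens (a simplified stand-in for real tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_match_substring(text, phrase):
    """Model of the reported behavior: the phrase check is a plain substring test."""
    return phrase.lower() in text.lower()

def phrase_match_whole_word(text, phrase):
    """Hypothetical stricter check: the phrase must appear as consecutive whole tokens."""
    words = tokens(text)
    target = tokens(phrase)
    n = len(target)
    return n > 0 and any(words[i:i + n] == target for i in range(len(words) - n + 1))

def search(docs, phrases, matcher):
    """Return the documents that satisfy every phrase (logical AND)."""
    return [d for d in docs if all(matcher(d, p) for p in phrases)]

# Hypothetical sample documents, not taken from the original repro steps.
docs = [
    "A list of publications",
    "The cat reads publications",
    "A car and some publications",
]

# "cat" is a substring of "publications", so every document passes the substring check.
print(search(docs, ["cat", "publications"], phrase_match_substring))

# With whole-word matching, only the document that really contains "cat" passes.
print(search(docs, ["cat", "publications"], phrase_match_whole_word))

# "car" is not a substring of "publications", so the substring check behaves as expected.
print(search(docs, ["car", "publications"], phrase_match_substring))
```

In this model the first search returns all three documents, while the second and third each return only the single document that contains the word itself, which mirrors the difference between the "cat" and "car" cases described above.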