[SERVER-24832] Project array by text search predicate Created: 28/Jun/16 Updated: 27/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying, Text Search |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major - P3 |
| Reporter: | Cássio Martini Martins Pereira | Assignee: | Backlog - Query Integration |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | qi-text-search | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Query Integration
|
||||||||
| Participants: | |||||||||
| Description |
|
Suppose I create the following db/collection:
I want to do text search over the arrays arr and obtain only the array elements that match the search. For instance,
Ideally I'd like to only obtain the first element of the arr in the first document: abc xyz and only the second element of the arr in the second document: ljh abc, and also the _id of the matched documents. Note that what I want is not a simple projection of the arr element, but of the elements of the arr where the match occurred. |
| Comments |
| Comment by Cássio Martini Martins Pereira [ 05/Jul/16 ] | |||||||||||
|
Thanks for the update ramon.fernandez. Will do. | |||||||||||
| Comment by Ramon Fernandez Marina [ 05/Jul/16 ] | |||||||||||
|
Thanks for the additional information. The Query team has added this to their backlog for consideration in the future. Please continue to watch the ticket for updates. | |||||||||||
| Comment by Cássio Martini Martins Pereira [ 29/Jun/16 ] | |||||||||||
|
Hi, thanks for looking into it ramon.fernandez. The problem with regexes is that you're going to have to hardcode them. For instance, suppose the following new example:
FTS can find the entries with the following queries:
How would you write a generic regular expression that handles both cases? Notice that for FTS stemming is going on on the background. Hardcoding fish* and watch* is not allowed. Imagine the input is coming from a search box for a user. | |||||||||||
| Comment by Ramon Fernandez Marina [ 29/Jun/16 ] | |||||||||||
|
I am sending this feature request to the Query team for evaluation. Note that this feature introduces new semantics for multikey indices, so even if it's accepted it may require a considerable effort. I think it would help your case if you were able to produce an example where using regular expressions is not sufficient. Thanks, | |||||||||||
| Comment by Cássio Martini Martins Pereira [ 29/Jun/16 ] | |||||||||||
|
Hi ramon.fernandez, That is a possibility. I also got a nice "trick" from a stackoverflow question I posted: http://stackoverflow.com/questions/38085710/return-only-the-array-element-on-which-a-text-search-matched
But your proposed solution and the one I got from stack overflow (listed above) use regexs. While that does solve my simple toy case, regexs do not have the same semantics of full text search in Mongo. FTS uses stop words, stemming, which are not present in regexes. If you can provide me with an example that solves the problem using the aggregation framework (or map reduce) while using FTS I'd be happy the functionality already exists. PS: Thanks for the quick response! | |||||||||||
| Comment by Ramon Fernandez Marina [ 29/Jun/16 ] | |||||||||||
|
cmmp, perhaps the aggregation framework can meet your needs:
Is this the functionality you're looking for? |