[SERVER-13987] Access an Index as a collection. Created: 19/May/14 Updated: 06/Dec/22 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | features we're not sure of |
| Type: | New Feature | Priority: | Minor - P4 |
| Reporter: | John Page | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 3 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Query Execution
|
| Participants: |
| Description |
|
I would like the ability to query an index directly, specifically FTS indexes but also normal ones. Whilst covered indexes resolve some use cases, I'd like the ability to run a regular expression query over an index and get a set of unique values back. For example, query for all words in the FTS index matching a regex, skipping over index entries with the same value: a regex of /.one./ on the index would return Jones, Bones, Phones, and those terms can then become an FTS query to enable fuzzy search. |
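A minimal sketch of the requested behaviour, written against a plain sorted Python list rather than a real index. The function name `distinct_matching_terms` and the sample keys are hypothetical, and nothing here is an existing server API; it only illustrates "regex over the index, one result per distinct key".

```python
import re
from itertools import groupby


def distinct_matching_terms(sorted_index_keys, pattern):
    """Walk keys in index order, visit each distinct key once, keep regex matches."""
    regex = re.compile(pattern, re.IGNORECASE)
    # groupby collapses runs of equal adjacent keys, which on a sorted index
    # is the same as skipping over entries that repeat the previous value.
    return [key for key, _run in groupby(sorted_index_keys) if regex.search(key)]


# Hypothetical lowercased FTS index keys, one entry per (term, document) pair.
index_keys = sorted(
    ["bones", "bones", "jones", "jones", "jones", "phones", "smith", "stone"]
)

terms = distinct_matching_terms(index_keys, r".one.")

# The matched terms can be joined into a single search string; a $text query
# treats space-separated terms as a logical OR.
search_string = " ".join(terms)
text_query = {"$text": {"$search": search_string}}
```

Note that "stone" is not returned, because `.one.` needs one more character after "one".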
| Comments |
| Comment by Laszlo Balogh [ 09/Sep/14 ] |
|
From the UX perspective, a "Did you mean"/fuzzy search feature is an absolute must-have. One way to make a user leave a website is to show her no results on the search results page because of a spelling error (according to Greg Nudelman's book http://www.amazon.co.uk/Designing-Search-Strategies-Ecommerce-UXmatters/dp/0470942231). We are considering dropping MongoDB for full text search and adding ElasticSearch to our stack, just because of its fuzzy search capability. Obviously this move will add extra complexity to our system: we have to store data in two places, we have to deal with network latency, etc. However, with access to the terms in the inverted index, one could easily implement fuzzy search with MongoDB. |
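As a rough illustration of that last point, here is a sketch of a "Did you mean" lookup, assuming the distinct index terms were somehow available to the application as a list. The `suggest` helper and the sample terms are hypothetical; the similarity matching comes from Python's standard `difflib`, not from anything in MongoDB.

```python
from difflib import get_close_matches


def suggest(word, index_terms, limit=3, cutoff=0.7):
    """Return up to `limit` index terms that are similar to a misspelled word."""
    # get_close_matches ranks candidates by SequenceMatcher ratio and drops
    # anything scoring below `cutoff`.
    return get_close_matches(word.lower(), index_terms, n=limit, cutoff=cutoff)


# Hypothetical distinct terms read from the inverted index.
index_terms = ["bones", "jones", "phones", "smith"]

suggestions = suggest("jonse", index_terms)
```

The suggestions could then be offered to the user or sent as a regular text search.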
| Comment by John Page [ 20/May/14 ] |
|
Pluggable indexing would be good for this too - although that's probably a big job. |
| Comment by John Page [ 20/May/14 ] |
|
You could also think of it as a poor man's column store, especially if we had some way of run-length encoding key runs. If the first node in the tree with value X has a count (counting indexes) and a pointer to the last node with that value, we can walk the tree much faster. We could also compact the indexes by not storing the key value where it's identical to the previous one, and instead have a special "use previous" value that is only a byte long. |
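Both ideas can be sketched on a flat sorted list of keys. This is only an illustration under that assumption, not how the server's B-tree is laid out; `USE_PREVIOUS` is a hypothetical stand-in for the one-byte marker, and all function names are made up for the example.

```python
from itertools import groupby

USE_PREVIOUS = object()  # hypothetical stand-in for the one-byte "use previous" marker


def run_length_encode(sorted_keys):
    """Collapse each run of equal adjacent keys into one (key, count) pair."""
    return [(key, sum(1 for _ in run)) for key, run in groupby(sorted_keys)]


def compact_keys(sorted_keys):
    """Store a key only where it differs from the one immediately before it."""
    compacted = []
    previous = object()  # unique sentinel that never equals a real key
    for key in sorted_keys:
        compacted.append(USE_PREVIOUS if key == previous else key)
        previous = key
    return compacted


def expand_keys(compacted):
    """Invert compact_keys by replaying the last real key for each marker."""
    expanded = []
    for entry in compacted:
        expanded.append(expanded[-1] if entry is USE_PREVIOUS else entry)
    return expanded


keys = ["a", "a", "a", "b", "c", "c"]
runs = run_length_encode(keys)   # one entry per distinct key, with its count
compacted = compact_keys(keys)   # repeated keys replaced by the marker
```

With the run counts available, a distinct-values scan touches one entry per run instead of one per document.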