[SERVER-17643] Text indexes should work with standard indexes (instead of slowing them down) Created: 18/Mar/15 Updated: 18/Mar/15 Resolved: 18/Mar/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying, Text Search |
| Affects Version/s: | 2.6.8, 3.0.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Charlie Stigler | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | 1. Add a compound text index and standard ObjectId index into a collection with 1 million objects. |
||||||||
| Participants: | |||||||||
| Description |
|
Currently, text indexes seem to interact very poorly with other indexes. In many cases with low result counts, adding a normally-indexed field will actually slow the search down significantly (which is counterintuitive). Example in our app: db.items.find({ indexedField: ObjectId('523ec53087d4200202000006') }) db.items.find({ compoundTextIndex: 'grade' }) db.items.find({ indexedField: ObjectId('523ec53087d4200202000006'), compoundTextIndex: 'grade' }) This third example clearly doesn't make sense. It would be faster in this case to perform the indexed query and then filter down the results by text at the application level! It should be using the indexedField index instead of compoundTextIndex. This actually crashes our app sometimes, taking over 30 seconds in real-life queries. So it's pretty major for us. |
| Comments |
| Comment by David Storch [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
Glad I could help. Please watch SERVER-17648 for further updates. | ||||||||||||||||||||||||||||||||||||||
| Comment by Charlie Stigler [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
Dave, thanks so much, SERVER-17648 is exactly what I was trying to report with this ticket. Sorry I made it tough to understand! | ||||||||||||||||||||||||||||||||||||||
| Comment by David Storch [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi cstigler, A query using the $text operator will always execute by looking up matching entries in the collection's text index. The system will never use a non-text index in order to perform a full text search query. The query
will therefore execute by looking up all occurrences of the term "grade", fetching the associated documents, and filtering out the documents with the correct ObjectId for indexedField. This is why adding the $text predicate to the query slows down execution dramatically compared to the apparently similar query
The latter query can just look up a small number of values of the ObjectId in the indexedField index. I have filed a feature request on your behalf to allow $text queries to use non-text indices: see SERVER-17648. For further information on MongoDB's full text search mechanism, please refer to the docs. Best, | ||||||||||||||||||||||||||||||||||||||
| Comment by Charlie Stigler [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
@Jason Sorry about that. There's no simplifications except for changing the property names, which I did a bit overaggressively in that case. Actual queries were:
Besides the property name indexedField being changed, this is the exact query with no simplification. This is the mongod slow query log:
| ||||||||||||||||||||||||||||||||||||||
| Comment by J Rassi [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
I'd like to clarify exactly what query you're asking about performance for, as I don't see any index defined on the field "compoundTextIndex" from the getIndexes() output provided. I imagine you're trying to provide a simplified view into your application, though now I'm having trouble understanding exactly what the problematic query looks like. It would be helpful to see an actual unmodified getIndexes() output, along with an actual log excerpt from the mongod slow query log showing the queries in question being run. Could you post these, please? | ||||||||||||||||||||||||||||||||||||||
| Comment by Charlie Stigler [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
David, thanks so much for the fast reply. Here's the output of db.items.getIndexes(). Note that there were 9 other non-text indexes shown, all standard or compound non-text indexes that aren't relevant to the query I'm testing with. I removed those but am happy to list them if they might be relevant.
| ||||||||||||||||||||||||||||||||||||||
| Comment by David Storch [ 18/Mar/15 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi cstigler, Thanks for reporting this issue. We will need to gather a bit more information in order to diagnose. Could you please provide the output from running the following:
Best, |