[SERVER-28889] Partial index shouldn't do fetch on conditions that are true by the definition of the index Created: 20/Apr/17 Updated: 09/Jan/24 Resolved: 28/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0, 7.0.0-rc5 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alexander Komyagin | Assignee: | Milena Ivanova |
| Resolution: | Fixed | Votes: | 13 |
| Labels: | asya, storch | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v7.0, v6.3, v6.0, v5.0, v4.4
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Sprint: | QO 2023-03-06, QO 2023-03-20, QO 2023-04-03, QO 2023-04-17, QO 2023-05-01 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
There's no need to do the FETCH with a check on bin:true since that condition is already satisfied by the index partial filterexpression |
| Comments |
| Comment by Githook User [ 16/Jun/23 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Milena Ivanova', 'email': 'milena.ivanova@mongodb.com', 'username': 'mivanova3'}Message: (cherry picked from commit c4e7c74c4421f42a3bb23f4247a0d2721384d38d) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 27/Apr/23 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Milena Ivanova', 'email': 'milena.ivanova@mongodb.com', 'username': 'mivanova3'}Message: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Linus Unnebäck [ 31/Oct/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I believe that I'm running into this issue. We have an index defined as follows: } We then do the following aggregation: }, , And the plan looks like this: , }, , A temporary workaround is to remove the `status: 'paid'` match condition, and specify an explicit index with the `hint` parameter. But this seems a bit scary 😅 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ralf Strobel [ 17/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Ah, sorry, I copied the original example without thinking. I can confirm this does work in 4.4...
Still, likely nothing we would want to do in practice. But I will test it out in 5.0 eventually, where our null use-case should be handled more favorably. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 17/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
That’s doing a FETCH because you’re returning the full document. Note that fetch is not applying a filter to the FETCH. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ralf Strobel [ 17/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you for the insights Asya. I will have to try this once we are migrated to 5.0. Speaking for 4.4, I cannot confirm the mentioned workaround is effective either...
In our use case, we are also explicitly trying not to include the deleted field in all of our indexes, so I don't find this workaround very satisfying even if it does work. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 15/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The issue described in this ticket has a workaround to avoid FETCH in scenarios where partial filter condition is something other than $exists - the workaround does NOT apply to $exists filter. Workaround: include the field(s) in partialFilterExpression in the index definition. So for example given in the description, the index definition would become x:1, bin:1 and that would allow document FETCH to be avoided. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Asya Kamsky [ 15/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
ralf.strobel the issue you are observing is not related to partial indexes, but rather null semantics and COUNT and the example you show is specifically fixed in 5.0 by
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ralf Strobel [ 07/Jul/22 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi, just trying to raise awareness for this issue again, as we are facing more and more problems due to it. We're currently trying to gather regular metrics about our collection sizes, which could be trivially answered by the database if this was implemented. Due to this issue, we instead end up doing expensive index scans and flooding the cache with unnecessary documents, lowering our overall database performance. EDIT: Interestingly, a compound index also seems to suffer from a similar issue when filtering by null...
whereas
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jacob Botuck [ 30/Sep/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Does this issue also affect deletes? I want to delete all records in the index. I do delete where foo: $exists=true AND foo.bar > MinKey. It is going really slow! | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ralf Strobel [ 26/Mar/20 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
We would greatly appreciate the implementation of this improvement. For us and probably many others, this is not a fringe case optimization, but would affect literally every database query... Consider a common soft-delete mechanism, implemented via a boolean flag field:
In this environment, every query will contain a respective filter on non-deleted objects:
As pointed out by some of the duplicate issues, especially the count and distinct queries will currently perform at sub-optimal performance, as they cannot use the COUNT_SCAN or DISTINCT_SCAN stages directly on the index, but need to fetch the full object data instead. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Eliott Coyac [ 09/Aug/19 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
As of now, we can't count the number of elements indexed with a partialFilterExpression on another field without doing a FETCH. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Kyle Suarez [ 20/Apr/17 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Looks like we still do the FETCH even when there's an explicit projection:
|