[SERVER-37530] Provide a way to cause a well-defined order of evaluation for predicates Created: 09/Oct/18 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Yuta Arai | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | afz, mql-semantics, query-44-grooming |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Assigned Teams: | Query Optimization |
| Operating System: | ALL |
| Steps To Reproduce: |
| Participants: |
| Case: | (copied to CRM) |
| Linked BF Score: | 17 |
| Description |
|
Currently we freely reorder predicates during rewrite and optimization for the sake of performance. We could provide a way to define the order in which certain predicates are run, so that any side effects from them (such as errors being produced) are predictable.

Here is a description of a situation that shows our current behavior. There is an inconsistency in how we handle an error for a specific query involving $expr, depending on whether we use an index or do a collection scan. The inconsistency stems from two behaviors of our query system:

1. A match stage can be pushed down to hide an error. Example:
The aggregation above will return {num: 0} instead of throwing the divide-by-zero error, because the $ne stage is pushed down to the index scan:
2. A collection scan can cause the canonicalized query to throw an error.

Given these two behaviors, the results of the following queries are inconsistent between an index scan and a collection scan. The index scan succeeds, while the collection scan throws a divide-by-zero error.
The explain outputs for the two queries are as follows:
This inconsistency also occurs for an $and expression in the same conditions. |
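The ordering guarantee proposed at the top of this description could be pictured with a small Python sketch. Everything in it is a hypothetical illustration, not MongoDB syntax or server code: `Pred`, `Ordered`, `plan`, and the cost numbers are assumptions invented for the example. The idea is that the planner stays free to reorder independent predicates by estimated cost, but never reorders inside a group the user has marked as ordered.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pred:
    name: str
    cost: int                      # hypothetical cost estimate used by the planner
    test: Callable[[dict], bool]


@dataclass
class Ordered:
    """Hypothetical marker: evaluate these predicates strictly in the given order."""
    preds: list

    @property
    def cost(self) -> int:
        return sum(p.cost for p in self.preds)


def plan(items):
    # Reorder top-level items cheapest first, but keep each Ordered group intact.
    flat = []
    for item in sorted(items, key=lambda i: i.cost):
        flat.extend(item.preds if isinstance(item, Ordered) else [item])
    return flat


def matches(items, doc):
    # all() stops at the first predicate that returns False.
    return all(p.test(doc) for p in plan(items))


guard = Pred("num != 0", 2, lambda d: d["num"] != 0)
divide = Pred("10 / num == 5", 1, lambda d: 10 / d["num"] == 5)

# Unmarked: the cheaper division is planned first and raises for {"num": 0}.
# Marked: the guard always runs first, so the division is never reached.
safe = matches([Ordered([guard, divide])], {"num": 0})
```

With the marker, the outcome for `{"num": 0}` no longer depends on cost estimates or on which access path the planner picks, which is the predictability this ticket asks for.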
| Comments |
| Comment by James Wahlin [ 24/Oct/18 ] |
|
Done |
| Comment by Charlie Swanson [ 23/Oct/18 ] |
|
james.wahlin, given the above, can I suggest renaming this ticket to something like "Planning optimizations can hide error scenarios"? |
| Comment by James Wahlin [ 23/Oct/18 ] |
|
After further discussion we agree that this problem is broader than $expr optimization. Query and aggregation optimization allow operations to be reordered. Because of this reordering, an optimized statement can match a document in a manner that short-circuits the execution of an additional statement that would have produced an error. This allows different behavior in optimized vs non-optimized execution, and allows indexes to affect behavior via related optimizations.

An example of where we could hypothetically trigger this in aggregation is as follows. In the following, if we evaluated the pipeline stages strictly in the order written, the $addFields would trigger a divide-by-zero error. Instead, we push the $match stage into the query system, filtering out the document {x: 0} prior to applying the $addFields.
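The effect can be imitated outside the server with a minimal Python sketch. The two functions are hypothetical stand-ins for an $addFields stage that divides by `x` and a $match stage that keeps documents where `x` is not 0; the field name `ratio` and the sample documents are assumptions made for the illustration.

```python
def add_fields_ratio(docs):
    # Stand-in for an $addFields that computes 1 / x; raises when x is 0.
    return [{**d, "ratio": 1 / d["x"]} for d in docs]


def match_x_nonzero(docs):
    # Stand-in for a $match that keeps documents where x != 0.
    return [d for d in docs if d["x"] != 0]


def run(pipeline, docs):
    # Apply each stage to the output of the previous one.
    for stage in pipeline:
        docs = stage(docs)
    return docs


docs = [{"x": 0}, {"x": 2}]

as_written = [add_fields_ratio, match_x_nonzero]
optimized = [match_x_nonzero, add_fields_ratio]  # $match moved ahead of $addFields

try:
    run(as_written, docs)
    strict_result = "ok"
except ZeroDivisionError:
    strict_result = "divide by zero"

optimized_result = run(optimized, docs)
```

Run in the written order, the pipeline fails on `{"x": 0}`. Run in the optimized order, that document is filtered out first and the division only ever sees `{"x": 2}`.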
Given the above, we are going to avoid attempting a quick fix and will instead look for and consider broader solutions. |
| Comment by Asya Kamsky [ 19/Oct/18 ] |
|
To sum up: it is the clauses being in a different order that makes the query fail or succeed with and without an index. Here is a reproducer for find:
a1 and a2 are taken from the explain output above, so they differ only in the order of the clauses in $and:
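As a rough illustration of why clause order alone matters, here is a minimal Python sketch of short-circuit evaluation. The two predicates are hypothetical stand-ins (a `num != 0` check and a division by `num`), and `a1` and `a2` here are illustrative lists, not the filters from the explain output.

```python
def num_is_not_zero(doc):
    # Stand-in for a {num: {$ne: 0}} clause.
    return doc["num"] != 0


def ten_div_num_is_five(doc):
    # Stand-in for an $expr clause that divides by num; raises when num is 0.
    return 10 / doc["num"] == 5


def matches(clauses, doc):
    # all() stops at the first clause that returns False, like a short-circuiting $and.
    return all(clause(doc) for clause in clauses)


a1 = [num_is_not_zero, ten_div_num_is_five]
a2 = [ten_div_num_is_five, num_is_not_zero]  # same clauses, opposite order

doc = {"num": 0}

result_a1 = matches(a1, doc)  # the guard rejects the document before the division runs

try:
    matches(a2, doc)
    result_a2 = "no error"
except ZeroDivisionError:
    result_a2 = "divide by zero"
```

Both lists accept the same documents whenever no clause raises, so the difference is only visible for a document such as `{"num": 0}`.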
|
| Comment by Asya Kamsky [ 19/Oct/18 ] |
|
Here is the example I had in mind:
The difference here is the order of clauses in the $and array, so I actually think this ticket is just one example of a bigger problem (the example being that optimization makes evaluation of a particular stage happen first). |
| Comment by David Storch [ 19/Oct/18 ] |
|
Assigning to james.wahlin to investigate what a fix would look like. |
| Comment by Asya Kamsky [ 19/Oct/18 ] |
|
Just to be clear: I don't think this is agg-related. It also happens for complex expressions in find if we evaluate clauses in a different order. |