[SERVER-36393] Contained $or pushdown optimization can cause internalQueryEnumerationMaxOrSolutions to be exceeded Created: 01/Aug/18 Updated: 17/May/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.6.6, 4.0.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 5 |
| Labels: | qopt-team, query-44-grooming, storch | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||
| Steps To Reproduce: |
|
||||||||||||||||||||||||||||||
| Sprint: | Query 2020-08-10 | ||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||
| Description |
|
For version 3.6,
Suppose that the collection has indices {a: 1, c: 1} and {b: 1, c: 1}. Prior to 3.6, the query planner would generate two plans:
With contained $or pushdown, the predicate "a:1" can be pushed into both branches of the $or. This allows the planner to additionally consider an OR plan with two IXSCANS on {a: 1, c: 1}, one with bounds {a: [[1, 1]], c: [[2, 2]]} and the other with bounds {a: [[1, 1]], c: [[3, 3]]}. Typically, generating these extra plans has the benefit of allowing queries to be answered with tighter index bounds. However, in order to manage the potential combinatorial explosion of indexed OR plans, the planner imposes a limit on the number of plans generated for a particular $or node in the input query. This limit is configurable at startup and at runtime via the setParameter internalQueryEnumerationMaxOrSolutions, which has a default value of 10. The additional plans considered due to $or pushdown can cause the total number of plans to exceed internalQueryEnumerationMaxOrSolutions. This can inhibit generation of the best plan, thus causing a query to select a suboptimal indexed $or plan. |
| Comments |
| Comment by Githook User [ 03/Nov/20 ] |
|
Author: {'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}Message: Revert "SERVER-36393 Add new opt-in $or enumeration order" This reverts commit f95fe9a05da63905e7e8699f61d4a0b2ad67074b. |
| Comment by Githook User [ 30/Sep/20 ] |
|
Author: {'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}Message: SERVER-36393 Add new opt-in $or enumeration order (cherry picked from commit 03fb28b42c1b9e36c6ebd5c6aa4b8afc29dde099) |
| Comment by Githook User [ 30/Sep/20 ] |
|
Author: {'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}Message: SERVER-36393 Add new opt-in $or enumeration order (cherry picked from commit 03fb28b42c1b9e36c6ebd5c6aa4b8afc29dde099) |
| Comment by Githook User [ 18/Sep/20 ] |
|
Author: {'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}Message: SERVER-36393 Add new opt-in $or enumeration order (cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7) |
| Comment by Githook User [ 11/Sep/20 ] |
|
Author: {'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}Message: SERVER-36393 Add new opt-in $or enumeration order (cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7) |
| Comment by Githook User [ 02/Sep/20 ] |
|
Accidentally committed under the wrong ticket. The following commit is related, but this issue remains unresolved:
|
| Comment by Charlie Swanson [ 13/Aug/20 ] |
|
I decided to split out the work into a new ticket: |
| Comment by Charlie Swanson [ 05/Aug/20 ] |
|
Since it's been a little while since I started looking into this, I just wanted to post an update here: The patch is looking a little rough and messy at this point, but I think it's possible to implement the workaround described in Dave's comment. It'll need some serious polish, review, and testing, but so far it's pretty nicely contained behind a serverParameter, so the risk to existing workloads is minimal. I recently encountered some difficulty around determining whether we've reached the end of plan enumeration. The approach so far is to first list the sub-plans in "lockstep iteration order" (or whatever you want to call that) as described above, then once those are exhausted go back to the "normal" or "traditional" iteration order. In this second "phase" you'll still occasionally encounter one of those "lockstep" plans, and will have to skip it. We run into some trouble when we have to skip one like the above, and it's also the last plan to enumerate. The internal code and API doesn't really allow you to easily detect and communicate this. It's a tractable problem, but that's what remains to look into. I'm pausing on this investigation for a little while to get some other small test fixes pushed and reviews done, but I hope to resume within the week. |
| Comment by David Storch [ 10/Aug/18 ] |
|
Asya points out that our plan enumeration code might fall over like this less if it changes its enumeration strategy from something like:
To something like
This is a heuristic that might help us to force the plan selection code to consider the best index in more cases, short of a more complete solution. |