ISSUE DESCRIPTION AND IMPACT
Queries that use the Slot Based Execution (SBE) engine can experience long query optimization times (this ticket) and/or have non-optimal plans selected (SERVER-83196).
This happens because SBE is designed for fast execution of the winning plan, instead of efficient round-robin execution of candidate plans during planning. Ultimately, SBE planning time can be proportional to the longest plan, instead of the shortest plan.
Because of this inefficiency, the SBE planner also has less time available to gather information about the candidate plans, which can lead to a worse decision when the planning period ends.
More technical details can be found in this README.
This issue has been fixed in MongoDB 8.0.0, which avoids using the SBE planner for multiplanning: instead the server always uses the Classic Engine for multiplanning, even when SBE is used to execute the winning plan.
DIAGNOSIS AND AFFECTED VERSIONS
The issue is present in MongoDB 6.0, 7.0 and 7.3. It is fixed in MongoDB 8.0.0.
A query affected by this bug will use SBE and will spend a lot of time planning. Both symptoms are visible in "Slow query" log lines:
- "queryFramework":"sbe" means the query is using SBE.
- "planningTimeMicros" shows how much time was spent planning the query.
When looking at an explain plan, the presence of a "slotBasedPlan" field means the query uses SBE.
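As a rough illustration of the checks above, the following Python sketch inspects one log line and one explain document. The sample log line is hand-written for illustration and is not real server output. The sketch assumes the two fields sit under an "attr" object, as in MongoDB's structured JSON logs. The 100 ms threshold and the helper names are hypothetical.

```python
import json

# Hypothetical cutoff: treat more than 100 ms of planning as suspicious.
PLANNING_THRESHOLD_MICROS = 100_000


def sbe_planning_symptoms(log_line, threshold=PLANNING_THRESHOLD_MICROS):
    """Return (uses_sbe, planning_micros, suspicious) for one JSON log line."""
    entry = json.loads(log_line)
    attr = entry.get("attr", {})
    uses_sbe = attr.get("queryFramework") == "sbe"
    planning_micros = attr.get("planningTimeMicros", 0)
    suspicious = (
        entry.get("msg") == "Slow query"
        and uses_sbe
        and planning_micros > threshold
    )
    return uses_sbe, planning_micros, suspicious


def explain_uses_sbe(explain_output):
    """Search an explain document (nested dicts and lists) for "slotBasedPlan"."""
    if isinstance(explain_output, dict):
        if "slotBasedPlan" in explain_output:
            return True
        return any(explain_uses_sbe(v) for v in explain_output.values())
    if isinstance(explain_output, list):
        return any(explain_uses_sbe(v) for v in explain_output)
    return False


# Hand-written sample, trimmed to the fields discussed above.
sample = (
    '{"msg": "Slow query", '
    '"attr": {"queryFramework": "sbe", "planningTimeMicros": 2500000}}'
)
print(sbe_planning_symptoms(sample))  # (True, 2500000, True)
```

A line that reports "sbe" together with a large planningTimeMicros value is a candidate for this issue, not proof of it; the threshold should be tuned to the workload.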
WORKAROUNDS
As an immediate workaround on MongoDB versions affected by the bug, users can disable SBE by setting the internalQueryFrameworkControl parameter to "forceClassicEngine". Since SBE often outperforms the Classic Engine, this option may slow down queries which formerly executed in SBE.
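A minimal Python sketch of this workaround follows. It assumes the parameter can be changed at runtime through the setParameter admin command, and that `client` behaves like a PyMongo MongoClient (exposing `client.admin.command(...)`). Both function names are hypothetical.

```python
def force_classic_engine_command():
    """Build the setParameter command document for the workaround."""
    # Key order matters: the command name must be the first key.
    return {
        "setParameter": 1,
        "internalQueryFrameworkControl": "forceClassicEngine",
    }


def disable_sbe(client):
    """Send the command to the admin database of a PyMongo-style client.

    `client` is an assumption: any object exposing `admin.command(doc)`.
    """
    return client.admin.command(force_classic_engine_command())
```

A value set this way would apply only to the running process; making it survive a restart would require setting the parameter at startup as well.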
Another workaround is to hint the affected queries, using the hint() method, or using an index filter. An index filter allows you to hint a specific query without changing the application, but it only exists for the duration of the server process and does not persist after shutdown. Note that we plan to deprecate index filters starting in version 8.0, in favor of Persistent Query Settings (SERVER-17625).
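The index filter option can be sketched in Python as below, assuming the planCacheSetFilter database command and a PyMongo-style `db` object exposing `db.command(...)`. The collection name, query shape, index key pattern and function names are all hypothetical placeholders.

```python
def index_filter_command(collection, query_shape, indexes):
    """Build a planCacheSetFilter command document for one query shape."""
    # Key order matters: the command name must be the first key.
    return {
        "planCacheSetFilter": collection,
        "query": query_shape,
        "indexes": list(indexes),
    }


def set_index_filter(db, collection, query_shape, indexes):
    """Send the command through a PyMongo-style database object (assumed)."""
    return db.command(index_filter_command(collection, query_shape, indexes))


# Hypothetical example: restrict queries on "status" in an "orders"
# collection to an index with key pattern {"status": 1}.
cmd = index_filter_command("orders", {"status": "A"}, [{"status": 1}])
```

When the application code can be changed, a hint on the query itself (for example PyMongo's `hint` argument to `find()`) achieves a similar effect and does not disappear on restart.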
Original description:
Currently, the strategy used in SBE multiplanning is as follows:
- We run non-blocking plans before blocking ones.
- We run each plan's trial period to completion (i.e. until we return 101 documents or we use up the plan's budget). We use the number of reads performed by said plan to bound the number of reads used by any remaining plans.
The problem with this approach is that if the first plan we run is not the optimal one, we are stuck running it and we can potentially use all of the reads. As an example, consider two plans, A and B. Plan A needs to perform 10k storage engine reads to get 101 documents, while plan B needs to perform 101 reads to get 101 documents. If plan B runs first, we have no problems: we will set the reads limit for plan A to 101, and it will stop running after 101 reads. If plan A runs first, however, we will be stuck running plan A for all 10k reads. Though we'll eventually run plan B and it will be chosen, this negatively impacts the performance of queries which need to use the multiplanner.
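The arithmetic of the A/B example can be checked with a small Python model. This is a simplified sketch, not the server's implementation: each plan is reduced to the number of reads it needs to return 101 documents, the initial budget of 20,000 reads is a hypothetical value chosen to exceed plan A's cost, and the round-robin function is only a rough stand-in for interleaved execution of candidate plans.

```python
def sequential_trial(plans, initial_budget):
    """Run each plan's trial to completion, in order.

    `plans` is a list of (name, reads_needed_for_101_docs) pairs.
    The reads used by one plan bound the reads allowed for later plans.
    Returns the total number of reads spent on planning.
    """
    budget = initial_budget
    total_reads = 0
    for _name, reads_needed in plans:
        reads = min(reads_needed, budget)  # stop at 101 docs or at the budget
        total_reads += reads
        budget = reads  # later plans may not read more than this one did
    return total_reads


def round_robin_trial(plans, initial_budget):
    """Give every plan one read per round until some plan finishes
    or the per-plan budget is spent (simplified interleaving model)."""
    rounds = min(min(reads for _name, reads in plans), initial_budget)
    return rounds * len(plans)


BUDGET = 20_000  # hypothetical per-plan read budget
a_first = [("A", 10_000), ("B", 101)]
b_first = [("B", 101), ("A", 10_000)]

print(sequential_trial(b_first, BUDGET))   # 202
print(sequential_trial(a_first, BUDGET))   # 10101
print(round_robin_trial(a_first, BUDGET))  # 202
```

In this model the sequential strategy costs 202 reads when the cheap plan happens to run first and 10,101 reads otherwise, while the interleaved strategy costs 202 reads regardless of order, which is the sense in which planning time tracks the longest plan rather than the shortest.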
- fixes SERVER-83196 SBE multiplanning may chooses the wrong plan (Closed)
- is duplicated by SERVER-82549 MongoDB 7.0.2 SBE query slow when direct index exists (Closed)
- is fixed by SERVER-83344 Enable Classic multi-planner for SBE feature flag (Closed)
- related to SERVER-62981 Make SBE multi-planner's trial period termination condition independent of collection size (Closed)
- related to SERVER-63102 Make separate internalQueryPlanEvaluationWorks knobs for the classic and SBE multi-planners (Closed)
- related to SERVER-63642 Add serverStatus metrics to measure multi-planning performance (Closed)
- related to SERVER-82677 Deduplicate index scan + fetch plans guaranteed to have similar performance (Closed)
- related to SERVER-63641 Improve SBE multi-planning by choosing which plan to work next based on a priority metric (Closed)