[SERVER-36393] Contained $or pushdown optimization can cause internalQueryEnumerationMaxOrSolutions to be exceeded Created: 01/Aug/18  Updated: 17/May/23

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.6, 4.0.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Storch Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 5
Labels: qopt-team, query-44-grooming, storch
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-38021 【query plans issue】inappropriate cand... Closed
Related
related to SERVER-50291 Add query knob to enumerate $or child... Closed
is related to SERVER-13732 Predicates in top-level implicit AND ... Closed
is related to SERVER-47826 Server does not appropriately respect... Closed
Assigned Teams:
Query Optimization
Operating System: ALL
Steps To Reproduce:

(function() {
    "use strict";
 
    db.c.drop();
    db.c.createIndex({a: 1, c: 1});
    db.c.createIndex({b: 1, c: 1});
    db.c.createIndex({c: 1, b: 1});
 
    let explain = db.c.find({a: 1, $or: [{b: 1, c: 1}, {b: 2, c: 2}, {b: 3, c: 3}, {b: 4, c: 4}]})
                      .explain("allPlansExecution");
    let allPlansExec = explain.executionStats.allPlansExecution;
 
    // The number of candidate plans here is constrained by
    // internalQueryEnumerationMaxOrSolutions=10.
    // There are 10 OR plans generated, and one plan which indexes just the "a:1" predicate.
    assert.eq(11, allPlansExec.length);
 
    assert.commandWorked(
        db.adminCommand({setParameter: 1, internalQueryEnumerationMaxOrSolutions: 100}));
    explain = db.c.find({a: 1, $or: [{b: 1, c: 1}, {b: 2, c: 2}, {b: 3, c: 3}, {b: 4, c: 4}]})
                  .explain("allPlansExecution");
    allPlansExec = explain.executionStats.allPlansExecution;
 
    // After increasing internalQueryEnumerationMaxOrSolutions, there are many more possible plans
    // generated. With a real data set, one of the inhibited plans may be the best one!
    //
    // Note that we still run up against the limit of 64 total plans. The size of the plan space for
    // this query is roughly 3 ^ 4 = 81.
    assert.eq(64, allPlansExec.length);
}());

Sprint: Query 2020-08-10
Participants:
Case:

 Description   

For version 3.6, SERVER-13732 added an optimization for contained $or queries which causes the query planner to generate more efficient plans. This optimization can also cause additional indexed $or plans to be considered in 3.6 which were not considered in older versions of the MongoDB server. As an example, consider the query

find({a: 1, $or: [{b: 2, c: 2}, {b: 3, c: 3}]})

Suppose that the collection has indices {a: 1, c: 1} and {b: 1, c: 1}. Prior to 3.6, the query planner would generate two plans:

  • An IXSCAN on {a: 1, c: 1} with bounds {a: [[1, 1]], c: [["MinKey", "MaxKey"]]}
  • An OR plan with two IXSCANS on {b: 1, c: 1}.

With contained $or pushdown, the predicate "a:1" can be pushed into both branches of the $or. This allows the planner to additionally consider an OR plan with two IXSCANS on {a: 1, c: 1}, one with bounds {a: [[1, 1]], c: [[2, 2]]} and the other with bounds {a: [[1, 1]], c: [[3, 3]]}.

Typically, generating these extra plans has the benefit of allowing queries to be answered with tighter index bounds. However, in order to manage the potential combinatorial explosion of indexed OR plans, the planner imposes a limit on the number of plans generated for a particular $or node in the input query. This limit is configurable at startup and at runtime via the setParameter internalQueryEnumerationMaxOrSolutions, which has a default value of 10. The additional plans considered due to $or pushdown can cause the total number of plans to exceed internalQueryEnumerationMaxOrSolutions. This can inhibit generation of the best plan, thus causing a query to select a suboptimal indexed $or plan.



 Comments   
Comment by Githook User [ 03/Nov/20 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: Revert "SERVER-36393 Add new opt-in $or enumeration order"

This reverts commit f95fe9a05da63905e7e8699f61d4a0b2ad67074b.
Branch: v3.6
https://github.com/mongodb/mongo/commit/d8a6a927105e9b898d85a5db16ef97f3b24d8c40

Comment by Githook User [ 30/Sep/20 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-36393 Add new opt-in $or enumeration order

(cherry picked from commit 03fb28b42c1b9e36c6ebd5c6aa4b8afc29dde099)
(cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7)
(cherry picked from commit e81be8db078eb541495dee9a47433b92f5140e18)
(cherry picked from commit a7c50d464f9582cb6e255a7818354b5e41eea9e2)
Branch: v3.6
https://github.com/mongodb/mongo/commit/f95fe9a05da63905e7e8699f61d4a0b2ad67074b

Comment by Githook User [ 30/Sep/20 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-36393 Add new opt-in $or enumeration order

(cherry picked from commit 03fb28b42c1b9e36c6ebd5c6aa4b8afc29dde099)
(cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7)
(cherry picked from commit e81be8db078eb541495dee9a47433b92f5140e18)
Branch: v4.0
https://github.com/mongodb/mongo/commit/a7c50d464f9582cb6e255a7818354b5e41eea9e2

Comment by Githook User [ 18/Sep/20 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-36393 Add new opt-in $or enumeration order

(cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7)
(cherry picked from commit e81be8db078eb541495dee9a47433b92f5140e18)
Branch: v4.2
https://github.com/mongodb/mongo/commit/03fb28b42c1b9e36c6ebd5c6aa4b8afc29dde099

Comment by Githook User [ 11/Sep/20 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-36393 Add new opt-in $or enumeration order

(cherry picked from commit bffdd28e183d5b720c897b56278821e65762ccd7)
Branch: v4.4
https://github.com/mongodb/mongo/commit/e81be8db078eb541495dee9a47433b92f5140e18

Comment by Githook User [ 02/Sep/20 ]

Accidentally committed under the wrong ticket. The following commit is related, but this issue remains unresolved:

Author:

Unknown macro: {'name'}

Message: SERVER-36393 Add new opt-in $or enumeration order
Branch: master
https://github.com/mongodb/mongo/commit/bffdd28e183d5b720c897b56278821e65762ccd7

Comment by Charlie Swanson [ 13/Aug/20 ]

I decided to split out the work into a new ticket: SERVER-50291. I am attempting to implement the workaround described above (and in the new ticket), but this ticket will remain an issue, even with that workaround. So I'll leave this open and return it to the backlog, but I am continuing progress on SERVER-50291. We have an implementation that works, stay tuned for updates over there!

Comment by Charlie Swanson [ 05/Aug/20 ]

Since it's been a little while since I started looking into this, I just wanted to post an update here:

The patch is looking a little rough and messy at this point, but I think it's possible to implement the workaround described in Dave's comment. It'll need some serious polish, review, and testing, but so far it's pretty nicely contained behind a serverParameter, so the risk to existing workloads is minimal.

I recently encountered some difficulty around determining whether we've reached the end of plan enumeration. The approach so far is to first list the sub-plans in "lockstep iteration order" (or whatever you want to call that) as described above, then once those are exhausted go back to the "normal" or "traditional" iteration order. In this second "phase" you'll still occasionally encounter one of those "lockstep" plans, and will have to skip it. We run into some trouble when we have to skip one like the above, and it's also the last plan to enumerate. The internal code and API doesn't really allow you to easily detect and communicate this. It's a tractable problem, but that's what remains to look into. I'm pausing on this investigation for a little while to get some other small test fixes pushed and reviews done, but I hope to resume within the week.

Comment by David Storch [ 10/Aug/18 ]

Asya points out that our plan enumeration code might fall over like this less if it changes its enumeration strategy from something like:

  • A, A, A
  • A, A, B
  • A, A, C
  • ...

To something like

  • A, A, A
  • B, B, B
  • C, C, C
  • ...

This is a heuristic that might help us to force the plan selection code to consider the best index in more cases, short of a more complete solution.

Generated at Thu Feb 08 04:42:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.