[SERVER-20616] Plan ranker sampling from the beginning of a query's execution can result in poor plan selection Created: 24/Sep/15 Updated: 30/Nov/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 2.6.11, 3.0.6, 3.2.0 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 11 |
| Labels: | bonsai |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Assigned Teams: |
Query Optimization
|
| Backwards Compatibility: | Fully Compatible |
| Participants: | |
| Case: | (copied to CRM) |
| Description |
|
The query engine currently selects a winning plan among n candidate plans by the following mechanism. We start executing each of the n plans for some short trial period. Once the trial period ends, we use stats collected from the brief trial execution to score the plans. The most highly scored plan becomes the winner. The scoring is based on a notion of "productivity", essentially the ratio of results produced per amount of work (where work is an artificial unit that is a proxy for CPU + IO time). You can think of the trial execution period as a way to sample the data on the fly, a kind of adaptive query processing technique in which inferences about the data are made while the query is executing.

The problem is that this sampling strategy may not accurately represent the true data distribution. For example, suppose that some candidate plan p1 is scanning the index {a: 1} over the interval [0, 100000] in order to answer the query predicate {a: {$gte: 0, $lte: 100000}, b: "selectiveString"}. There is an alternative plan p2 which simply looks up all docs where "b" is equal to some rare string, "selectiveString", in the {b: 1} index. Clearly p2 is the better plan: instead of scanning a large index range and filtering out all the documents without the correct value for "b", we do a lookup based on the more selective predicate. However, depending on the data, the plan ranking algorithm described above might lead us to choose the suboptimal plan p1 instead of p2. Suppose that there is a correlation between the "a" and "b" fields such that all of the documents where "b" is "selectiveString" also have an "a" value of 0. Since p1 is scanning index {a: 1} in order, we encounter all the documents where "a" is 0 and "b" is "selectiveString" first during the trial execution period. This can spuriously make p1 look as good as p2!

If instead we were to sample randomly from the data where "a" is on the interval [0, 100000], the query engine would quickly have observed that plan p1 is much slower than p2. Real data is often correlated, so despite the fairly contrived example, this can be a problem in practice. |
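The scoring effect described above can be illustrated with a small back-of-the-envelope simulation. This is a hypothetical sketch, not MongoDB code; the document counts and trial length are invented, and "productivity" is simply the results-per-work ratio described in this ticket.

```python
# Simulated collection: 100,000 docs, where every doc matching
# b == "selectiveString" has a == 0 (the perfectly correlated case).
NUM_DOCS = 100_000
SELECTIVE = 100  # assumed count of docs with b == "selectiveString"

def productivity(results, work):
    """Score = results produced per unit of work, as described above."""
    return results / work if work else 0.0

def trial_p1_in_order(trial_work):
    """p1 scans {a: 1} in order, so the correlated matches all come first:
    every document examined during a short trial is a match."""
    return productivity(min(trial_work, SELECTIVE), trial_work)

def trial_p1_random(trial_work):
    """If the trial instead sampled the interval randomly, matches would
    appear at their true rate of SELECTIVE / NUM_DOCS."""
    expected_matches = trial_work * SELECTIVE / NUM_DOCS
    return productivity(expected_matches, trial_work)

def trial_p2(trial_work):
    """p2 looks up b == "selectiveString" in {b: 1}; every fetch matches."""
    return productivity(min(trial_work, SELECTIVE), trial_work)

TRIAL = 100  # work units in the trial period
print(trial_p1_in_order(TRIAL))  # 1.0 -- p1 spuriously ties with p2
print(trial_p2(TRIAL))           # 1.0
print(trial_p1_random(TRIAL))    # ~0.001 -- random sampling exposes p1
```

With in-order trial execution both plans score a perfect 1.0, so the ranker may pick p1; a random sample over the same interval would score p1 roughly a thousand times worse.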
| Comments |
| Comment by Xiaochen Wu [ 18/Aug/22 ] |
|
This can be reproduced with a script along the following lines (reconstructed; the loop bodies in the original comment were garbled, so the correlated data below is an assumption matching the scenario in the description):

import pymongo

client = pymongo.MongoClient()
coll = client.test.indextie
coll.delete_many({})
indexes = coll.index_information()
coll.create_index([("a", pymongo.ASCENDING)])
coll.create_index([("b", pymongo.ASCENDING)])  # created second
data = []
for i in range(1000):
    for j in range(1000):
        data.append({"a": i, "b": i * 1000 + j})
coll.insert_many(data)

The query should use the {b: 1} index, but {a: 1} was used because it was created first. |
| Comment by Hoyt Ren [ 18/Mar/16 ] |
|
I tried v3.2.4 and still see the problem. |
| Comment by Hoyt Ren [ 03/Jan/16 ] |
|
I tried v3.2, and the problem is still there. |
| Comment by Hoyt Ren [ 07/Dec/15 ] |
|
Yes, it's a good direction to try. I believe the difficulty is doing this without affecting query performance much. Maybe we could store some statistics in (or alongside) the index? By the way, I found the issue about manually controlling index selection, but it seems nobody has updated it in a long time. |
| Comment by David Storch [ 04/Dec/15 ] |
|
Here's one idea for how to fix this problem. We could extend the MultiPlanStage / CachedPlanStage trial period indefinitely. Instead of buffering the first batch of results and then throwing them out if we have to re-plan the query, we could keep a running estimate of index selectivity. If at any point during query execution the selectivity drops off, we could trigger a replan. This would probably involve buffering RecordIds for every document in the result set so that we can dedup after a replan. There would be a few details to work out. For example, how many RecordIds would we be willing to store for each query? How would we deal with sorts? We would probably have to remember the sort key of the last document returned and throw out any results from the new plan on the wrong side of the sort. |
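The replanning idea in the comment above can be sketched roughly as follows. This is a hypothetical illustration, not the actual MultiPlanStage/CachedPlanStage code; the class names, the 5% threshold, and the 1000-document window are all invented for the example.

```python
class NeedsReplan(Exception):
    """Raised when the running selectivity estimate says the plan is bad."""
    def __init__(self, seen_ids):
        super().__init__("selectivity dropped; replan needed")
        self.seen_ids = seen_ids  # the new plan skips these RecordIds to dedup

class AdaptivePlanRunner:
    """Runs a plan while tracking results returned per document examined."""
    def __init__(self, plan, replan_threshold=0.05, window=1000):
        self.plan = plan                  # iterator of (record_id, matched) pairs
        self.threshold = replan_threshold
        self.window = window              # examine this many docs before judging
        self.examined = 0
        self.matched = 0
        self.returned_ids = set()         # buffered RecordIds for post-replan dedup

    def selectivity(self):
        return self.matched / self.examined if self.examined else 1.0

    def run(self):
        for record_id, matched in self.plan:
            self.examined += 1
            if matched:
                self.matched += 1
                self.returned_ids.add(record_id)
                yield record_id
            # Past the trial window, bail out if productivity has dropped off.
            if self.examined >= self.window and self.selectivity() < self.threshold:
                raise NeedsReplan(self.returned_ids)
```

In the correlated case from the description, a plan whose early documents all match starts out looking productive, but once the matches dry up the running estimate falls below the threshold and a replan fires, carrying the already-returned RecordIds so the new plan can deduplicate. The open questions from the comment (bounding the RecordId buffer, handling sorts via the last sort key) are exactly the parts this sketch omits.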
| Comment by Hoyt Ren [ 18/Nov/15 ] |
|
Hello friends. Since making an AI isn't easy, I've seen some people here mention allowing the user to manually select the index, as in traditional databases (I can't remember the issue ID). Would this be a quicker fix? I believe it isn't a problem to ask users to supply some additional parameters, given that all of us have used SQL for such a long time. However, if MongoDB can achieve this automatically, it will be a great achievement, like many others we can see in today's MongoDB. Really, thank you for your exciting work. |