[SERVER-83712] Consider removing exact number of elements in $in from plan cache key Created: 29/Nov/23 Updated: 18/Jan/24 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Ivan Fefer | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: |
Query Execution
|
||||||||||||||||
| Sprint: | QE 2023-12-11, QE 2023-12-25, QE 2024-01-08 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Story Points: | 3 | ||||||||||||||||
| Description |
|
In SBE plan cache key, we encode the number of elements in $in in it: This is done because of explode for sort SBE plan will have the number of branches equal to the number of elements in $in. However, this will lead to worse plan caching even for queries, where is no plan caching. It is a fact that we have users that use $in's with arbitrary amount of elements from 1 to thousands. Also, if a query have multiple $ins, it may lead to exponential growth. One suggestion is instead of generating a unique plan for every $in size when exploding for sort, we can generate plans for all powers of 2 (or 4) and use the smallest plan that can fit all $in values. |
| Comments |
| Comment by David Storch [ 01/Dec/23 ] |
|
The simpler fix ivan.fefer@mongodb.com referred to above is happening under related ticket |
| Comment by Ivan Fefer [ 30/Nov/23 ] |
|
I decided to first implement simple fix to at least provide a finite bind to plan cache growth, before working on a more complex fix. |
| Comment by Ivan Fefer [ 30/Nov/23 ] |
|
I decided to first implement simple fix to at least provide a finite bind to plan cache growth, before working on a more complex fix. |
| Comment by Ivan Fefer [ 30/Nov/23 ] |
|
One problem is that even if we change explodeForSort code to pad the plan with empty IXSCANs, we can't do the same for $in size, as $in doesn't translate one-to-one to the number of exploded scans. Example `db.find({a: {$in: [1, 2, 3]}, b: {$in: [4, 5, 6]]}}).sort({c: 1})` - will explode into 9 IXSCANs. And $in size is not the only things that affect the number of IXSCANS - there are also $or. So bucketing solution is not correct, but proposed simple fix should be correct at least. |
| Comment by Ivan Fefer [ 29/Nov/23 ] |
|
Simple, but less efficient fix would be to add std::min(<$in size>, internalQueryMaxScansToExplode) when calculating plan cache key. |