[SERVER-69774] Investigate caching prepared plans in SBE Created: 16/Sep/22  Updated: 14/Mar/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Kyle Suarez Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-68960 Investigate performance of point inde... Closed
Assigned Teams:
Query Execution
Sprint: QE 2022-09-19
Participants:
Story Points: 20

 Comments   
Comment by Anna Wawrzyniak [ 29/Sep/22 ]

Caching prepared plans is viable, but productionizing it may require invasive PlanCache changes.

 

A prototype using a side cache, without eviction:
https://github.com/10gen/mongo/tree/anna.wawrzyniak/idhack_prepared_cache_stage

 

For fast id hack queries tested with the PoC:
before: 8571 qps, preparation time: 28.30us
after: 9405 qps, preparation time: 20.15us

The difference becomes more significant as the query plan grows.
See related document for idhack: https://docs.google.com/document/d/1iYWU2N8230N0ipSIXeRkb8lyH-_qZB-o7ucuADCSZCA/edit#
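
As a quick check on the figures above, the relative improvement can be worked out directly. This is only arithmetic on the two reported measurements; the variable names are illustrative.

```python
# Figures copied from the PoC measurements in this comment.
qps_before, qps_after = 8571, 9405
prep_before_us, prep_after_us = 28.30, 20.15

# Relative throughput gain and relative preparation-time saving.
qps_gain = (qps_after - qps_before) / qps_before
prep_saving = (prep_before_us - prep_after_us) / prep_before_us

print(f"throughput gain: {qps_gain:.1%}")
print(f"preparation time saved: {prep_saving:.1%}")
```

That is roughly a 9.7% throughput gain for a 28.8% reduction in preparation time on this workload.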

Investigated approaches: 

a) For plans eligible for pooling, store prepared plans in a LIFO pool kept as part of the cached plan.

  • The PlanCache::get API (for classic and SBE) returns a CachedPlanHolder. The holder is opinionated and implicitly performs a deep copy of the entry on creation.
    If the entry were to contain the pool, the deep copy would no longer have well-defined semantics (we don't want to deep copy the pool).
    Supporting that requires changing the PlanCache API to return a reference to the CacheEntry instead, and then letting classic and SBE perform the cloning when needed.
    In the case of SBE, the cloning would be skipped if the plan can be obtained from the pool.
  • PlanCache assumes that cached plans are immutable, and storing a mutable pool would require relaxing that constraint. Immutability is assumed so that the plan stage and stage data can be safely cloned outside of the cache lock. It is also assumed for budget estimation, where the size of the entry is precomputed at entry creation.
    Using a pool would require the ability to refresh the entry size after the pool is modified.
  • YieldPolicy is not part of the plan key, but may affect plan preparation, unlike plan construction, which does not use YieldPolicy. This may require either constraining plan pooling to a single yield policy (NO_YIELD) or storing separate pools per policy type.
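
A minimal Python sketch of approach (a), written only to illustrate the shape of the idea and not the actual PlanCache code. Every name here (CacheEntry, PreparedPlan, acquire, release, estimated_size) is hypothetical, sizes are made-up numbers, and locking is omitted even though a real implementation would need it. It shows one LIFO pool per yield policy stored inside the entry, cloning skipped on a pool hit, and the entry size recomputed after the pool changes.

```python
from dataclasses import dataclass, field
from enum import Enum


class YieldPolicy(Enum):
    NO_YIELD = "NO_YIELD"
    YIELD_AUTO = "YIELD_AUTO"


@dataclass
class PreparedPlan:
    """Stand-in for a prepared plan; size_bytes is a made-up estimate."""
    source: str
    size_bytes: int


@dataclass
class CacheEntry:
    """Hypothetical entry: an immutable plan plus a mutable pool per yield policy."""
    plan: str
    plan_size_bytes: int
    pools: dict = field(default_factory=dict)  # YieldPolicy -> list used as a stack

    def estimated_size(self) -> int:
        # Recomputed on demand, since pool contents change after entry creation.
        pooled = sum(p.size_bytes for pool in self.pools.values() for p in pool)
        return self.plan_size_bytes + pooled

    def acquire(self, policy: YieldPolicy) -> PreparedPlan:
        pool = self.pools.get(policy)
        if pool:
            return pool.pop()  # LIFO: reuse the most recently released plan
        # Pool miss: stands in for "clone the cached plan, then prepare it".
        return PreparedPlan(source=self.plan, size_bytes=self.plan_size_bytes)

    def release(self, policy: YieldPolicy, prepared: PreparedPlan) -> None:
        # Return a prepared plan to the pool that matches its yield policy.
        self.pools.setdefault(policy, []).append(prepared)


entry = CacheEntry(plan="idhack", plan_size_bytes=100)
first = entry.acquire(YieldPolicy.NO_YIELD)    # miss: cloned and prepared
entry.release(YieldPolicy.NO_YIELD, first)
size_with_pooled_plan = entry.estimated_size()
second = entry.acquire(YieldPolicy.NO_YIELD)   # hit: same object, no clone
other = entry.acquire(YieldPolicy.YIELD_AUTO)  # different policy, separate pool
```

Keying the pools by policy is the "separate pools per policy type" option; restricting pooling to NO_YIELD would reduce `pools` to a single list.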

b) Store prepared plans in a separate cache that maps plan key -> pool.

  • Requires coordinating cache invalidation between the main plan cache and the plan pool. We need to avoid the situation where a plan has been evicted from the main cache, but we still use a plan from the pool cache that was based on it. This could be done with either callbacks or version checks.
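
The version-check option for approach (b) could look roughly like the following Python sketch. It is an illustration under assumptions, not the server's design: MainPlanCache, PlanPoolCache, and their methods are hypothetical names, each main-cache entry is assumed to carry a monotonically increasing version, and synchronization is left out.

```python
class MainPlanCache:
    """Hypothetical main cache: plan key -> (plan, version)."""

    def __init__(self):
        self._entries = {}
        self._next_version = 1

    def put(self, key, plan):
        # Every (re)insertion gets a fresh version number.
        version = self._next_version
        self._next_version += 1
        self._entries[key] = (plan, version)
        return version

    def evict(self, key):
        self._entries.pop(key, None)

    def version_of(self, key):
        entry = self._entries.get(key)
        return entry[1] if entry else None


class PlanPoolCache:
    """Hypothetical side cache: plan key -> (version, LIFO pool of prepared plans)."""

    def __init__(self, main):
        self._main = main
        self._pools = {}

    def release(self, key, version, prepared):
        if self._main.version_of(key) != version:
            return  # built from an evicted or replaced plan: drop it
        pool_version, pool = self._pools.get(key, (version, []))
        if pool_version != version:
            pool = []  # discard plans pooled for an older version
        pool.append(prepared)
        self._pools[key] = (version, pool)

    def acquire(self, key):
        stored = self._pools.get(key)
        if stored is None:
            return None
        pool_version, pool = stored
        if self._main.version_of(key) != pool_version:
            del self._pools[key]  # main entry evicted or replaced: pool is stale
            return None
        return pool.pop() if pool else None


main = MainPlanCache()
pools = PlanPoolCache(main)
v1 = main.put("k", "plan")
pools.release("k", v1, "prepared-1")
hit = pools.acquire("k")          # versions match, pooled plan is reused
pools.release("k", v1, "prepared-2")
main.evict("k")
after_evict = pools.acquire("k")  # stale pool is dropped on the version check
```

The callback alternative would instead have `evict` notify the pool cache directly, trading the per-lookup version comparison for tighter coupling between the two caches.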
Generated at Thu Feb 08 06:14:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.