|
The main parts of this change are:
- Parameterize the entire forest of MatchExpressions that can now be pushed down to SBE, instead of just the primary one. Before this feature's additional $match pushdowns, there was only a single MatchExpression per query in SBE, so the forest only contained one tree. Dealing with a true forest required refactoring a few things.
- Encode the new trees of the forest into the plan cache key.
- Bind the parameters for the new trees at bind-in time.
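
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `Leaf`, `Tree`, `Forest`, `parameterizeForest`, `encodeCacheKey`, and `bindParams` are hypothetical names, and a "tree" is flattened to a list of comparison leaves for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a MatchExpression tree: a flat list of
// comparison leaves, each holding a constant that may become a parameter.
struct Leaf {
    std::string path;  // field path, e.g. "a"
    int constant;      // literal value taken from the query
    int paramId = -1;  // -1 means "not parameterized"
    Leaf(std::string p, int c) : path(std::move(p)), constant(c) {}
};
using Tree = std::vector<Leaf>;
using Forest = std::vector<Tree>;

// Step 1: assign parameter ids to every tree in the forest from one shared
// counter, so ids are unique across the whole forest. Returns the id count.
int parameterizeForest(Forest& forest) {
    int nextId = 0;
    for (Tree& tree : forest)
        for (Leaf& leaf : tree)
            leaf.paramId = nextId++;
    return nextId;
}

// Step 2: encode every tree into the cache key. A parameterized constant is
// replaced by "?", so queries that differ only in constants share a key.
std::string encodeCacheKey(const Forest& forest) {
    std::string key;
    for (const Tree& tree : forest) {
        key += '[';
        for (const Leaf& leaf : tree) {
            key += leaf.path;
            if (leaf.paramId >= 0) {
                key += "=?";
            } else {
                key += "=" + std::to_string(leaf.constant);
            }
            key += ';';
        }
        key += ']';
    }
    return key;
}

// Step 3: at bind-in time, copy each parameterized leaf's constant into the
// slot for its id, visiting every tree rather than only the first.
std::vector<int> bindParams(const Forest& forest, int paramCount) {
    std::vector<int> slots(static_cast<std::size_t>(paramCount));
    for (const Tree& tree : forest)
        for (const Leaf& leaf : tree)
            if (leaf.paramId >= 0) {
                assert(leaf.paramId < paramCount);
                slots[static_cast<std::size_t>(leaf.paramId)] = leaf.constant;
            }
    return slots;
}
```

In this sketch, two forests with the same shape but different constants produce the same key once parameterized, and binding fills one slot vector from all trees.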
Additional related changes:
- Enforce the maxMatchExpressionParams limit (default 512) globally across the forest instead of only per tree.
- Stop parameterizing as soon as the limit is reached. (The code that existed before this project would parameterize the entire primary MatchExpression even if that created far more than 512 parameters, then roll everything back in a second step if the count turned out to exceed the limit.)
- Eliminate revertMode (the rollback of previously created MatchExpression parameters when the limit was exceeded). The rollback is unnecessary: to avoid cache flooding, we won't cache plans that exceeded the limit, and a partially parameterized plan will still work correctly. The rollback was previously needed because the parameterization pass might have created orders of magnitude more parameters than the limit, which made the bind-in phase slow; with this PR we stop creating parameters after 512, so that problem goes away. (Rolling back would also be harder to do in a forest than in a single tree, as it would need to revisit previously completed trees.)
- Consolidate the CQ-related checks for whether to parameterize into a new CanonicalQuery::shouldParameterizeSbe() method.
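
The global limit and the removal of the rollback can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ParameterizeResult`, `parameterizeWithGlobalLimit`, and `shouldCachePlan` are hypothetical names, and each tree is reduced to a list of leaves whose value is either a parameter id or -1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: each tree is a list of leaves. A leaf holds its
// parameter id, or -1 if it kept its literal constant.
using Tree = std::vector<int>;
using Forest = std::vector<Tree>;

struct ParameterizeResult {
    std::size_t paramsCreated;  // parameters created across the whole forest
    bool limitExceeded;         // true if at least one leaf was left as a literal
};

// One parameter budget is shared by all trees. When it runs out we stop
// immediately and keep the parameters already created (no rollback).
ParameterizeResult parameterizeWithGlobalLimit(Forest& forest, std::size_t maxParams) {
    ParameterizeResult result{0, false};
    for (Tree& tree : forest) {
        for (int& leaf : tree) {
            if (result.paramsCreated == maxParams) {
                result.limitExceeded = true;
                return result;
            }
            leaf = static_cast<int>(result.paramsCreated++);
        }
    }
    assert(result.paramsCreated <= maxParams);
    return result;
}

// Assumption for this sketch: a plan that hit the limit is simply not cached,
// which is why the partially parameterized state does not need reverting.
bool shouldCachePlan(const ParameterizeResult& result) {
    return !result.limitExceeded;
}
```

With a limit of 4 and three trees of two leaves each, the first two trees are fully parameterized, the third is left untouched, and the plan is marked as not cacheable.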
Naming changes for code clarity and developer productivity:
- Rename CanonicalQuery::_root and root() to _primaryMatchExpression and getPrimaryMatchExpression(). "root" was a confusing name for several reasons: it is not the root of the query plan or execution tree (it is in fact part of the bottom leaf of both, which is the opposite of the root); it is not any kind of query tree, though being a member of CanonicalQuery made it sound like one; and the name was not unique, which reduced developer productivity (grepping for 'root(' gets more than 1,300 hits in the codebase).
- Rename CanonicalQuery::init() to initCq() to improve uniqueness (grepping for 'init(' gets more than 3,000 hits).
- Rename CanonicalQuery::_pipeline and setPipeline() to _cqPipeline and setCqPipeline() to improve uniqueness (grepping for 'pipeline' gets more than 9,000 hits, and most of them are likely about aggregation pipelines, not the CQ pushdown pipeline).
|