-
Type:
Improvement
-
Resolution: Unresolved
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Integration
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Even though we treat and speak of view resolution as a discrete step in the sequence of executing an aggregation, the implementation works recursively, with these steps:
- enter runAggregate()
- perform some initial operations/validation
- acquire information about the namespace from the catalog
- if the namespace is a view, go back to step (1) with the resolved view and new underlying namespace
- parse/prepare/optimize the pipeline
- convert pipelines to executors and execute the query
We resolve the view entirely at once (meaning if there are nested views, they're all resolved at once), so the max recursion depth is 1. That means we have the opportunity to remove the recursion and instead refactor the code in a way that mirrors the way we speak about view resolution as a discrete step:
- enter runAggregate()
- perform some initial operations/validation
- acquire information about the namespace from the catalog
- if the namespace is a view:
- resolve the view and get underlying namespace
- perform whichever operations/validations should be duplicated from (2) with the resolved pipeline
- acquire information about the underlying namespace from the catalog
- parse/prepare/optimize the pipeline
- convert pipelines to executors and execute the query
Why?
The primary benefits I see here are (1) having the implementation match the way we speak about view resolution as a discrete step and as a byproduct (2) being very clear about what operations do or don't need to be repeated once the view is resolved. Right now, any code placed at the top of runAggregate() will be run twice; but we should instead be intentional about which code actually needs to be run again after the view is resolved.
This change would not have been easy prior to SERVER-82228, but now that we have the AggregationExecutionContext tracking the agg state, this change is much more doable. I've posted a quick incomplete PoC here. Note the PoC assumes a chunk of SERVER-93536 will be done first too.
- is related to
-
SERVER-93536 Move ownership of agg request structures to aggregation execution context
-
- Backlog
-
-
SERVER-82228 Refactor run_aggregate with a AggregationPlan class
-
- Closed
-