- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
The pipeline dependency graph eagerly computes field-level and stage-level path dependencies.
This accounts for 65% of the time spent building the graph in some tests (profiling PipelineOptimizationBMFixture/BM_BuildDependencyGraph/1000), due to the various APIs we need to call, such as DocumentSource::getDependencies and Expression::getDependencies.
However, we usually don't need this information, and it is likely often computed, discarded, and re-computed when the graph resizes.
We only need the dependency information for:
- getDeadFields - for dead code elimination
- SERVER-127536 - Pipeline::getDependencies replacement
We can create a fast path in the graph by specifying whether we care about field-level dependencies. If we don't, we can bypass the DepsTracker and use a constant, conservative FieldDependencies (one that reports RNG and wholeDoc deps).
This way, we avoid tracking dependencies during normal pipeline rewrites, which might not care about them.
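A minimal, self-contained sketch of the proposed fast path follows. It is not the server code: `Stage`, `DependencyTracking`, `computeExactDependencies`, `conservativeDependencies`, `buildGraphDependencies`, and this simplified `FieldDependencies` struct are all hypothetical stand-ins, and the real graph would call DocumentSource::getDependencies and Expression::getDependencies on the slow path.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the graph's per-stage dependency summary.
struct FieldDependencies {
    std::set<std::string> fields;
    bool needsWholeDocument = false;
    bool needsRandomGenerator = false;
};

// Hypothetical stand-in for a pipeline stage.
struct Stage {
    std::vector<std::string> referencedFields;
    bool usesRandom = false;
};

// Hypothetical switch: does the caller care about field-level dependencies?
enum class DependencyTracking { kFieldLevel, kConservative };

// Counts slow-path invocations so the effect of the fast path is observable.
static std::size_t gSlowPathCalls = 0;

// Slow path: stands in for the DepsTracker-based computation.
FieldDependencies computeExactDependencies(const Stage& stage) {
    ++gSlowPathCalls;
    FieldDependencies deps;
    deps.fields.insert(stage.referencedFields.begin(), stage.referencedFields.end());
    deps.needsRandomGenerator = stage.usesRandom;
    return deps;
}

// Fast path: one shared constant that conservatively claims the whole
// document and the random generator, so no rewrite can wrongly assume a
// field is unused.
const FieldDependencies& conservativeDependencies() {
    static const FieldDependencies kConservative = [] {
        FieldDependencies deps;
        deps.needsWholeDocument = true;
        deps.needsRandomGenerator = true;
        return deps;
    }();
    return kConservative;
}

// Builds the per-stage dependency list, skipping the expensive computation
// unless the caller asked for field-level precision.
std::vector<FieldDependencies> buildGraphDependencies(const std::vector<Stage>& stages,
                                                      DependencyTracking mode) {
    std::vector<FieldDependencies> out;
    out.reserve(stages.size());
    for (const auto& stage : stages) {
        if (mode == DependencyTracking::kFieldLevel) {
            out.push_back(computeExactDependencies(stage));
        } else {
            out.push_back(conservativeDependencies());
        }
    }
    assert(out.size() == stages.size());
    return out;
}
```

In this sketch, callers such as getDeadFields would request `kFieldLevel`, while ordinary rewrites would pass `kConservative` and never touch the slow path.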
We should only make this change if graph building actually turns out to be too slow and we haven't added other code that requires the field-level deps.
- is related to SERVER-127536 Implement Pipeline::getDependencies using the dependency graph
- Backlog