- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
- Minor Change
Currently both hybrid search stages ($rankFusion and $scoreFusion) analyze the desugared version of their input pipelines (as a Pipeline object).
This is not ideal because both stages place constraints on their input pipelines (each must be a selection pipeline and also a ranked or scored pipeline, respectively), and analyzing the desugared version makes these rules hard to check once valid stages (like $score) have desugared into other stages.
Instead, we should analyze the pre-desugared version of each input pipeline (that is, the input pipeline parsed as a LiteParsedPipeline), and only later parse it to a full Pipeline.
So for both $rankFusion and $scoreFusion we should:
- Parse the input pipeline to a LiteParsedPipeline
- Modify the input pipeline validator functions to analyze the LiteParsedPipeline
- After validation passes, parse to a full Pipeline; all downstream logic for both stages should remain unaffected (ideally we would "upgrade" from a LiteParsedPipeline to a full Pipeline, but that functionality does not currently exist).
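The ordering described above (validate the pre-desugared form, then build the full pipeline) can be sketched as follows. This is only an illustration under stated assumptions: `LiteStage`, `LitePipeline`, `FullPipeline`, `parseFull`, and `validateThenParse` are hypothetical stand-ins, not the server's actual classes or functions, and the stage names that `$score` expands into here are made-up placeholders.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the real MongoDB server types.
struct LiteStage {
    std::string name;  // stage name exactly as the user wrote it
};
using LitePipeline = std::vector<LiteStage>;  // pre-desugar view

struct FullPipeline {
    std::vector<std::string> stages;  // post-desugar view
};

// Placeholder for the full parse. It mimics desugaring by expanding $score
// into two invented internal stages (names are assumptions for illustration).
FullPipeline parseFull(const LitePipeline& lite) {
    FullPipeline full;
    for (const auto& stage : lite) {
        if (stage.name == "$score") {
            full.stages.push_back("$_hypotheticalInternalA");
            full.stages.push_back("$_hypotheticalInternalB");
        } else {
            full.stages.push_back(stage.name);
        }
    }
    return full;
}

using LiteValidator = std::function<bool(const LitePipeline&)>;

// Validate on the lite form first; only parse to the full form on success.
std::optional<FullPipeline> validateThenParse(const LitePipeline& lite,
                                              const LiteValidator& isValid) {
    if (!isValid(lite)) {
        return std::nullopt;  // rejected before any desugaring happens
    }
    return parseFull(lite);
}
```

Because the validator only ever sees the user-written stage names, a rule such as "contains `$score`" stays checkable even though the full pipeline no longer contains a stage with that name.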
Furthermore, we should look for all opportunities to consolidate logic between $rankFusion and $scoreFusion in the document_source_hybrid_scoring_util files. Here is my suggestion:
- Have 3 util functions that all analyze a LiteParsedPipeline: isSelectionPipeline, isRankedPipeline, and isScoredPipeline
Then the input pipeline validation functions can look like:
- $rankFusion: isSelectionPipeline && isRankedPipeline
- $scoreFusion: isSelectionPipeline && isScoredPipeline
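A minimal sketch of how the three predicates and the two validators could compose. It models a LiteParsedPipeline as a plain list of stage names (`StageNames`, a hypothetical stand-in), and the stage sets it uses are assumptions chosen for illustration; the real rules live in the server and may differ (for example, this sketch ignores `$match` with `$text`).

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a LiteParsedPipeline: just the stage names.
using StageNames = std::vector<std::string>;

// Assumed stage sets, for illustration only.
const std::set<std::string> kSelectionStages = {
    "$match", "$search", "$vectorSearch", "$geoNear",
    "$sort",  "$skip",   "$limit",        "$score"};
const std::set<std::string> kRankedFirstStages = {
    "$search", "$vectorSearch", "$geoNear"};
const std::set<std::string> kScoredFirstStages = {"$search", "$vectorSearch"};

bool containsStage(const StageNames& p, const std::string& name) {
    return std::find(p.begin(), p.end(), name) != p.end();
}

// Every stage only selects or orders documents (assumed definition).
bool isSelectionPipeline(const StageNames& p) {
    return std::all_of(p.begin(), p.end(), [](const std::string& s) {
        return kSelectionStages.count(s) > 0;
    });
}

// Starts with an implicitly ranked stage, or contains an explicit $sort.
bool isRankedPipeline(const StageNames& p) {
    if (p.empty()) return false;
    return kRankedFirstStages.count(p.front()) > 0 || containsStage(p, "$sort");
}

// Starts with an implicitly scored stage, or contains an explicit $score.
bool isScoredPipeline(const StageNames& p) {
    if (p.empty()) return false;
    return kScoredFirstStages.count(p.front()) > 0 || containsStage(p, "$score");
}

bool isValidRankFusionInput(const StageNames& p) {
    return isSelectionPipeline(p) && isRankedPipeline(p);
}

bool isValidScoreFusionInput(const StageNames& p) {
    return isSelectionPipeline(p) && isScoredPipeline(p);
}
```

Sharing `isSelectionPipeline` between both validators is the consolidation the suggestion is after: only the ranked versus scored check differs between the two stages.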
Also remember to retain the generatesMetadataType(DocumentMetadataFields::kScore) check once a Pipeline is created (after LiteParsedPipeline input pipeline analysis).
This change should also set us up to support nested hybrid searches, if we later choose to support these. It might end up being trivial to support if this goes well.
Testing:
All existing tests in our hybrid search suites should continue to pass.
Add tests that place a $score with $minMaxScaler normalization inside a $scoreFusion input pipeline. This should already have worked, but it does not, because $score with $minMaxScaler desugars into stages that currently fail input pipeline validation, so this change should also fix that case. Make sure the tests cover the cases where $scoreFusion and $score each request scoreDetails and where they do not.
- is depended on by: SERVER-104730 Explicitly ban nested $rankFusions and $scoreFusions (Open)