Change desugaring of $rankFusion and $scoreFusion to wrap first pipeline in a $unionWith


    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Integration
    • 3
    • TBD

      Right now, hybrid search stages ($rankFusion and $scoreFusion) desugar into pipelines like:

      [
        <input_pipeline_1>,
        $unionWith: {<input_pipeline_2>},
        $unionWith: {<input_pipeline_3>},
        ...
        <merging_stages>
      ]

      Change both $rankFusion and $scoreFusion to wrap the first input pipeline in its own $unionWith as well, like:

      [
        $unionWith: {<input_pipeline_1>},
        $unionWith: {<input_pipeline_2>},
        $unionWith: {<input_pipeline_3>},
        ...
        <merging_stages>
      ] 
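
The two shapes above can be sketched as plain list construction. This is only an illustration of the stage layout, not the server's implementation: the function names, the `coll` argument, and the `MERGING_STAGES` placeholder are assumptions made for the example, and the real merging stages are not spelled out in this ticket.

```python
from typing import Any

Stage = dict[str, Any]
Pipeline = list[Stage]

# Hypothetical placeholder: the ticket does not list the real merging stages.
MERGING_STAGES: Pipeline = [{"$_mergingPlaceholder": {}}]


def union_with(coll: str, pipeline: Pipeline) -> Stage:
    """Wrap a pipeline as the sub-pipeline of a $unionWith stage."""
    return {"$unionWith": {"coll": coll, "pipeline": pipeline}}


def desugar_current(coll: str, inputs: list[Pipeline]) -> Pipeline:
    """Current shape: first input inline, remaining inputs in $unionWith."""
    first, *rest = inputs
    return [*first, *(union_with(coll, p) for p in rest), *MERGING_STAGES]


def desugar_proposed(coll: str, inputs: list[Pipeline]) -> Pipeline:
    """Proposed shape: every input, including the first, in a $unionWith."""
    return [*(union_with(coll, p) for p in inputs), *MERGING_STAGES]
```

With these helpers, the only difference between the two outputs is whether the stages of the first input pipeline appear at the top level or inside a `$unionWith`.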

      The high-level reasoning is that we will no longer have to consider input pipelines running both inside and outside of a $unionWith: all input pipelines will behave consistently, because each one desugars into the sub-pipeline of a $unionWith.

       

      For example, right now two hybrid searches that are identical except for the order of their input pipelines desugar differently: each leaves a different input pipeline outside of a $unionWith, yet both must produce identical results.
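
A small sketch of the reordering case, under simplifying assumptions: the merging stages and the `$unionWith` collection name are omitted, the stage specs are left empty, and all helper names are hypothetical. It lists which stages end up outside any `$unionWith` for each input order.

```python
def current_shape(inputs):
    """Current desugaring: first input inline, the rest inside $unionWith."""
    first, *rest = inputs
    return list(first) + [{"$unionWith": {"pipeline": p}} for p in rest]


def proposed_shape(inputs):
    """Proposed desugaring: every input inside its own $unionWith."""
    return [{"$unionWith": {"pipeline": p}} for p in inputs]


def top_level_stages(pipeline):
    """Names of the stages that run outside any $unionWith."""
    return [name for stage in pipeline for name in stage if name != "$unionWith"]


# Hypothetical input pipelines with empty stage specs.
vector = [{"$vectorSearch": {}}]
text = [{"$search": {}}]

print(top_level_stages(current_shape([vector, text])))   # ['$vectorSearch']
print(top_level_stages(current_shape([text, vector])))   # ['$search']
print(top_level_stages(proposed_shape([vector, text])))  # []
print(top_level_stages(proposed_shape([text, vector])))  # []
```

Under the current shape the inline stage changes with the input order, while under the proposed shape nothing runs outside a `$unionWith` in either order.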

       

      This should increase modularity and reduce the cognitive load of reasoning about how hybrid search stages desugar.

       

      One concrete instance of this is enabling search input pipelines in a $rankFusion on a view. With the current desugaring, the produced query may or may not be a top-level search pipeline, depending only on whether the first input pipeline is a search pipeline. With the new strategy, a hybrid search pipeline will never be a top-level search on a view, and search on a view will always be handled inside each $unionWith (which should already be supported). It's very possible that after this change, combined with the changes for running a $rankFusion on a view with non-search input pipelines, no additional changes will be needed to support search input pipelines. We will test this after this change.
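
To make the view case concrete, the check below asks whether a desugared pipeline starts with a search stage at the top level. It is a sketch under assumptions: the `SEARCH_STAGES` set, the `movies_view` name, and the helper functions are made up for illustration, and the merging stages are left out.

```python
# Assumed set of stages that make a pipeline a "search pipeline".
SEARCH_STAGES = frozenset({"$search", "$vectorSearch"})


def is_top_level_search(pipeline):
    """True when the first top-level stage is a search stage."""
    if not pipeline:
        return False
    first_stage_name = next(iter(pipeline[0]))
    return first_stage_name in SEARCH_STAGES


def wrap(pipeline):
    """Place a pipeline inside a $unionWith on a hypothetical view."""
    return {"$unionWith": {"coll": "movies_view", "pipeline": pipeline}}


search_input = [{"$search": {}}]       # search input pipeline (empty spec)
sort_input = [{"$sort": {"title": 1}}]  # non-search input pipeline

# Current desugaring: the answer depends on which input comes first.
print(is_top_level_search([*search_input, wrap(sort_input)]))  # True
print(is_top_level_search([*sort_input, wrap(search_input)]))  # False

# Proposed desugaring: never a top-level search, whatever the order.
print(is_top_level_search([wrap(search_input), wrap(sort_input)]))  # False
print(is_top_level_search([wrap(sort_input), wrap(search_input)]))  # False
```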

       

      Another concrete instance of this advantage: when/if we tackle running all input pipelines in parallel, we will only need to parallelize the sub-pipeline execution of a $unionWith, and hybrid search should then execute its input pipelines in parallel out of the box in all cases.

            Assignee:
            Unassigned
            Reporter:
            Joe Shalabi
            Votes:
            0
            Watchers:
            1
