Make $rankFusion support $vectorSearch-like extension stages


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Integration

      Context

      SERVER-128285 added featureFlagExtensionsInsideHybridSearch and shipped an initial batch of tests in jstests/extensions/extension_in_hybrid_search.js, jstests/noPassthrough/query/hybrid_search_on_sharded_view.js, and jstests/noPassthrough/query/hybrid_search_in_unionwith_on_sharded_view.js. Those tests cover the happy path and basic rejection cases for extensions inside $rankFusion/$scoreFusion on an unsharded plain collection and on sharded views.

      This ticket covers the remaining $rankFusion-specific gaps: topologies, view combinations, and cross-cutting concerns not included in that initial batch.


      What already exists (do not duplicate)

      File | What it covers
      jstests/extensions/extension_in_hybrid_search.js | Selection ext ($matchTopN) in $rankFusion (allowed); transforming ext ($addFieldsMatch) rejected; multi-stage non-selection-tail ext ($nativeVectorSearch) rejected. All unsharded, plain collection.
      jstests/noPassthrough/query/hybrid_search_on_sharded_view.js | $scoreFusion top-level on a sharded view. No $rankFusion equivalent.
      jstests/noPassthrough/query/hybrid_search_in_unionwith_on_sharded_view.js | $rankFusion and $scoreFusion inside $unionWith on a sharded view, with plain stages in the input pipelines (no extensions inside the hybrid input pipelines).
      jstests/with_mongot/e2e/hybridSearch/ranked_fusion_on_view.js | $rankFusion on a view with mongot $search/$vectorSearch pipelines. Requires a real mongot.
      jstests/with_mongot/e2e/hybridSearch/rank_fusion_in_union_with_lookup_view.js | $rankFusion inside $unionWith/$lookup with views, using mongot.

      Tests to implement

      All new tests should use mocha-lite style (describe/it/before from jstests/libs/mochalite.js) matching the style of extension_in_hybrid_search.js.

      Required tags for all new tests:

      /**
      * @tags: [
      * featureFlagExtensionsAPI,
      * featureFlagExtensionsInsideHybridSearch,
      * featureFlagRankFusionFull,
      * featureFlagSearchHybridScoringFull,
      * requires_fcv_82,
      * ]
      */
      

      Test 1 — $rankFusion top-level on an unsharded view

      File: jstests/extensions/rank_fusion_on_unsharded_view.js

      What to test: Run $rankFusion with plain $sort-based input pipelines directly against a view namespace on a standalone/unsharded mongod. This mirrors hybrid_search_on_sharded_view.js but without ShardingTest.

      Setup:

      • Create a collection coll with documents {_id, x, y}.
      • Create a view collView over coll with [{$match: {x: {$gte: 0}}}].

      Test case: Run $rankFusion with pipelines a: [{$sort: {x: -1}}] and b: [{$sort: {y: -1}}] against collView. Assert the command succeeds and returns all documents satisfying the view filter.
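
      A minimal sketch of the pipeline and the expected result set for this test case. The helper names (buildRankFusionStage, expectedIdsForView) and the sample documents are illustrative assumptions, not taken from an existing test file; the stage shape follows the $rankFusion: {input: {pipelines: ...}} form used elsewhere in this ticket.

```javascript
// Hypothetical helper: wraps named input pipelines in a $rankFusion stage.
function buildRankFusionStage(pipelines) {
    return {$rankFusion: {input: {pipelines}}};
}

// Hypothetical helper: models the view filter {x: {$gte: 0}} in plain JS so the
// test can compute which _ids it expects back (sorted numerically).
function expectedIdsForView(docs) {
    return docs.filter((d) => d.x >= 0).map((d) => d._id).sort((a, b) => a - b);
}

// Illustrative sample data (assumed for this sketch).
const sampleDocs = [
    {_id: 0, x: -1, y: 10},
    {_id: 1, x: 0, y: 20},
    {_id: 2, x: 3, y: 5},
];

const rankFusionPipeline = [
    buildRankFusionStage({a: [{$sort: {x: -1}}], b: [{$sort: {y: -1}}]}),
];
```

      In the jstest, rankFusionPipeline would be passed to aggregate() on collView, and the returned _ids (sorted) compared with expectedIdsForView(sampleDocs).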


      Test 2 — $rankFusion inside a $lookup subpipeline targeting a view

      File: jstests/extensions/rank_fusion_in_lookup_on_view.js

      What to test: $rankFusion placed inside a $lookup subpipeline where from: names a view. This covers the $lookup code path that the sharded $unionWith test does not. No mongot required; use $sort-only input pipelines.

      Setup:

      • Collection outer with documents {_id}.
      • Collection base with documents {_id, x, y}.
      • View baseView over base with a simple $match filter.

      Test cases:

      1. Unsharded: db.outer.aggregate([{$lookup: {from: "baseView", as: "ranked", pipeline: [{$rankFusion: ...}]}}]). Assert the command succeeds and each outer document has a ranked array containing the expected view documents.
      2. Sharded: Same pipeline using ShardingTest with 2 shards. Shard base by {_id: 1}. Place this variant in jstests/noPassthrough/query/ and add requires_sharding to the tags.
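
      The two assertions above can be sketched with a pair of small helpers. Both helper names are hypothetical, and the checker assumes _id values that survive JSON.stringify (numbers or strings); it is a sketch of one way to phrase the check, not an existing utility in jstests/libs.

```javascript
// Hypothetical helper: builds the outer pipeline with the caller-supplied
// $rankFusion stage nested in the $lookup subpipeline.
function buildLookupPipeline(rankFusionStage) {
    return [{$lookup: {from: "baseView", as: "ranked", pipeline: [rankFusionStage]}}];
}

// Hypothetical checker: returns the _ids of outer documents whose `ranked`
// array does not contain exactly the expected view _ids (order ignored).
function outerDocsWithUnexpectedRanked(results, expectedIds) {
    const want = JSON.stringify([...expectedIds].sort());
    return results
        .filter((doc) => {
            const got = Array.isArray(doc.ranked) ? doc.ranked.map((d) => d._id).sort() : null;
            return JSON.stringify(got) !== want;
        })
        .map((doc) => doc._id);
}
```

      An empty array from outerDocsWithUnexpectedRanked means every outer document carried the expected ranked set; the same helpers apply unchanged to the sharded variant.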

      Test 3 — Selection extension in $rankFusion input pipeline when running against a view

      File: jstests/extensions/rank_fusion_extension_on_view.js

      What to test: The combination of (a) a selection extension ($matchTopN) in a $rankFusion input pipeline and (b) the overall query running against a view namespace. This exercises the view-resolution + LP-desugaring code path introduced in SERVER-128285 end-to-end.

      Setup: Same collection shape as extension_in_hybrid_search.js (docs with {_id, x, y}). Create a view testView over the collection with [{$match: {x: {$gte: 0}}}].

      Test cases:

      1. $matchTopN in $rankFusion against a view — allowed: Run $rankFusion with pipeline a: [{$matchTopN: {filter: {x: {$gt: 2}}, sort: {x: -1}, limit: 3}}] against testView. Assert it succeeds and returns the same result as replacing $matchTopN with its manual expansion [{$match}, {$sort}, {$limit}].
      2. Transforming extension ($addFieldsMatch) in $rankFusion against a view — rejected: Assert the command fails with code 12108704 and the error message names $addFieldsMatch.
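
      The manual expansion in test case 1 can be generated rather than hand-written, which keeps the two pipelines in sync. A minimal sketch, assuming $matchTopN takes exactly the {filter, sort, limit} arguments shown above; expandMatchTopN and expandPipeline are hypothetical helper names.

```javascript
// Hypothetical helper: the manual expansion of $matchTopN described above,
// i.e. $match, then $sort, then $limit.
function expandMatchTopN({filter, sort, limit}) {
    return [{$match: filter}, {$sort: sort}, {$limit: limit}];
}

// Hypothetical helper: replaces every {$matchTopN: spec} stage in a pipeline
// with its manual expansion and leaves all other stages untouched.
function expandPipeline(pipeline) {
    return pipeline.flatMap((stage) =>
        Object.prototype.hasOwnProperty.call(stage, "$matchTopN")
            ? expandMatchTopN(stage.$matchTopN)
            : [stage]);
}

const withExtension = [{$matchTopN: {filter: {x: {$gt: 2}}, sort: {x: -1}, limit: 3}}];
const manualExpansion = expandPipeline(withExtension);
// manualExpansion is [{$match: {x: {$gt: 2}}}, {$sort: {x: -1}}, {$limit: 3}]
```

      The test would then run $rankFusion once with withExtension and once with manualExpansion as pipeline a, and compare the two result sets.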

      Test 4 — Selection extension inside $rankFusion that is itself inside a $unionWith

      File: jstests/extensions/rank_fusion_with_extension_in_unionwith.js

      What to test: The full composition: $matchTopN in a $rankFusion input pipeline, where that $rankFusion lives inside a $unionWith targeting a view. This is the combination that hybrid_search_in_unionwith_on_sharded_view.js (no extensions) and extension_in_hybrid_search.js (no $unionWith) each cover half of.

      Setup:

      • Collection outer with one document.
      • Collection base with {_id, x, y} documents.
      • View baseView over base with a $match filter.

      Test cases:

      1. Unsharded: db.outer.aggregate([{$unionWith: {coll: "baseView", pipeline: [{$rankFusion: {input: {pipelines: {a: [{$matchTopN...}], b: [{$sort...}]}}}}]}}]). Assert result count = 1 (outer doc) + however many the $rankFusion returns from the view. Assert the $unionWith results match running the same $rankFusion directly against baseView.
      2. Sharded: Same pipeline with ShardingTest (2 shards, base sharded by {_id: 1}). Place in jstests/noPassthrough/query/rank_fusion_with_extension_in_unionwith_sharded.js.
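
      One way to phrase both assertions without depending on the order in which $unionWith emits documents is an order-insensitive comparison against the outer documents plus the direct $rankFusion results. This is a sketch under assumptions: both helper names are hypothetical, and comparing via JSON.stringify assumes the server returns fields in the same order for both queries.

```javascript
// Hypothetical helper: wraps a caller-supplied $rankFusion stage in $unionWith.
function buildUnionWithPipeline(rankFusionStage) {
    return [{$unionWith: {coll: "baseView", pipeline: [rankFusionStage]}}];
}

// Hypothetical helper: true when both arrays hold the same documents,
// ignoring order. Each document is serialized and the sorted lists compared.
function sameDocsIgnoringOrder(actual, expected) {
    const canon = (docs) => docs.map((d) => JSON.stringify(d)).sort();
    const a = canon(actual);
    const e = canon(expected);
    return a.length === e.length && a.every((s, i) => s === e[i]);
}
```

      With outerDocs holding the single outer document and directResults holding the output of the same $rankFusion run directly against baseView, the check is sameDocsIgnoringOrder(unionResults, outerDocs.concat(directResults)), which also implies the expected result count.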

      Test 5 — $nativeVectorSearch rejection and $addFieldsMatch rejection in a sharded cluster

      File: jstests/noPassthrough/query/rank_fusion_extension_rejection_sharded.js

      What to test: The rejection tests in extension_in_hybrid_search.js are unsharded. Validate that LP-time validation also fires correctly on a sharded cluster (router-side, not per-shard).

      Setup: ShardingTest with 2 shards. Shard coll by {_id: 1}.

      Test cases:

      1. $addFieldsMatch (transforming extension) in a $rankFusion input pipeline → assert fails with code 12108704 and error message names $addFieldsMatch.
      2. $nativeVectorSearch in a $rankFusion input pipeline → assert fails with code 12108704 and error message names $nativeVectorSearch.

      Assert that both rejections occur at parse/LP time (error should not reference a shard name).
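
      The three conditions (error code, stage name in the message, no shard name in the message) can be bundled into one checker. A minimal sketch, assuming the usual {ok: 0, code, errmsg} command error shape; rejectionProblems is a hypothetical helper, and the shard names would come from the test fixture.

```javascript
// Error code named in the test cases above.
const kRejectionCode = 12108704;

// Hypothetical helper: returns a list of problems with a failed command
// response. An empty list means the rejection looks as expected.
function rejectionProblems(res, stageName, shardNames) {
    const problems = [];
    if (res.ok !== 0) {
        problems.push("command unexpectedly succeeded");
    }
    if (res.code !== kRejectionCode) {
        problems.push(`unexpected code ${res.code}`);
    }
    const errmsg = res.errmsg || "";
    if (!errmsg.includes(stageName)) {
        problems.push(`errmsg does not name ${stageName}`);
    }
    // A shard name in the message would suggest the error came from a shard
    // rather than from router-side validation.
    for (const shard of shardNames) {
        if (errmsg.includes(shard)) {
            problems.push(`errmsg references shard ${shard}`);
        }
    }
    return problems;
}
```

      Each test case would run the aggregate through runCommand, pass the raw response to rejectionProblems, and assert that the returned list is empty.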

      Additional tags: requires_sharding


      Test 6 — Both $rankFusion input pipelines contain selection extensions simultaneously

      File: Add a new it block to jstests/extensions/extension_in_hybrid_search.js inside the existing describe.

      What to test: All input pipelines, not just pipeline "a", contain $matchTopN. Validates that the all_of check across all input pipelines is correct and does not short-circuit after the first pipeline.

      Test case:

      coll.aggregate([{ $rankFusion: {
          input: { pipelines: {
              a: [{ $matchTopN: { filter: {x: {$gt: 2}}, sort: {x: -1}, limit: 3 } }],
              b: [{ $matchTopN: { filter: {y: {$gt: 20}}, sort: {y: -1}, limit: 3 } }],
          } }
      } }])
      

      Assert it succeeds and returns a non-empty result. Optionally assert the result matches the manual expansion of both $matchTopN stages.


      Test 7 — $rankFusion on a multi-level view chain (view-on-view)

      File: jstests/extensions/rank_fusion_on_nested_view.js

      What to test: $rankFusion run against a view whose viewOn is itself another view (2-level chain). Exercises the recursive view-resolution code path.

      Setup:

      db.createView("level1View", collName, [{ $match: { x: { $gte: 1 } } }]);
      db.createView("level2View", "level1View", [{ $addFields: { fromLevel2: true } }]);
      

      Test case: Run $rankFusion (with $sort-based pipelines) against level2View and assert:

      • The command succeeds.
      • All returned documents have fromLevel2: true.
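      • Result count matches the number of documents in coll that pass through both view pipelines. Only level1View filters (x >= 1); level2View adds a field and removes nothing.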
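
      The expected output of the two-level chain can be modelled in plain JS by applying the two view pipelines from the setup in order. A minimal sketch; expectedLevel2ViewDocs and the sample documents are illustrative assumptions.

```javascript
// Hypothetical helper: applies level1View ({$match: {x: {$gte: 1}}}) and then
// level2View ({$addFields: {fromLevel2: true}}) to in-memory documents.
function expectedLevel2ViewDocs(docs) {
    return docs
        .filter((d) => d.x >= 1)
        .map((d) => ({...d, fromLevel2: true}));
}

// Illustrative sample data (assumed for this sketch).
const nestedViewDocs = [
    {_id: 0, x: 0, y: 3},
    {_id: 1, x: 1, y: 2},
    {_id: 2, x: 5, y: 1},
];

const expectedNested = expectedLevel2ViewDocs(nestedViewDocs);
// expectedNested has 2 documents (_id 1 and 2), each with fromLevel2: true.
```

      The test can compare the result count with expectedNested.length and check that every returned document has fromLevel2 set to true.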

      Test 8 — Explain output for $rankFusion with extension stages

      File: jstests/extensions/rank_fusion_extension_explain.js

      What to test: explain() on a pipeline containing $rankFusion with $matchTopN in an input pipeline. Ensures extension stages serialize correctly in explain output and that desugared internals are not unexpectedly exposed.

      Test cases:

      1. coll.explain().aggregate([{$rankFusion: {input: {pipelines: {a: [$matchTopNStage], b: [{$sort: {y: 1}}]}}}}]). Assert explain() succeeds and the output contains a stages or queryPlanner field. Assert no internal stage names (e.g. $_internal*) leak into the top-level serialized form.
      2. Same but run against a view namespace (db.testView.explain().aggregate(...)). Assert the explain succeeds and the output reflects the view's underlying collection.
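
      The two explain assertions in test case 1 can be sketched as plain functions over the explain document. This assumes the unsharded explain shape in which pipeline stages appear in a top-level stages array; both helper names are hypothetical, and the exact explain layout should be confirmed against real output before relying on it.

```javascript
// Hypothetical helper: true when the explain output has one of the two
// fields the test case asks for.
function hasPlanField(explainOutput) {
    return "stages" in explainOutput || "queryPlanner" in explainOutput;
}

// Hypothetical helper: returns the names of top-level stages that look
// internal, i.e. whose stage name starts with "$_internal".
function topLevelInternalStageNames(explainOutput) {
    const stages = Array.isArray(explainOutput.stages) ? explainOutput.stages : [];
    return stages
        .flatMap((stage) => Object.keys(stage))
        .filter((name) => name.startsWith("$_internal"));
}
```

      The test would assert hasPlanField(explain) and that topLevelInternalStageNames(explain) is empty. Only top-level stage names are inspected, matching the wording of the test case; nested subpipelines are deliberately not scanned.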

            Assignee:
            Finley Lau
            Reporter:
            Mariano Shaar