Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Query Integration
Confidence Status: None
Work Order: 3
Size Category: TBD
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None
Estimated Weeks: 0

There are two categories of cases where a $scoreFusion input pipeline could have multiple stages that all provide their own scoreDetails:

1. The pipeline starts with a stage that provides scoreDetails ($search / $vectorSearch) and is followed by one or more $score stages.
2. The pipeline has multiple $score stages (2+) and does not start with a stage that provides scoreDetails.

We could consider banning case 2 (arguably there should only ever need to be a single $score stage per input pipeline), but case 1 seems valid. For example, a user may want to run a $search or $vectorSearch in a sub-pipeline, but then modify the order of that input pipeline's results with their own custom expression.
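To make the two cases concrete for test planning, here is a minimal sketch of a classifier over an input pipeline. It is a hypothetical helper, not server code: the function names, the returned labels, and the choice to look only at stage names are all assumptions made for illustration, and stage bodies are left empty because only the names matter here.

```javascript
// Hypothetical helper, not part of the server: classifies an input pipeline
// into case 1 or case 2 from the ticket, based on stage names only.
function stageName(stage) {
  // A stage is modeled as a single-key object, e.g. {$score: {}}.
  return Object.keys(stage)[0];
}

function classifyInputPipeline(pipeline) {
  const names = pipeline.map(stageName);
  const scoreCount = names.filter((n) => n === "$score").length;
  const startsWithSearch =
    names[0] === "$search" || names[0] === "$vectorSearch";

  // Case 1: $search / $vectorSearch first, then one or more $score stages.
  if (startsWithSearch && scoreCount >= 1) return "case1";
  // Case 2: two or more $score stages, not starting with a search stage.
  if (!startsWithSearch && scoreCount >= 2) return "case2";
  // Anything else has at most one stage that provides scoreDetails.
  return "at-most-one-provider";
}

// Stage bodies are intentionally empty placeholders.
const case1Pipeline = [{ $search: {} }, { $score: {} }];
const case2Pipeline = [{ $match: {} }, { $score: {} }, { $score: {} }];
console.log(classifyInputPipeline(case1Pipeline));
console.log(classifyInputPipeline(case2Pipeline));
```

If we decide to ban case 2, a check shaped like the second branch is roughly what the validation would need to detect.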

Regardless, we need to define and implement/validate the behavior of these cases where more than one stage provides scoreDetails in the same input pipeline, and $scoreFusion is requested to produce scoreDetails.

There are a couple of approaches here:

1. Only the last stage that provides scoreDetails has its details propagated to $scoreFusion.
   a. This is likely the current behavior, because we read the input pipeline's scoreDetails metadata, which should be overwritten by the last stage. This may also be our only option without further changes outside of $scoreFusion.
2. Somehow try to incorporate all stages' scoreDetails for the input pipeline (likely in an array).

Unless (2) is easy to implement, we decided that we're content with (1), at least in the initial release. There may be no code changes needed to get the behavior of approach (1); we just want to make sure we understand exactly what the behavior is, and have test cases committed.
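The difference between the two approaches can be sketched with a toy model in which each provider stage writes a scoreDetails value into per-document metadata. This is not how the server is implemented; `runProviders`, the strategy names, and the shape of the details object are all hypothetical and exist only to show what a test would observe under each approach.

```javascript
// Toy model, not server code: each provider stage writes its scoreDetails
// into a single metadata slot as the input pipeline runs.
function runProviders(providerStages, strategy) {
  const meta = {};
  for (const stage of providerStages) {
    const details = { source: stage.name, value: stage.value };
    if (strategy === "lastWins") {
      // Approach (1): a later stage overwrites the earlier stage's details.
      meta.scoreDetails = details;
    } else if (strategy === "collectAll") {
      // Approach (2): keep every stage's details, in pipeline order.
      meta.scoreDetails = [...(meta.scoreDetails || []), details];
    } else {
      throw new Error(`unknown strategy: ${strategy}`);
    }
  }
  return meta.scoreDetails;
}

// Made-up values for a case 1 pipeline: $search followed by $score.
const providerStages = [
  { name: "$search", value: 0.8 },
  { name: "$score", value: 2.5 },
];
const lastWinsDetails = runProviders(providerStages, "lastWins");
const collectAllDetails = runProviders(providerStages, "collectAll");
console.log(lastWinsDetails.source);
console.log(collectAllDetails.map((d) => d.source));
```

Under approach (1), a test for case 1 would expect only the $score stage's details to reach $scoreFusion; under approach (2), it would expect both entries in order.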

The same cases should be considered for $rankFusion.

Testing:

Regardless of the behavior and approach we take, we should have tests (either asserting on query results, or asserting the query uasserts) for both case (1) and (2) defined at the top of the ticket. So include tests for queries like:

$rank/scoreFusion: {
  pipelines: {
    p1: [$search/$vectorSearch, ..., $score, ...],
    p2: ...
  }
}

$rank/scoreFusion: {
  pipelines: {
    p1: [$search, ..., $score, ..., $score, ...],
    p2: ...
  }
}

$rank/scoreFusion: {
  pipelines: {
    p1: [..., $score, ..., $score, ...],
    p2: ...
  },
  normalization: "none"
}

depends on

SERVER-106426 scoreDetails' field 'rawScore' is incorrect when $score references {$meta: 'score'}

Closed

is depended on by

SERVER-82020 Enable featureFlagSearchHybridScoring by default

Open

is related to

SERVER-104739 Add $scoreFusion/$rankFusion test(s) that have both a $search/$vectorSearch and a $score in the same input pipeline

Closed

Assignee: Mariano Shaar
Reporter: Joe Shalabi
Participants: Joe Shalabi, Mariano Shaar
Votes: 0
Watchers: 2

Created: May 28 2025 11:37:46 PM UTC
Updated: Jun 26 2025 05:03:10 PM UTC
Confidence Status Last Update: 26/Jun/25 5:03 PM
