- Type: Task
- Resolution: Unresolved
- Priority: Unknown
- Affects Version/s: None
- None
- Python Drivers
Context
User Shanorino opened this issue in https://github.com/langchain-ai/langchain-mongodb/issues/84.
"In MongoDBAtlasHybridSearchRetriever, currently it's not possible to add weights separately to 2 retrievers (in other words, it's always 50-50).
A workaround is to change vector_penalty and fulltext_penalty. However, tuning the RRF parameter k changes the shape of the reciprocal rank function, while multiplying by a weight is a simple linear scaling of the entire curve. These two approaches are not mathematically equivalent and lead to different effects on how each rank contributes to the final score.
One thing to mention is that EnsembleRetriever in Langchain also implemented the weights (instead of changing the penalty constant k).
Thus, I'd propose implementing weights to the class MongoDBAtlasHybridSearchRetriever by adding 2 more stages to the Mongo pipeline, so that it can be instantiated this way:
retriever = MongoDBAtlasHybridSearchRetriever(
    vectorstore=vector_store,
    search_index_name="search_index",
    top_k=5,
    fulltext_weight=0.3,
    vector_weight=0.7,
    fulltext_penalty=50,
    vector_penalty=50,
)
"
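To illustrate the reporter's point that a weight and a penalty are not interchangeable, here is a minimal sketch. It assumes the common RRF form score = weight / (penalty + rank) with a 1-based rank; the function names are hypothetical and the exact formula used by the retriever may differ.

```python
def rrf_score(rank: int, penalty: float, weight: float = 1.0) -> float:
    """Assumed RRF form: weight / (penalty + rank), with rank starting at 1."""
    return weight / (penalty + rank)


def top_to_tenth_ratio(penalty: float, weight: float = 1.0) -> float:
    """How much more the rank-1 document scores than the rank-10 document."""
    return rrf_score(1, penalty, weight) / rrf_score(10, penalty, weight)


# A weight scales every rank by the same factor, so the ratio is unchanged.
baseline = top_to_tenth_ratio(penalty=50)
weighted = top_to_tenth_ratio(penalty=50, weight=0.3)

# A larger penalty flattens the curve, so the ratio between ranks shrinks.
repenalized = top_to_tenth_ratio(penalty=100)
```

Under these assumptions, `baseline` and `weighted` agree, while `repenalized` is smaller: the penalty changes how ranks relate to each other within one retriever, whereas the weight only changes how much that retriever counts against the other.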
Definition of done
Expose an API on MongoDBAtlasHybridSearchRetriever for setting separate weights on the vector and full-text results.
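One possible shape for the extra pipeline stages mentioned in the issue is sketched below. The helper name, the `rank` field, and the stage layout are assumptions made for illustration and are not taken from the langchain-mongodb source; only the aggregation operators ($addFields, $divide, $add, $multiply) are standard MongoDB.

```python
from typing import Any, Dict, List


def weighted_reciprocal_rank_stages(
    score_field: str, penalty: float, weight: float
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build stages that compute a weighted RRF score.

    Assumes an earlier stage has added a 0-based `rank` field to each document.
    """
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return [
        # Reciprocal rank: 1 / (rank + penalty + 1)
        {
            "$addFields": {
                score_field: {"$divide": [1.0, {"$add": ["$rank", penalty, 1]}]}
            }
        },
        # Linear scaling of the whole curve by the retriever's weight
        {"$addFields": {score_field: {"$multiply": [f"${score_field}", weight]}}},
    ]


stages = weighted_reciprocal_rank_stages("vector_score", penalty=50, weight=0.7)
```

Keeping the weight in its own stage leaves the reciprocal rank computation untouched, so a default weight of 1.0 would preserve current behavior.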
Pitfalls
None