Python Integrations / INTPYTHON-542

[LangChain] Support separate weights for the two retrievers in the hybrid retriever

    • Python Drivers

      Context

      User Shanorino opened this issue in https://github.com/langchain-ai/langchain-mongodb/issues/84.

      "In MongoDBAtlasHybridSearchRetriever, currently it's not possible to add weights separately to 2 retrievers (in other words, it's always 50-50).

      A workaround is to change vector_penalty and fulltext_penalty. However, tuning the RRF parameter k changes the shape of the reciprocal rank function, while multiplying by a weight is a simple linear scaling of the entire curve. These two approaches are not mathematically equivalent and lead to different effects on how each rank contributes to the final score.

      One thing to mention is that EnsembleRetriever in LangChain also implements weights (instead of changing the penalty constant k).

      Thus, I'd propose implementing weights to the class MongoDBAtlasHybridSearchRetriever by adding 2 more stages to the Mongo pipeline, so that it can be instantiated this way:

      retriever = MongoDBAtlasHybridSearchRetriever(
          vectorstore=vector_store,
          search_index_name="search_index",
          top_k=5,
          fulltext_weight=0.3,
          vector_weight=0.7,
          fulltext_penalty=50,
          vector_penalty=50,
      )
      

      "
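The distinction the reporter draws between a penalty and a weight can be illustrated with a small, self-contained sketch. It assumes 1-based ranks and the common reciprocal rank fusion form `weight / (penalty + rank)`; the exact formula used inside `MongoDBAtlasHybridSearchRetriever` may differ, and the helper names `rrf_score` and `fuse` are hypothetical, introduced only for this illustration.

```python
from typing import Dict, List, Sequence, Tuple


def rrf_score(rank: int, penalty: float = 50.0, weight: float = 1.0) -> float:
    """Weighted reciprocal-rank score for a 1-based rank (assumed form)."""
    return weight / (penalty + rank)


def fuse(
    vector_ids: Sequence[str],
    fulltext_ids: Sequence[str],
    vector_weight: float = 0.5,
    fulltext_weight: float = 0.5,
    vector_penalty: float = 50.0,
    fulltext_penalty: float = 50.0,
) -> List[Tuple[str, float]]:
    """Sum the weighted scores of both ranked lists and sort descending."""
    scores: Dict[str, float] = {}
    for rank, doc_id in enumerate(vector_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + rrf_score(
            rank, vector_penalty, vector_weight
        )
    for rank, doc_id in enumerate(fulltext_ids, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + rrf_score(
            rank, fulltext_penalty, fulltext_weight
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A weight leaves the ratio between two ranks unchanged (linear scaling) ...
ratio_unweighted = rrf_score(1, 50.0, 1.0) / rrf_score(10, 50.0, 1.0)
ratio_weighted = rrf_score(1, 50.0, 0.3) / rrf_score(10, 50.0, 0.3)

# ... whereas a different penalty changes that ratio (the curve's shape).
ratio_small_penalty = rrf_score(1, 10.0, 1.0) / rrf_score(10, 10.0, 1.0)

# The two retrievers disagree on order; the weights decide which one wins.
vector_favoured = fuse(["a", "b", "c"], ["c", "b", "a"], 0.7, 0.3)
fulltext_favoured = fuse(["a", "b", "c"], ["c", "b", "a"], 0.3, 0.7)
```

Under these assumptions, scaling by a weight multiplies every rank's contribution by the same factor, while changing the penalty alters how quickly contributions fall off with rank, which is why the two knobs are not interchangeable.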

      Definition of done

      Expose API parameters for setting separate weights on the vector and full-text retrievers in MongoDBAtlasHybridSearchRetriever.
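One possible shape for the extra pipeline stages mentioned in the proposal is an `$addFields` stage per retriever that multiplies the already computed reciprocal-rank score by its weight. The sketch below is only an assumption about how this could look: `weight_stage` is a hypothetical helper, and the field names `vector_score` and `fulltext_score` are assumed rather than confirmed by this ticket.

```python
from typing import Any, Dict


def weight_stage(score_field: str, weight: float) -> Dict[str, Any]:
    """Build an $addFields stage that scales an existing score field.

    Hypothetical helper: the field name is supplied by the caller and is
    assumed to already hold the reciprocal-rank score for one retriever.
    """
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return {
        "$addFields": {
            # Overwrite the field with its own value multiplied by the weight.
            score_field: {"$multiply": [f"${score_field}", weight]}
        }
    }


# Assumed field names, one stage per sub-pipeline.
vector_stage = weight_stage("vector_score", 0.7)
fulltext_stage = weight_stage("fulltext_score", 0.3)
```

Each stage would be appended to the corresponding sub-pipeline after the rank score is computed and before the two result sets are combined, so the penalties and the weights remain independent parameters.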

      Pitfalls

      None

            Assignee: Unassigned
            Reporter: Steve Silvester (steve.silvester@mongodb.com)
            Votes: 0
            Watchers: 2
