LangChain retriever migration

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: AI/ML, LangChain
    • LangChain retriever migration
    • Python Drivers
    • Not Needed
    • In Progress

      Summary

      This project aims to refactor our existing LangChain retrievers into framework-agnostic implementations, allowing us to better support a variety of AI/ML frameworks and user stories.

      A retriever is defined as "an interface that returns documents given an unstructured query" [LangChain docs]. Notably, retrievers only retrieve and return documents; they do not store them.
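
      To make the definition above concrete, here is a minimal sketch of what a framework-agnostic retriever interface could look like. It is illustrative only: `Document`, `Retriever`, `KeywordRetriever`, and the `retrieve(query, k)` signature are hypothetical names chosen for this example, not an existing MongoDB or LangChain API, and the keyword matching merely stands in for a real vector or hybrid search backend.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol, runtime_checkable


@dataclass
class Document:
    """Hypothetical framework-neutral document type (not LangChain's Document)."""
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class Retriever(Protocol):
    """Anything that returns documents for an unstructured query."""

    def retrieve(self, query: str, k: int = 4) -> list[Document]: ...


class KeywordRetriever:
    """Toy in-memory retriever; stands in for a vector or hybrid search backend."""

    def __init__(self, corpus: list[Document]) -> None:
        # The retriever only reads the corpus; it exposes no write methods.
        self._corpus = corpus

    def retrieve(self, query: str, k: int = 4) -> list[Document]:
        terms = set(query.lower().split())
        # Score each document by how many query terms it contains.
        scored = [
            (len(terms & set(doc.text.lower().split())), doc)
            for doc in self._corpus
        ]
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in hits[:k]]
```

      Because `Retriever` is a structural `Protocol`, any class with a matching `retrieve` method satisfies it without inheriting from a framework base class, which is the property this migration is after.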

      Motivation

      Over the course of 2024 and 2025, the Python team has focused its efforts on building cutting-edge retrieval tools such as Knowledge Graph, Parent Document, and our existing Vector/Hybrid Search offerings. Since LangChain appeared to be the only integration driving profit to MongoDB, we prioritized tickets that plugged these features into its library.

      Now that LangChain has clearly won the RAG framework race, it has moved on to the agent race, in which every AI framework and LLM provider is competing to build workflows and platforms that let users create and manage their own AI agents. Many of these agentic applications use RAG (retrieval-augmented generation), which in turn relies on retriever patterns just like the ones we've created for LangChain. Unfortunately, our retriever integrations weren't designed to be independent of LangChain, and for many other frameworks and competitors, bringing LangChain dependencies into their offerings is a non-starter.

      To support a broader range of AI/ML frameworks, user workflows, and potential independent AI offerings of our own, we need to decouple our retrievers from LangChain.
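
      One way this decoupling could be structured is a framework-free core plus a thin per-framework adapter. The sketch below is an assumption about the shape of such a design, not the design itself: `CoreDocument`, `CoreRetriever`, `FrameworkDocument`, and `FrameworkRetrieverAdapter` are hypothetical names, and `FrameworkDocument` is a local stand-in that only mirrors the `page_content`/`metadata` fields of LangChain's document class so the example runs without LangChain installed. In a real LangChain integration the adapter would more likely subclass LangChain's `BaseRetriever`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class CoreDocument:
    """Hypothetical framework-neutral result type owned by the core package."""
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


class CoreRetriever:
    """Hypothetical core retriever with no framework imports."""

    def __init__(self, search: Callable[[str, int], list[CoreDocument]]) -> None:
        # `search` is any callable that runs the actual query,
        # e.g. a function wrapping a vector search.
        self._search = search

    def retrieve(self, query: str, k: int = 4) -> list[CoreDocument]:
        return self._search(query, k)


@dataclass
class FrameworkDocument:
    """Stand-in for a framework's own document class (assumed shape)."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class FrameworkRetrieverAdapter:
    """Thin adapter: keeps the framework-facing call shape, delegates to the core."""

    def __init__(self, core: CoreRetriever, k: int = 4) -> None:
        self._core = core
        self._k = k

    def invoke(self, query: str) -> list[FrameworkDocument]:
        # Translate core results into the framework's document type.
        return [
            FrameworkDocument(page_content=d.text, metadata=dict(d.metadata))
            for d in self._core.retrieve(query, self._k)
        ]
```

      With this split, the retrieval logic lives in one place, each framework integration shrinks to a translation layer, and existing framework-facing entry points can keep their current call shape.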

      How does this affect the end user?

      Are they blocked? Are they annoyed? Are they confused?
      It must not affect end users. Existing LangChain workflows must continue to function unchanged.

      How likely is it that this problem or use case will occur?

      Main path? Edge case?
      N/A

      If the problem does occur, what are the consequences and how severe are they?

      Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?
      N/A

      Is this issue urgent?

      Does this ticket have a required timeline? What is it?
      We want to migrate our retrievers out of LangChain sooner rather than later to avoid another 1.0 upgrade situation.

      Is this ticket required by a downstream team?

      Needed by e.g. Atlas, Shell, Compass?
      No.

      Is this ticket only for tests?

      Does this ticket have any functional impact, or is it just test improvements?
      No.

      Cast of Characters

      Engineering Lead: jib.adegunloye@mongodb.com
      Document Author: noah.stapp@mongodb.com
      POCers:
      Product Owner:
      Program Manager:
      Stakeholders:

      Channels & Docs

      Slack Channel

      Scope Document

      [Technical Design Document|some.url]

            Assignee:
            Noah Stapp
            Reporter:
            Noah Stapp