Python Integrations / INTPYTHON-264

MongoDB LangChain & LlamaIndex QoL Improvements

    • Type: Epic
    • Resolution: Done
    • Priority: Unknown
    • None
    • Affects Version/s: None
    • Component/s: AI/ML
    • None
    • Python Drivers
      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?
    • Done
    • MongoDB Langchain & LlamaIndex Integration

      Engineer(s): Jib Adegunloye, Noah Stapp, Casey Clements


      2024-05-15 Final Project Update


      • What was accomplished since the last update?
        • All pull requests have been merged and the langchain-mongodb version has been bumped to reflect these new changes.
      • Is there any follow-up work that we'll need to address in the future?
        • Nothing is scoped as a result of this work. We will continue to foster our relationship with these libraries, which will generate new work.
      • What, if anything, did we de-scope from the project?
        • We de-scoped additional performance improvements because that is an ongoing effort that also requires non-blocking performance tests. The performance improvements scoped here empirically improve performance for large datasets.


      2024-05-13 Summary: All dev is complete. Two pull requests to LangChain are imminent.
      • What was completed over the last two weeks?
        • Llama-Index integration testing was merged by the Maintainer.
          • The integrations that we run in CI now point to upstream:main instead of Casey's feature branch.
        • Intermittent failures in Integration tests no longer show up for tests that point to Atlas Cloud.
        • DocArray integration was merged (1 of 2)
      • What's the focus over the next two weeks?
        • LangChain: Merge the remaining two PRs


      2024-04-26 Summary: AI/ML Test Pipeline improvements and merged ChatGPT-Retrieval!

      • What was completed over the last two weeks?
        • ChatGPT-Retrieval code has been merged successfully
        • The testing pipeline has been expanded
          • Llama-Index testing introduced
          • Now supports multiple index creations
        • Identified a bug in the Atlas CLI that would cause intermittent failures on evergreen runs
          • Reached out and provided a mitigation by using podman directly
          • Task filed to track the work: PYTHON-4391
      • What's the focus over the next two weeks?
        • Continue to make changes to Llama-Index and provide a notebook
        • Introduce the experimental code to allow developers to create vector search indexes via pymongo code: https://github.com/langchain-ai/langchain/pull/19359
        • Performance enhancement through a batch size increase in the LangChain code.
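
The vector search index item above can be illustrated with a minimal sketch. It only builds the index definition as a plain dictionary, following the documented Atlas Vector Search definition format (`fields` entries of type `vector` and `filter`). The helper name `vector_index_definition`, the field path `embedding`, the dimension count 1536, and the filter path `source` are hypothetical values chosen for illustration; none of them come from the linked pull request.

```python
import json


def vector_index_definition(path, num_dimensions, similarity="cosine", filters=()):
    """Build an Atlas Vector Search index definition as a plain dict.

    Hypothetical helper for illustration; not taken from langchain-mongodb.
    """
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported similarity: {similarity}")
    fields = [
        {
            "type": "vector",
            "path": path,
            "numDimensions": num_dimensions,
            "similarity": similarity,
        }
    ]
    # Optional pre-filter fields are declared with type "filter".
    fields.extend({"type": "filter", "path": f} for f in filters)
    return {"fields": fields}


# Assumed example values: an "embedding" field with 1536 dimensions.
definition = vector_index_definition("embedding", 1536, filters=["source"])
print(json.dumps(definition, indent=2))
```

In recent PyMongo releases a definition like this is typically wrapped in `SearchIndexModel(definition=..., name=..., type="vectorSearch")` and passed to `Collection.create_search_index`. That step needs a live deployment and is left out of the sketch.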


      2024-04-12 Summary: AI/ML Pipeline Testing: additions and documentation

      • What was completed over the last two weeks?
        • DBX Tech Talk!
          • Specifically addresses Quality of Life. Fair to say that was the main theme. 
          • Also helps pave the way for other drivers to onboard their integrations.
        • Intermittent failures have been fixed for the following, each of which had a separate cause:
          • chatgpt-retrieval-plugin
          • llama-index
          • semantic-kernel-python
      • What's the focus over the next two weeks?
        • Rationalize epics so that we are not tracking everything here.
      • Impediments encountered over the last two weeks
        • We have a failing dotnet/csharp driver: the task test-semantic-kernel-csharp in 'ai-ml-pipeline-testing' has failed. We should open a ticket and speak to Boris.
        • Still waiting on upstream maintainers.


      2024-03-29 Summary: AI/ML Pipeline Testing: additions and documentation

      • What was completed over the last two weeks?
        • Added LlamaIndex to AI/ML Pipeline Testing
        • Added Documentation of Git Patch Files technique
        • Set Casey up to use Atlas Local deployments
        • Investigated intermittent failures in CI
      • What's the focus over the next two weeks?
        • DBX Tech Talk on AI-ML Integrations
        • Jib on vacation
        • Create MongoDB Llama Pack template
      • Impediments encountered over the last two weeks
        • No response from the chatgpt-retrieval-plugin maintainer for one month.

      2024-03-13 Summary: Continuing work on LangChain and LlamaIndex implementations


      • What was completed over the last two weeks?
        • All LangChain-MongoDB packages successfully included
        • Added example guides of library usage for all LangChain-MongoDB packages
      • What's the focus over the next two weeks?
        • Adding LlamaIndex to the AI/ML Testing Pipeline
        • Updating AI/ML Testing Pipeline Documentation
      • Impediments encountered over the last two weeks
        • Reviewer timelines
        • Local Atlas in evergreen doesn't support vectorSearch type. Needs to be updated.

      Engineer(s): Jib Adegunloye, Noah Stapp, Casey Clements
      Summary: Follow-up improvements to the LangChain and LlamaIndex Python library integrations.
      2024-03-01: Target date set to to account for the review feedback loop
      • What was completed over the last two weeks?
          • Iteration on this project starts this week
      • What's the focus over the next two weeks?
          • Integrating the LangChain LLM Caching Layer
          • Any tasks related to the LangChain library
          • Adding LlamaIndex to the AI/ML Pipeline
      • Impediments encountered over the last two weeks
          • N/A
      • Open Dependencies
          • Maintainers of the LangChain & LlamaIndex code may push back timelines. 
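
As a rough illustration of what an LLM caching layer does, the sketch below uses an in-memory dictionary as a stand-in for a MongoDB collection. The `lookup` / `update` / `clear` method names follow the shape of LangChain's cache interface. The class `DictLLMCache`, the wrapper `cached_generate`, and the fake model function are hypothetical and are not the code from the MongoDB LLM Cache pull request.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DictLLMCache:
    """In-memory stand-in for an LLM cache keyed on (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], List[str]] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        """Return the cached generations, or None on a cache miss."""
        key = (prompt, llm_string)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        return None

    def update(self, prompt: str, llm_string: str, return_val: List[str]) -> None:
        """Store generations for this prompt and model configuration."""
        self._store[(prompt, llm_string)] = list(return_val)

    def clear(self) -> None:
        """Drop every cached entry."""
        self._store.clear()


def cached_generate(
    cache: DictLLMCache,
    prompt: str,
    llm_string: str,
    generate: Callable[[str], List[str]],
) -> List[str]:
    """Call the model only when the cache has no entry for this request."""
    cached = cache.lookup(prompt, llm_string)
    if cached is not None:
        return cached
    result = generate(prompt)
    cache.update(prompt, llm_string, result)
    return result


# Hypothetical fake model that records how often it is actually called.
calls: List[str] = []


def fake_llm(prompt: str) -> List[str]:
    calls.append(prompt)
    return [prompt.upper()]


cache = DictLLMCache()
first = cached_generate(cache, "hello", "model=demo", fake_llm)
second = cached_generate(cache, "hello", "model=demo", fake_llm)
print(first, second, len(calls))
```

The second call is served from the cache, so the fake model runs only once. A database-backed cache would persist the same key and value pair in a collection instead of a dictionary.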

      Engineer(s): Jib Adegunloye, Noah Stapp, Casey Clements
      Summary: Continuing work on LangChain and LlamaIndex implementations
      2024-02-24: Target date set to 
      • What was completed over the last two weeks?
          • MongoDB is officially a LangChain partner package

          • Getting review on MongoDB LLM Cache
      • What's the focus over the next two weeks?
          • Updating the test suite run for LangChain
          • Addressing review comments
          • Adding LlamaIndex to the AI/ML Pipeline
      • Impediments encountered over the last two weeks
          • N/A
      • Open Dependencies
          • Maintainers of the LangChain & LlamaIndex code may push back timelines. 

      Engineer(s): Jib Adegunloye, Noah Stapp, Casey Clements
      Summary: Continuing work on LangChain and LlamaIndex implementations
      2024-03-13: Target date set to 
      • What was completed over the last two weeks?
          • Merged MongoDB LangChain partner package

          • Uploaded PR to LangChain/MongoDB LLM Cache
      • What's the focus over the next two weeks?
          • Updating the test suite of the MongoDB Cache Layer
          • Addressing review comments
          • Adding LlamaIndex to the AI/ML Pipeline
      • Impediments encountered over the last two weeks
          • Reviewer timelines
      • Open Dependencies
          • Maintainers of the LangChain & LlamaIndex code may push back timelines. 





      Tasks for the ongoing work of improving MongoDB integration in LangChain & LlamaIndex (Generative AI developer framework)


      Who is the affected end user?

      Who are the stakeholders?

      How does this affect the end user?

      Are they blocked? Are they annoyed? Are they confused?

      How likely is it that this problem or use case will occur?

      Main path? Edge case?

      If the problem does occur, what are the consequences and how severe are they?

      Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?

      Is this issue urgent?

      Does this ticket have a required timeline? What is it?

      Is this ticket required by a downstream team?

      Needed by e.g. Atlas, Shell, Compass?

      Is this ticket only for tests?

      Does this ticket have any functional impact, or is it just test improvements?

      Cast of Characters

      Engineering Lead:
      Document Author:
      Product Owner:
      Program Manager:

      Channels & Docs

      Slack Channel

      [Scope Document|some.url]

      [Technical Design Document|some.url]



            jib.adegunloye@mongodb.com Jib Adegunloye
            prakul.agarwal@mongodb.com Prakul Agarwal
