[Feature] LangChainJS: add Auto Embedding support


    • NODE-7340 [LangChainJS] add Auto Embedding support
    • 3
    • 2
    • Not Needed
    • None
    • Not Needed

      Target Date: April 15th 2026 

      April 15th is the desired "soft" date to align with the Public Preview launch of Auto Embedding. Not a blocker.

      Context

      MongoDB Atlas is rolling out a new Auto Embedding Index feature that automates vector generation for text fields, eliminating the need for external embedding pipelines. See NODE-7245.

      LangChain.js is a modular SDK that abstracts the complexity of LLM orchestration. It integrates with MongoDB as a Vector Store, letting you run semantic searches inside your existing Atlas clusters.

      We want LangChainJS to support the new Auto Embedding feature to provide developers with an even smoother vector search experience.

      Sample Usage

      Current user flow:

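      import { MongoClient } from "mongodb";
      import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
      import { VoyageEmbeddings } from "@langchain/community/embeddings/voyage";
      
      // 0. Assuming a Vector Search index is already created in MongoDB Atlas
      
      // 1. Configure your MongoDB connection
      const collection = new MongoClient(process.env.MONGODB_URI!)
          .db("my_ecommerce_db")
          .collection("products");
      
      // 2. Instantiate a VectorStore with an explicit embeddings client
      const vectorStore = new MongoDBAtlasVectorSearch(
          new VoyageEmbeddings({
              modelName: "voyage-4-large",
              apiKey: process.env.VOYAGEAI_API_KEY!,
          }),
          { collection },
      );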
      
      // 3. Add documents to the VectorStore
      await vectorStore.addDocuments([ /* Your documents here */ ]);
      
      // 4. Perform a similarity search
      const results = await vectorStore.similaritySearch("Your product text search query");
      console.log(results);
      
      

      After:

      // The other steps stay the same...
      
      // 2. Instantiate a VectorStore.
      // The first param (`embeddings`) will be optional now.
      const vectorStore = new MongoDBAtlasVectorSearch({ collection });
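
One way the optional first parameter could be supported without breaking existing callers is to accept either `(embeddings, args)` or `(args)` and normalize the two shapes. The sketch below is illustrative only: `EmbeddingsLike`, `StoreArgs`, `isEmbeddings`, and `normalizeArgs` are hypothetical stand-ins, not the library's types or helpers, and the actual implementation may take a different approach.

```typescript
// Hypothetical stand-ins for EmbeddingsInterface and
// MongoDBAtlasVectorSearchLibArgs; names are assumptions for illustration.
interface EmbeddingsLike {
  embedQuery(text: string): Promise<number[]>;
  embedDocuments(texts: string[]): Promise<number[][]>;
}

interface StoreArgs {
  collection: unknown;
  indexName?: string;
}

interface NormalizedArgs {
  embeddings?: EmbeddingsLike; // undefined means "rely on Auto Embedding"
  args: StoreArgs;
}

// Type guard: treat a value as an embeddings client if it exposes embedQuery.
function isEmbeddings(value: unknown): value is EmbeddingsLike {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { embedQuery?: unknown }).embedQuery === "function"
  );
}

// Accept both call shapes: (embeddings, args) and (args).
function normalizeArgs(
  first: EmbeddingsLike | StoreArgs,
  second?: StoreArgs,
): NormalizedArgs {
  if (isEmbeddings(first)) {
    if (second === undefined) {
      throw new TypeError("args are required when embeddings are provided");
    }
    return { embeddings: first, args: second };
  }
  return { args: first };
}
```

With a helper like this, the existing two-argument call keeps its meaning, and a single-argument call is read as "no embeddings supplied".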
      
      

      Expected User Experience

      We want to update the methods inside the class MongoDBAtlasVectorSearch to reflect the changes from the new Auto Embedding feature. See the annotated class definition below:

      declare class MongoDBAtlasVectorSearch extends VectorStore {
      // -----------------------------------------------
      // The following methods/constructors should be updated.
      // The `embeddings` parameter will be optional and
      // Auto embeddings will be used when not provided.
      // -----------------------------------------------
      
        constructor(
          embeddings: EmbeddingsInterface,        // Optional
          args: MongoDBAtlasVectorSearchLibArgs,
        );
      
        static fromTexts(
          texts: string[],
          metadatas: object[] | object,
          embeddings: EmbeddingsInterface,        // Optional
          dbConfig: MongoDBAtlasVectorSearchLibArgs & {
            ids?: string[];
          },
        ): Promise<MongoDBAtlasVectorSearch>;
      
        static fromDocuments(
          docs: Document[],
          embeddings: EmbeddingsInterface,        // Optional
          dbConfig: MongoDBAtlasVectorSearchLibArgs & {
            ids?: string[];
          },
        ): Promise<MongoDBAtlasVectorSearch>;
        
      
      // ------------------------------------------------------
      // The following methods should fail when auto embeddings
      // are used since they would be redundant/conflicting
      // but are still required by the `VectorStore` interface.
      // ------------------------------------------------------
      
        addVectors(
          vectors: number[][],
          documents: Document[],
          options?: { ids?: string[] },
        ): Promise<any[]>;
      
        similaritySearchVectorWithScore(
          query: number[],
          k: number,
          filter?: MongoDBAtlasFilter,
        ): Promise<[Document, number][]>;
      
      // --------------------------------------------------------
      // The following methods should keep working as expected.
      // They might be using auto embeddings under the hood now.
      // --------------------------------------------------------
      
        addDocuments(
          documents: Document[],
          options?: { ids?: string[] },
        ): Promise<any[]>;
      
        delete(params: { ids: any[] }): Promise<void>;
      
        static fixArrayPrecision(array: number[]): number[];
      
        similaritySearch(
          query: string,
          k?: number,
          filter?: this["FilterType"] | undefined,
          _callbacks?: Callbacks | undefined,
        ): Promise<DocumentInterface[]>;
      
        similaritySearchWithScore(
          query: string,
          k?: number,
          filter?: this["FilterType"] | undefined,
          _callbacks?: Callbacks | undefined,
        ): Promise<[DocumentInterface, number][]>;
      
        asRetriever(
          kOrFields?: number | Partial<VectorStoreRetrieverInput<this>>,
          filter?: this["FilterType"],
          callbacks?: Callbacks,
          tags?: string[],
          metadata?: Record<string, unknown>,
          verbose?: boolean,
        ): VectorStoreRetriever<this>;
      
        maxMarginalRelevanceSearch(
          query: string, 
          options: MaxMarginalRelevanceSearchOptions<this["FilterType"]>
        ): Promise<Document[]>; 
      }
      

      References

      Use Case

      As a... Node Driver and LangChainJs user
      I want... to use the Auto Embedding Index feature that automates vector generation for text fields
      So that... I can eliminate the need for external embedding pipelines

      User Experience

      • VoyageEmbeddings are optional and the user does not have to construct them manually
      • If the user constructs their own embeddings and passes them in, that experience will not change
      • If the user doesn't pass in embeddings, that will now work
      • If the user doesn't pass in embeddings but calls `addVectors` or `similaritySearchVectorWithScore`, those methods will now throw exceptions
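
The behavior in the bullets above could be enforced with a small guard shared by the vector-only methods. This is a minimal sketch under assumptions: `AutoEmbeddingAwareStore`, `EmbeddingsLike`, and `assertManualEmbeddings` are hypothetical names, the method signatures are simplified, and the error wording is not prescribed by this ticket.

```typescript
// Hypothetical minimal store; not the actual MongoDBAtlasVectorSearch class.
interface EmbeddingsLike {
  embedQuery(text: string): Promise<number[]>;
}

class AutoEmbeddingAwareStore {
  constructor(private readonly embeddings?: EmbeddingsLike) {}

  // Auto Embedding is assumed whenever no embeddings client was supplied.
  private get usesAutoEmbedding(): boolean {
    return this.embeddings === undefined;
  }

  // Shared guard for methods that only make sense with caller-made vectors.
  private assertManualEmbeddings(method: string): void {
    if (this.usesAutoEmbedding) {
      throw new Error(
        `${method} is not supported when Auto Embedding is in use; ` +
          "pass an embeddings instance to the constructor to use it.",
      );
    }
  }

  async addVectors(vectors: number[][], texts: string[]): Promise<number> {
    this.assertManualEmbeddings("addVectors");
    if (vectors.length !== texts.length) {
      throw new Error("vectors and texts must have the same length");
    }
    // A real implementation would insert the documents here.
    return vectors.length;
  }

  async similaritySearchVectorWithScore(
    query: number[],
    k: number,
  ): Promise<[string, number][]> {
    this.assertManualEmbeddings("similaritySearchVectorWithScore");
    if (query.length === 0) {
      throw new Error("query vector must not be empty");
    }
    // A real implementation would run the vector search here.
    const results: [string, number][] = [];
    return results.slice(0, k);
  }
}
```

Because both methods are `async`, the guard surfaces as a rejected promise rather than a synchronous throw, which is worth keeping in mind when writing the corresponding tests.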

      Dependencies

      • MongoDB must be version 8.2+ Community Edition
      • Unsure if we need to make an update to Node Driver, verify as part of work
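
If the integration needs to check the server version at runtime, a comparison helper along these lines could be used. This is a sketch, not part of the ticket's scope: `isVersionAtLeast` is a hypothetical helper, and it assumes the version string comes from the server's `buildInfo` command (its `version` field). It does not check the edition (Community vs. Atlas).

```typescript
// Returns true when `version` (e.g. "8.2.0") is at least `minimum` (e.g. "8.2").
// Pre-release suffixes such as "-rc1" are ignored for the comparison.
function isVersionAtLeast(version: string, minimum: string): boolean {
  const parse = (v: string): number[] =>
    (v.split("-")[0] ?? "")
      .split(".")
      .map((part) => Number.parseInt(part, 10) || 0);

  const actual = parse(version);
  const required = parse(minimum);
  const length = Math.max(actual.length, required.length);

  // Compare component by component, treating missing components as 0.
  for (let i = 0; i < length; i += 1) {
    const a = actual[i] ?? 0;
    const r = required[i] ?? 0;
    if (a !== r) {
      return a > r;
    }
  }
  return true; // all components equal
}
```

Comparing numeric components rather than raw strings avoids the pitfall where "8.10" would sort before "8.2".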

      Risks/Unknowns

      • This is a cross-driver alignment ticket
      • A linked ticket is going to update our documentation on this subject [TICKET-TBD]

      Acceptance Criteria

      Implementation Requirements

      Testing Requirements

      • Existing LangChainJs tests are all passing
      • New LangChainJs tests verify that embeddings are optional and that, when they are omitted, behavior matches what the existing tests expect
      • Add new tests to verify that addVectors and similaritySearchVectorWithScore throw when Auto Embedding is used (that is, when no embeddings are passed in)

      Documentation Requirements

      • Update MongoDB tutorials/examples that talk about embeddings: DOCSP-56730 
      • Update LangChainJs tutorials for same: NODE-7507 

      Follow Up Requirements

      • None

            Assignee:
            Pavel Safronov
            Reporter:
            Gaurab Aryal