[LangChainJS] add Auto Embedding support



      Context

      MongoDB Atlas is rolling out a new Auto Embedding Index feature that automates vector generation for text fields, eliminating the need for external embedding pipelines. See NODE-7245.

      LangChain.js is a modular SDK that abstracts the complexity of LLM orchestration. It integrates with MongoDB as a Vector Store, letting you run semantic searches inside your existing Atlas clusters.

      We want LangChain.js to support the new Auto Embedding feature so that developers get a smoother vector search experience.

      Sample Usage

      Current user flow:

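      import { MongoClient } from "mongodb";
      import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
      import { VoyageEmbeddings } from "@langchain/community/embeddings/voyage";

      // 0. Assuming a Vector Search index is already created in MongoDB Atlas

      // 1. Configure your MongoDB connection
      const collection = new MongoClient(process.env.MONGODB_URI!)
          .db("my_ecommerce_db")
          .collection("products");

      // 2. Instantiate a VectorStore with an explicit embeddings provider
      const vectorStore = new MongoDBAtlasVectorSearch(
          new VoyageEmbeddings({
              modelName: "voyage-4-large",
              apiKey: process.env.VOYAGEAI_API_KEY!,
          }),
          { collection },
      );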
      
      // 3. Add documents to the VectorStore
      await vectorStore.addDocuments([ /* Your documents here */ ]);
      
      // 4. Perform a similarity search
      const results = await vectorStore.similaritySearch("Your product text search query");
      console.log(results);
      
      

      After:

      // The other steps stay the same...
      
      // 2. Instantiate a VectorStore.
      // The first parameter (`embeddings`) becomes optional; when it is omitted, auto embeddings are used.
      const vectorStore = new MongoDBAtlasVectorSearch({ collection });
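
      The "After" call passes the args object as the only argument, so the constructor has to tell an embeddings provider apart from the args. A minimal sketch of one way to do that is below. It is not the actual LangChain.js implementation: `EmbeddingsLike`, `StoreArgs`, `ResolvedArgs`, `isEmbeddings`, and `resolveConstructorArgs` are hypothetical names introduced only for illustration, and the duck-typing check is an assumption.

```typescript
// Hypothetical stand-ins for the real LangChain.js types; only the shape matters here.
interface EmbeddingsLike {
  embedDocuments(texts: string[]): Promise<number[][]>;
  embedQuery(text: string): Promise<number[]>;
}

interface StoreArgs {
  collection: unknown;
  indexName?: string;
}

interface ResolvedArgs {
  embeddings?: EmbeddingsLike;
  args: StoreArgs;
  autoEmbedding: boolean;
}

// Assumption: a value counts as an embeddings provider if it exposes both embed methods.
function isEmbeddings(value: unknown): value is EmbeddingsLike {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.embedDocuments === "function" &&
    typeof candidate.embedQuery === "function"
  );
}

// Accepts either (embeddings, args) or (args) and reports which mode applies.
function resolveConstructorArgs(
  first: EmbeddingsLike | StoreArgs,
  second?: StoreArgs,
): ResolvedArgs {
  if (isEmbeddings(first)) {
    if (second === undefined) {
      throw new Error("`args` is required when `embeddings` is provided");
    }
    return { embeddings: first, args: second, autoEmbedding: false };
  }
  // No embeddings provider was passed, so fall back to auto embeddings.
  return { args: first, autoEmbedding: true };
}
```

      With this shape, existing two-argument calls keep their behavior, and a single args object opts in to auto embeddings without a breaking signature change.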
      
      

      Expected User Experience

      We want to update the methods of the MongoDBAtlasVectorSearch class to reflect the new Auto Embedding feature. See the annotated class definition below:

      declare class MongoDBAtlasVectorSearch extends VectorStore {
      // -----------------------------------------------
      // The following methods/constructors should be updated.
      // The `embeddings` parameter will be optional and
      // Auto embeddings will be used when not provided.
      // -----------------------------------------------
      
        constructor(
          embeddings: EmbeddingsInterface,        // Optional
          args: MongoDBAtlasVectorSearchLibArgs,
        );
      
        static fromTexts(
          texts: string[],
          metadatas: object[] | object,
          embeddings: EmbeddingsInterface,        // Optional
          dbConfig: MongoDBAtlasVectorSearchLibArgs & {
            ids?: string[];
          },
        ): Promise<MongoDBAtlasVectorSearch>;
      
        static fromDocuments(
          docs: Document[],
          embeddings: EmbeddingsInterface,        // Optional
          dbConfig: MongoDBAtlasVectorSearchLibArgs & {
            ids?: string[];
          },
        ): Promise<MongoDBAtlasVectorSearch>;
        
      
      // ------------------------------------------------------
      // The following methods should fail when auto embeddings
      // are used since they would be redundant/conflicting
      // but are still required by the `VectorStore` interface.
      // ------------------------------------------------------
      
        addVectors(
          vectors: number[][],
          documents: Document[],
          options?: { ids?: string[] },
        ): Promise<any[]>;
      
        similaritySearchVectorWithScore(
          query: number[],
          k: number,
          filter?: MongoDBAtlasFilter,
        ): Promise<[Document, number][]>;
      
      // --------------------------------------------------------
      // The following methods should keep working as expected.
      // They might be using auto embeddings under the hood now.
      // --------------------------------------------------------
      
        addDocuments(
          documents: Document[],
          options?: { ids?: string[] },
        ): Promise<any[]>;
      
        delete(params: { ids: any[] }): Promise<void>;
      
        static fixArrayPrecision(array: number[]): number[];
      
        similaritySearch(
          query: string,
          k?: number,
          filter?: this["FilterType"] | undefined,
          _callbacks?: Callbacks | undefined,
        ): Promise<DocumentInterface[]>;
      
        similaritySearchWithScore(
          query: string,
          k?: number,
          filter?: this["FilterType"] | undefined,
          _callbacks?: Callbacks | undefined,
        ): Promise<[DocumentInterface, number][]>;
      
        asRetriever(
          kOrFields?: number | Partial<VectorStoreRetrieverInput<this>>,
          filter?: this["FilterType"],
          callbacks?: Callbacks,
          tags?: string[],
          metadata?: Record<string, unknown>,
          verbose?: boolean,
        ): VectorStoreRetriever<this>;
      
        maxMarginalRelevanceSearch(
          query: string, 
          options: MaxMarginalRelevanceSearchOptions<this["FilterType"]>
        ): Promise<Document[]>; 
      }
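
      The behavior described in the annotations above can be sketched with a small in-memory stand-in. This is only an illustration under stated assumptions: `SketchVectorStore`, `Doc`, `EmbeddingsLike`, `assertManualEmbeddings`, and the `inserted` array are hypothetical, the error message is invented, and the `text` and `embedding` field names are illustrative. The real class writes to a MongoDB collection and may be structured differently.

```typescript
// Hypothetical, simplified stand-ins; not the real LangChain.js classes.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

interface EmbeddingsLike {
  embedDocuments(texts: string[]): Promise<number[][]>;
}

class SketchVectorStore {
  // Records that would be written to the collection; kept in memory for illustration.
  readonly inserted: Record<string, unknown>[] = [];
  private readonly embeddings?: EmbeddingsLike;

  constructor(embeddings?: EmbeddingsLike) {
    this.embeddings = embeddings;
  }

  // Throws when raw vectors are supplied to a store that relies on auto embeddings.
  private assertManualEmbeddings(method: string): void {
    if (this.embeddings === undefined) {
      throw new Error(`${method} is not supported when auto embeddings are used`);
    }
  }

  async addVectors(vectors: number[][], documents: Doc[]): Promise<void> {
    this.assertManualEmbeddings("addVectors");
    documents.forEach((doc, i) => {
      this.inserted.push({ ...doc.metadata, text: doc.pageContent, embedding: vectors[i] });
    });
  }

  async addDocuments(documents: Doc[]): Promise<void> {
    if (this.embeddings === undefined) {
      // Auto embeddings: store only the text and leave vector generation to the index.
      for (const doc of documents) {
        this.inserted.push({ ...doc.metadata, text: doc.pageContent });
      }
      return;
    }
    // Manual embeddings: compute vectors client-side, then store them with the text.
    const vectors = await this.embeddings.embedDocuments(
      documents.map((doc) => doc.pageContent),
    );
    await this.addVectors(vectors, documents);
  }
}
```

      Failing fast in the vector-based methods keeps the `VectorStore` interface intact while making it explicit that client-side vectors and auto embeddings should not be mixed on the same store.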
      


            Assignee:
            Raschid Jimenez
            Reporter:
            Gaurab Aryal