- Type: New Feature
- Resolution: Unresolved
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: Index Management, TypeScript
This ticket was split from DRIVERS-3315; see that ticket for a detailed description.
Overview
The server is rolling out a new Auto Embedding Index feature. This allows developers to automate vector generation for text fields, removing the need for external embedding pipelines.
Goal: Enable drivers to support the new autoEmbed field type and the updated vectorSearch query syntax.
Usage Example
// Index definition: create an auto-embedding index
await collection.createSearchIndex({
  name: 'product_auto_embed_idx',
  type: 'vectorSearch',
  definition: {
    fields: [
      {
        type: 'autoEmbed', // New field type!
        modality: 'text',
        path: 'description',
        model: 'voyage-4'
      },
      { type: 'filter', path: 'author' }
    ]
  }
});

// Insert documents (no manual embeddings needed)
await collection.insertOne({
  description: 'Wireless headphones with noise canceling',
  author: 'TechCorp'
});

// ============================================
// Manual embedding generation (no longer needed!)
// ============================================
// Previously, users had to:
//
// 1. Set up an external embedding service
// const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
//
// 2. Generate embeddings for documents
// const docEmbedding = await openai.embeddings.create({
//   model: 'text-embedding-ada-002',
//   input: 'Wireless headphones with noise canceling'
// });
//
// 3. Store embeddings in documents
// await collection.updateOne(
//   { description: 'Wireless headphones with noise canceling' },
//   { $set: { embedding: docEmbedding.data[0].embedding } }
// );
//
// 4. Generate embeddings for queries
// const queryEmbedding = await openai.embeddings.create({
//   model: 'text-embedding-ada-002',
//   input: 'audio equipment'
// });
//
// 5. Use the vector in the query
// queryVector: queryEmbedding.data[0].embedding
// ============================================

// Query: use a text query instead of a vector
const results = await collection.aggregate([
  {
    $vectorSearch: {
      index: 'product_auto_embed_idx',
      path: 'description',
      query: { text: 'audio equipment' },
      numCandidates: 100,
      limit: 5
    }
  },
  {
    // Add score: { $meta: 'vectorSearchScore' } here to return similarity scores
    $project: { description: 1 }
  }
]).toArray();

// Results: matching documents, most similar first
// [
//   {
//     description: 'Wireless headphones with noise canceling',
//   }
// ]
Task Description
- Add TypeScript interfaces for search index field definitions (autoEmbed, filter, vector) and their union type. See specs.
- Extend the $vectorSearch stage interface to support the query and model parameters, and make queryVector optional. See specs.
- Add test cases for auto-embedding index creation and query syntax.
- Update documentation with auto-embedding examples.
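The field-definition interfaces from the first task could look roughly like the sketch below. The interface names (`AutoEmbedIndexField`, `FilterIndexField`, `VectorIndexField`, `VectorSearchIndexField`) and the helper `autoEmbedPaths` are hypothetical and not taken from the spec. The `autoEmbed` properties mirror the usage example above, and the `vector` properties follow the existing Atlas Vector Search index definition. The spec remains the authority on the final shape.

```typescript
// Sketch only: names are assumptions, not the driver's published API.

/** Field whose embeddings the server generates automatically. */
interface AutoEmbedIndexField {
  type: 'autoEmbed';
  modality: 'text'; // only 'text' appears in the ticket's example
  path: string;
  model: string; // e.g. 'voyage-4'
}

/** Field that can be used to pre-filter $vectorSearch results. */
interface FilterIndexField {
  type: 'filter';
  path: string;
}

/** Field holding embeddings supplied by the application. */
interface VectorIndexField {
  type: 'vector';
  path: string;
  numDimensions: number;
  similarity: 'euclidean' | 'cosine' | 'dotProduct';
}

/** Discriminated union over the `type` property. */
type VectorSearchIndexField =
  | AutoEmbedIndexField
  | FilterIndexField
  | VectorIndexField;

/** Hypothetical helper: paths of all auto-embedded fields in a definition. */
function autoEmbedPaths(fields: VectorSearchIndexField[]): string[] {
  return fields
    .filter((f): f is AutoEmbedIndexField => f.type === 'autoEmbed')
    .map(f => f.path);
}

// The field list from the usage example type-checks against the union.
const exampleFields: VectorSearchIndexField[] = [
  { type: 'autoEmbed', modality: 'text', path: 'description', model: 'voyage-4' },
  { type: 'filter', path: 'author' }
];
```

Discriminating on `type` lets the compiler reject, for example, an `autoEmbed` field that omits `model`, which is the kind of mistake the new interfaces are meant to catch at compile time.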
Acceptance Criteria
- Driver exposes TypeScript interface for the autoEmbed field type in search index definitions.
- Driver successfully creates search indexes containing autoEmbed field definitions on supported server versions.
- Driver exposes TypeScript interface for $vectorSearch queries using query and model parameters.
- Driver successfully executes $vectorSearch queries with the new query and model syntax on supported server versions.
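A possible shape for the extended $vectorSearch stage type, useful when writing the test cases, is sketched below. The names `VectorSearchTextQuery`, `VectorSearchStage`, and `checkVectorSearchStage` are hypothetical. Typing `model` as a string and treating `queryVector` and `query` as mutually exclusive are assumptions made for illustration; the spec and the server define the actual rules, and drivers generally leave such validation to the server.

```typescript
// Sketch only: names and the exclusivity rule are assumptions.

/** Text query as used in the ticket's example: query: { text: '...' }. */
interface VectorSearchTextQuery {
  text: string;
}

interface VectorSearchStage {
  index: string;
  path: string;
  limit: number;
  numCandidates?: number;
  queryVector?: number[]; // now optional
  query?: VectorSearchTextQuery; // new
  model?: string; // new; the string type is an assumption
}

/**
 * Hypothetical test helper: reports problems with a stage.
 * Assumes exactly one of queryVector or query should be present.
 */
function checkVectorSearchStage(stage: VectorSearchStage): string[] {
  const problems: string[] = [];
  const hasVector = stage.queryVector !== undefined;
  const hasQuery = stage.query !== undefined;
  if (hasVector && hasQuery) {
    problems.push('queryVector and query are both set');
  }
  if (!hasVector && !hasQuery) {
    problems.push('neither queryVector nor query is set');
  }
  return problems;
}

// The stage from the usage example.
const textStage: VectorSearchStage = {
  index: 'product_auto_embed_idx',
  path: 'description',
  query: { text: 'audio equipment' },
  numCandidates: 100,
  limit: 5
};
```

Keeping `queryVector` in the type as an optional property means existing vector-based queries continue to type-check unchanged.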
References
- blocks
  - COMPASS-10113 Auto embedding Vector search support (Needs Triage)
- is depended on by
  - DRIVERS-3350 [AI-Frameworks] Auto embedding in Community Vector search (Ready for Work)
  - NODE-7340 [LangChainJS] add Auto Embedding support (Investigating)
- split from
  - DRIVERS-3315 Driver support for new syntax in Auto embedding in Community Vector search (In Progress)