Type: Epic
Resolution: Unresolved
Priority: Unknown
Component/s: AI/ML
Support lexical prefilters for vector search
Needed
Customers will be able to index analyzed text fields for use as prefilters, enabling much richer search experiences and allowing the knnBeta operator to reach end of life.
To Do
Builder Changes Needed
Summary
What is the problem or use case, what are we trying to achieve?
We should provide a way for vector search to work with an analyzed text prefilter, by making vectorSearch available as a top-level operator within $search:
[{'$search.vectorSearch': {...}}]
Supporting lexical prefilters for vector search with this syntax, and extending $vectorSearch's prefilter to work with more MQL operators and data types, will let us migrate customers off knnBeta and eventually EOL it. That dramatically simplifies the getting-started experience for users while also providing an on-ramp to more complex filtered vector search use cases that require an analyzed text field via Lucene.
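For illustration only, here is a rough Python sketch of what a pipeline using the proposed shape might look like. The ticket specifies only that vectorSearch becomes a top-level operator within $search; every field name inside it (path, queryVector, numCandidates, limit, filter), the helper name lexical_prefilter_pipeline, and the document fields are assumptions borrowed from existing $vectorSearch and text-operator conventions, not a confirmed API.

```python
from typing import Any


def lexical_prefilter_pipeline(
    query_vector: list[float],
    vector_path: str,
    text_query: str,
    text_path: str,
    limit: int = 10,
) -> list[dict[str, Any]]:
    """Build a HYPOTHETICAL $search.vectorSearch pipeline with a lexical prefilter."""
    stage = {
        "$search": {
            # ASSUMPTION: the inner fields mirror today's $vectorSearch stage;
            # the final syntax is not defined in this ticket.
            "vectorSearch": {
                "path": vector_path,
                "queryVector": query_vector,
                "numCandidates": limit * 10,
                "limit": limit,
                # ASSUMPTION: the prefilter accepts a $search operator such as
                # `text` running against an analyzed text field.
                "filter": {
                    "text": {
                        "query": text_query,
                        "path": text_path,
                        "fuzzy": {"maxEdits": 1},
                    }
                },
            }
        }
    }
    return [stage]


pipeline = lexical_prefilter_pipeline(
    [0.1, 0.2, 0.3], "embedding", "wireless headphnes", "description", limit=5
)
```

The resulting list could be passed to an aggregate call once the server supports the operator; until then it only documents the intended shape.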
Motivation
Today, customers must use the deprecated $search.knnBeta operator to apply analyzed text prefilters to a vector search.
Who is the affected end user?
Who are the stakeholders?
Developers building more advanced filtering into their vector search, such as fuzzy search or other analyzed text prefilters.
How does this affect the end user?
Are they blocked? Are they annoyed? Are they confused?
They are either forced to use the deprecated knnBeta/knnVector syntax to get advanced filtering, or they accept the restricted filtering available in $vectorSearch. Both outcomes create a potential churn risk.
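The two options available today can be sketched as follows, to the best of our understanding of the currently documented syntax. The index names, field names (embedding, description, category), and helper function names are hypothetical placeholders.

```python
from typing import Any


def knn_beta_stage(query_vector: list[float], text_query: str) -> dict[str, Any]:
    """Deprecated route: knnBeta accepts a $search operator as its filter."""
    return {
        "$search": {
            "index": "default",  # hypothetical index name
            "knnBeta": {
                "vector": query_vector,
                "path": "embedding",
                "k": 10,
                # A fuzzy text match on an analyzed field acts as the prefilter.
                "filter": {
                    "text": {
                        "query": text_query,
                        "path": "description",
                        "fuzzy": {"maxEdits": 1},
                    }
                },
            },
        }
    }


def vector_search_stage(query_vector: list[float], category: str) -> dict[str, Any]:
    """Current route: $vectorSearch filters with MQL match expressions only."""
    return {
        "$vectorSearch": {
            "index": "vector_index",  # hypothetical index name
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 100,
            "limit": 10,
            # Exact-match style filtering; no analyzed text or fuzzy matching.
            "filter": {"category": {"$eq": category}},
        }
    }


old_stage = knn_beta_stage([0.1, 0.2], "wireless headphnes")
new_stage = vector_search_stage([0.1, 0.2], "audio")
```

The contrast is the filter value: the first takes a search operator, the second an MQL expression, which is the gap this epic aims to close.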
How likely is it that this problem or use case will occur?
Main path? Edge case?
No known edge cases
If the problem does occur, what are the consequences and how severe are they?
Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?
Is this issue urgent?
Does this ticket have a required timeline? What is it?
Yes. mongot is expected to ship with this feature on 12/15 (subject to change).
Is this ticket required by a downstream team?
Needed by e.g. Atlas, Shell, Compass?
Yes
Is this ticket only for tests?
Does this ticket have any functional impact, or is it just test improvements?
No
Cast of Characters
Engineering Lead: eugene.strizhnov@mongodb.com
Document Author:
POCers:
Product Owner: henry.weller@mongodb.com
Program Manager:
Stakeholders:
Channels & Docs
Slack Channel: #lexical-prefilters-for-vector-search
Scope Document
Technical Design Document
[Parent Epic|CLOUDP-252495]
- is depended on by
  PHPLIB-1739 Support lexical prefilters for vector search (Blocked)
- is related to
  DRIVERS-3308 MongoDB Vector Search now supports vector search against nested embeddings and arrays of embeddings. (Backlog)
- related to
  DRIVERS-3300 Support storedSource in vector search indexes and returnStoredSource in $vectorSearch queries (Backlog)
- split to
  CSHARP-5770 Support lexical prefilters for vector search (Execution Blocked)
  JAVA-5996 Support lexical prefilters for vector search (Execution Blocked)
  PHPLIB-1739 Support lexical prefilters for vector search (Blocked)