Support lexical prefilters for vector search

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Unknown
    • Component/s: AI/ML
    • Needed
    • Customers will be able to index analyzed text fields as prefilters, enabling much richer search experiences and allowing the knnBeta operator to reach end of life.
    • Summary of necessary driver changes

      •  

      Commits for syncing spec/prose tests
      (and/or refer to an existing language POC if needed)

      •  

      Context for other referenced/linked tickets

      •  
    • To Do
    • Builder Changes Needed
    • Key Status/Resolution FixVersion
      CSHARP-5770 Execution Blocked
      JAVA-5996 Execution Blocked
      PHPLIB-1739 Blocked

      Summary

      What is the problem or use case, what are we trying to achieve?

      We should provide a way for vector search to work with an analyzed text prefilter. This should work by making vectorSearch available as a top-level operator within $search:

      [{'$search.vectorSearch': {...}}]

      Adding support for lexical prefilters for vector search with this syntax, and improving $vectorSearch’s prefilter to work with more MQL operators and data types, will allow us to migrate customers off knnBeta and eventually EOL it. This dramatically simplifies the getting-started experience for users while also providing an on-ramp to more complex filtered vector search use cases that require an analyzed text field via Lucene.
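
      As a rough illustration of the proposed shape, the following Python sketch builds such a pipeline from plain dictionaries. Only the idea of vectorSearch as a top-level operator within $search comes from this ticket. Reading '$search.vectorSearch' as a nested document is an assumption. The field names (queryVector, path, numCandidates, limit, filter) are borrowed from the existing $vectorSearch stage and the text/fuzzy filter from the existing $search text operator. The helper name and sample field names are hypothetical, and the final syntax may differ.

```python
from typing import Any


def build_lexical_prefilter_pipeline(
    query_vector: list[float],
    vector_path: str,
    text_path: str,
    text_query: str,
    limit: int = 10,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build a one-stage aggregation pipeline (hypothetical shape)."""
    # Assumption: the analyzed-text prefilter is expressed with the
    # existing $search "text" operator, including fuzzy matching.
    lexical_filter = {
        "text": {
            "query": text_query,
            "path": text_path,
            "fuzzy": {"maxEdits": 1},
        }
    }
    # From the ticket: vectorSearch as a top-level operator within $search.
    # Assumption: its fields mirror the existing $vectorSearch stage.
    return [
        {
            "$search": {
                "vectorSearch": {
                    "queryVector": query_vector,
                    "path": vector_path,
                    "numCandidates": num_candidates,
                    "limit": limit,
                    "filter": lexical_filter,
                }
            }
        }
    ]


# Hypothetical field names and query values, for illustration only.
pipeline = build_lexical_prefilter_pipeline(
    query_vector=[0.1, 0.2, 0.3],
    vector_path="plot_embedding",
    text_path="title",
    text_query="star wars",
)
```

      Once the server supports the operator, a list like this could be passed to a driver's aggregate call. The sketch itself only constructs the document and does not contact a server.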

      Motivation

      Customers currently have to use the deprecated $search.knnBeta operator to apply analyzed text prefilters to a vector search.

      Who is the affected end user?

      Who are the stakeholders?

      Developers who are building more advanced filtering capabilities into their vector search in order to leverage fuzzy search or other analyzed text prefilters.

      How does this affect the end user?

      Are they blocked? Are they annoyed? Are they confused?

      They are forced either to use the deprecated knnBeta/knnVector syntax to get advanced filtering capabilities or to accept the restricted filtering capabilities of $vectorSearch, which creates a potential churn risk.

      How likely is it that this problem or use case will occur?

      Main path? Edge case?

      No known edge cases

      If the problem does occur, what are the consequences and how severe are they?

      Minor annoyance at a log message? Performance concern? Outage/unavailability? Failover can't complete?

      Is this issue urgent?

      Does this ticket have a required timeline? What is it?

      Yes. mongot is anticipated to ship with this feature on 12/15 (subject to change).

      Is this ticket required by a downstream team?

      Needed by e.g. Atlas, Shell, Compass?

      Yes

      Is this ticket only for tests?

      Does this ticket have any functional impact, or is it just test improvements?

      No

      Cast of Characters

      Engineering Lead: eugene.strizhnov@mongodb.com
      Document Author:
      POCers:
      Product Owner: henry.weller@mongodb.com
      Program Manager:
      Stakeholders:

      Channels & Docs

      Slack Channel: #lexical-prefilters-for-vector-search

      Scope Document

      Technical Design Document

      Parent Epic: CLOUDP-252495

            Assignee:
            Unassigned
            Reporter:
            Henry Weller
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: