Correctly fail Hybrid Search queries inside a $lookup/$unionWith on a timeseries collection.


    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.3.0-rc0, 8.2.0-rc2
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Integration
    • Fully Compatible
    • v8.2
    • 200
    • None
    • 3
    • TBD

      Hybrid Search queries are not supported on timeseries collections, and we should fail with a clear error message. See the assertion for top-level hybrid searches on ts collections here.

      However, when a Hybrid Search is inside a $lookup/$unionWith and the collection targeted by the $lookup/$unionWith is a timeseries collection, like:

       

      coll.aggregate([
        {
          $lookup/$unionWith: {
            from/coll: <timeseries>,   // "from" for $lookup, "coll" for $unionWith
            pipeline: [
              { <$rankFusion/$scoreFusion> }
            ]
          }
        }
      ])
      

      we don't catch that, and the query will attempt to proceed, which may or may not succeed depending on the type of hybrid search (I think a hybrid search whose inputs are all non-search may work). We also need to fail these queries with a clear error message.

       

      For single-node topologies, catching this should be easy. Inside DocumentSourceUnionWith/DocumentSourceLookup::createFromBSON, we should have the BSON of the incoming pipeline, and should be able to know that the collection is timeseries (we might need to start tracking in the ExpressionContext whether the query is on a ts collection; not sure).
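
      The single-node check described above could look roughly like the sketch below. It is plain JavaScript for illustration only, not the server's C++ code: the function names, the targetIsTimeseries flag, and the error text are all assumptions, and it only inspects the top-level stages of the sub-pipeline.

```javascript
// Hypothetical sketch; none of these names exist in the server code base.
const HYBRID_SEARCH_STAGES = new Set(["$rankFusion", "$scoreFusion"]);

// Returns true if any top-level stage of the sub-pipeline is a hybrid search stage.
function containsHybridSearch(pipeline) {
    return pipeline.some(
        stage => Object.keys(stage).some(name => HYBRID_SEARCH_STAGES.has(name)));
}

// Mirrors the check proposed for $lookup/$unionWith parsing:
// reject early, with a clear message, when the target is a timeseries collection.
function validateSubPipeline(stageName, targetIsTimeseries, pipeline) {
    if (targetIsTimeseries && containsHybridSearch(pipeline)) {
        throw new Error(
            stageName + ": hybrid search stages ($rankFusion/$scoreFusion) " +
            "are not supported on timeseries collections");
    }
}
```

      With this shape, validateSubPipeline("$unionWith", true, [{$rankFusion: {}}]) throws, while the same call with a non-timeseries target or a pipeline without hybrid search stages passes through.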

       

      On sharded clusters (on an unsharded collection), however, this is going to be trickier, because:

      • We only have the non-desugared query with the $rankFusion/$scoreFusion BSON preserved on mongos (it gets desugared when sent to mongod)
      • We only know if it's a timeseries collection on mongod

       

      So the only solution identified so far would be to add a hidden "_isHybridSearch" key to the $lookup/$unionWith IDL spec. This would be analogous to the "view" key added to the $search spec that we've interacted with before. We should also set the "InternalClient" flag in the same way to protect against improper user injection.
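
      The hidden-key idea can be sketched as two halves: the router marks the spec before desugaring, and the shard validates it. This is an illustrative JavaScript model only. Apart from the "_isHybridSearch" key named above, every name here (markSpecOnRouter, checkSpecOnShard, isInternalClient, targetIsTimeseries) and both error messages are assumptions, not the actual IDL or server API.

```javascript
// Hypothetical sketch; only the "_isHybridSearch" key comes from the ticket.
const HYBRID_STAGE_NAMES = ["$rankFusion", "$scoreFusion"];

// Router side: before desugaring, record whether the sub-pipeline
// contained a hybrid search stage.
function markSpecOnRouter(spec) {
    const isHybrid = (spec.pipeline || []).some(
        stage => HYBRID_STAGE_NAMES.some(name => name in stage));
    return isHybrid ? {...spec, _isHybridSearch: true} : {...spec};
}

// Shard side: only internal clients may send the hidden key,
// and a marked spec targeting a timeseries collection is rejected.
function checkSpecOnShard(spec, {isInternalClient, targetIsTimeseries}) {
    if ("_isHybridSearch" in spec && !isInternalClient) {
        throw new Error("'_isHybridSearch' may only be set by internal clients");
    }
    if (spec._isHybridSearch && targetIsTimeseries) {
        throw new Error("Hybrid search is not supported on timeseries collections");
    }
}
```

      Checking the internal-client condition first means a user who injects the key by hand is rejected regardless of the target collection type.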

       

      For the sharded collection case, we will hit the retryOnViewError loop in mongos, where it looks like we'll both know if the query is on a timeseries collection and have the original BSON of the hybrid search. If possible, we should catch this case here too, to fail sooner.

       

              Assignee:
              Finley Lau
              Reporter:
              Joe Shalabi
