[SERVER-85694] $searchMeta aggregation pipeline stage not passing correct query to mongot after PlanShardedSearch
Created: 25/Jan/24  Updated: 08/Feb/24  Resolved: 05/Feb/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.2.0
Fix Version/s: 7.2.1, 7.3.0-rc2

Type: Bug
Priority: Critical - P2
Reporter: Evan Nixon
Assignee: Charlie Swanson
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
related to SERVER-86097 Try not to call planShardedSearch on ... Needs Scheduling
is related to SERVER-78159 Merge DocumentSourceInternalSearchMon... Closed
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.3, v7.2
Steps To Reproduce:

On a sharded cluster running 7.2.0:

Insert a document into collection "drug_data":

{
  id: '12345',
  openfda: {
    manufacturer_name: 'CSS Pharmacy',
    route: [
      'ORAL'
    ]
  }
} 

Create a search index named "drugs" over collection "drug_data":

{
  "name": "drugs",
  "mappings": {
    "dynamic": false,
    "fields": {
      "id": {
        "analyzer": "lucene.keyword",
        "searchAnalyzer": "lucene.keyword",
        "type": "string"
      },
      "openfda": {
        "fields": {
          "manufacturer_name": [
            {
              "type": "string"
            },
            {
              "type": "token"
            },
            {
              "type": "stringFacet"
            }
          ],
          "route": [
            {
              "type": "string"
            },
            {
              "type": "stringFacet"
            }
          ]
        },
        "type": "document"
      }
    }
  }
} 

Run a query over collection "drug_data":

db.drug_data.aggregate([
  {
    '$searchMeta': {
      index: 'drugs',
      facet: {
        operator: {
          exists: {
            path: 'id'
          }
        },
        facets: {
          manufacturers: {
            type: 'string',
            path: 'openfda.manufacturer_name',
            numBuckets: 10
          },
          routes: {
            type: 'string',
            path: 'openfda.route',
            numBuckets: 10
          }
        }
      }
    }
  }
]) 

Observe error message:

MongoServerError: mongot returned an error :: caused by :: Query should contain either operator or collector 


 Description   

In sharded 7.2 clusters, the object inside $searchMeta is not passed to the shards correctly after PlanShardedSearch.

When running an aggregation pipeline like

[
  {
    '$searchMeta': {
      index: 'drugs',
      facet: {
        operator: {
          exists: {
            path: 'id'
          }
        },
        facets: {
          manufacturers: {
            type: 'string',
            path: 'openfda.manufacturer_name',
            numBuckets: 10
          },
          routes: {
            type: 'string',
            path: 'openfda.route',
            numBuckets: 10
          }
        }
      }
    }
  }
] 

the following line appears in the mongos log (verbosity set to 5):

mongos_1   | {"t":{"$date":"2024-01-24T23:16:42.756+00:00"},"s":"D4", "c":"ASIO",     "id":22596,   "ctx":"conn63","msg":"startCommand","attr":{"request":"RemoteCommand 3610 -- target:[mongod3.internal:27017] db:ClinicalTrials cmd:{ aggregate: \"drug_data\", pipeline: [ { $searchMeta: { mongotQuery: { index: \"drugs\", facet: { operator: { exists: { path: \"id\" } }, facets: { manufacturers: { type: \"string\", path: \"openfda.manufacturer_name\", numBuckets: 10 }, routes: { type: \"string\", path: \"openfda.route\", numBuckets: 10 } } } }, metadataMergeProtocolVersion: 1, limit: 0, sortSpec: { $searchScore: -1 } } } ], cursor: { batchSize: 101 }, let: { NOW: { $literal: new Date(1706138202749) }, CLUSTER_TIME: { $literal: Timestamp(1706138199, 1) } }, fromMongos: true, readConcern: { level: \"local\", provenance: \"implicitDefault\" }, writeConcern: { w: \"majority\", wtimeout: 0, provenance: \"implicitDefault\" }, shardVersion: { e: ObjectId('000000000000000000000000'), t: Timestamp(0, 0), v: Timestamp(0, 0) }, databaseVersion: { uuid: UUID(\"d802bb99-99e8-459c-9cbe-f2f0022b87ba\"), timestamp: Timestamp(1706136857, 2), lastMod: 1 }, clientOperationKey: UUID(\"b007cb63-4766-4610-ad39-7581861db2b5\"), lsid: { id: UUID(\"b3530e27-f57f-47a4-8fc0-04a7cc551bed\"), uid: BinData(0, A009A5C38A39FA832F5D8E5FA067A58CC80A0E8F29A0F501773556AC0B1B33AD) } }"}} 

The relevant part of this line, post-formatting, is:

{
  pipeline: [
    {
      $searchMeta: {
        mongotQuery: {
          index: "drugs",
          facet: {
            operator: {
              exists: {
                path: "id"
              }
            },
            facets: {
              manufacturers: {
                type: "string",
                path: "openfda.manufacturer_name",
                numBuckets: 10
              },
              routes: {
                type: "string",
                path: "openfda.route",
                numBuckets: 10
              }
            }
          }
        },
        metadataMergeProtocolVersion: 1,
        limit: 0,
        sortSpec: {
          $searchScore: -1
        }
      }
    }
  ]
} 

Notice that this pipeline contains a "mongotQuery" object inside $searchMeta. This object should not exist: the keys and values inside it should be direct children of the $searchMeta object.
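
To make the difference in shape concrete, here is a minimal sketch in plain JavaScript. The helper name unwrapMongotQuery is hypothetical and exists only for illustration; it is not part of the server, mongosh, or any driver, and it is not the fix that was committed (per the commit messages on this ticket, that fix concerns the ability to parse the serialized output for $searchMeta). The sketch assumes that the extra fields added next to mongotQuery (metadataMergeProtocolVersion, limit, sortSpec) can be dropped when recovering the stage as the user wrote it.

```javascript
// Hypothetical helper, for illustration only (not a server or driver API).
// Recovers the $searchMeta stage as the user wrote it by lifting the keys
// nested under "mongotQuery" to the top level of the stage spec.
// Assumption: the sibling fields added during serialization are discarded.
function unwrapMongotQuery(stage) {
  const spec = stage.$searchMeta;
  if (!spec || typeof spec.mongotQuery !== "object" || spec.mongotQuery === null) {
    // Not a $searchMeta stage, or already in the user-facing shape.
    return stage;
  }
  return { $searchMeta: { ...spec.mongotQuery } };
}

// The stage as it appears in the log line above.
const serialized = {
  $searchMeta: {
    mongotQuery: {
      index: "drugs",
      facet: {
        operator: { exists: { path: "id" } },
        facets: {
          manufacturers: { type: "string", path: "openfda.manufacturer_name", numBuckets: 10 },
          routes: { type: "string", path: "openfda.route", numBuckets: 10 }
        }
      }
    },
    metadataMergeProtocolVersion: 1,
    limit: 0,
    sortSpec: { $searchScore: -1 }
  }
};

// Top-level keys before: mongotQuery, metadataMergeProtocolVersion, limit, sortSpec
console.log(Object.keys(serialized.$searchMeta));
// Top-level keys after: index, facet
console.log(Object.keys(unwrapMongotQuery(serialized).$searchMeta));
```

In the serialized form, "facet" is no longer a top-level key of the $searchMeta spec, which is the shape problem this ticket describes.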

As a result, users cannot use search facets on sharded clusters running version 7.2.0.



 Comments   
Comment by Githook User [ 08/Feb/24 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-85694 Ability to parse serialized output for $searchMeta

(cherry picked from commit ea677d2a0a6d94b798da18dfb269f679ff67df57)
Branch: v7.2
https://github.com/mongodb/mongo/commit/ddaeb72a9a1130c09ab79b4cb61634c457ceebc1

Comment by Githook User [ 02/Feb/24 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-85694 Ability to parse serialized output for $searchMeta (#18633)

GitOrigin-RevId: 57e0107998c45c39dcd8b90ed97a99b326aaa5df
Branch: v7.3
https://github.com/mongodb/mongo/commit/ad6e38e8d7d1ea6c7287a38a09e5744a3110243e

Comment by Githook User [ 02/Feb/24 ]

Author:

{'name': 'Charlie Swanson', 'email': 'charlie.swanson@mongodb.com', 'username': 'cswanson310'}

Message: SERVER-85694 Ability to parse serialized output for $searchMeta (#18633)

GitOrigin-RevId: b9e1075e2679bc029ac593afc365f5b0b0ef46bc
Branch: master
https://github.com/mongodb/mongo/commit/ea677d2a0a6d94b798da18dfb269f679ff67df57

Comment by Zixuan Zhuang [ 27/Jan/24 ]

Sending back to evan.nixon@mongodb.com, as this is now outside my area of knowledge.

Generated at Thu Feb 08 06:58:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.