[SERVER-55418] report docsExamined for $search queries Created: 22/Mar/21  Updated: 27/Dec/23

Status: Backlog
Project: Core Server
Component/s: Logging
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Harshad Dhavale Assignee: Backlog - Query Integration
Resolution: Unresolved Votes: 0
Labels: qi-search
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Query Integration
Operating System: ALL
Steps To Reproduce:
  1. Run Atlas Search queries that has $search and another stage following it (such as $match). 
  2. Inspect the logged entry in the mongod log (you may need to enable profiling to capture all queries, if the query runs in less than 100ms).
  3. Check the docsExamined and numYields metrics.
Participants:

 Description   

------------------------
Issue Description
------------------------

Note: As this pertains to logging queries in the mongod log, I have filed this under the JIRA under the SERVER project. If this is incorrect, please move to the appropriate project.

When you run an Atlas Search query that has $search and another stage following it (such as $match), the corresponding entry that gets logged for it in the mongod logs and which gets captured in the Atlas Profiler appears to be inaccurate, and it shows:

  "keysExamined": 0,
  "docsExamined": 0,
  
  ...
  
  "numYields": 0,
  "nreturned": 1,

The "keysExamined": 0 makes sense, because no MongoDB index is used for the query. However, the "docsExamined": 0 is incorrect because, AFAIK, when you run an Atlas Search query, the mongod makes an API call to the mongot which returns the objectId and metada for the matched documents back to mongod. Then mongod will then fetch those documents that match that objectId and metadata, and return them to the user. Ergo, "docsExamined" should not be 0.

Also, the "numYields", which is a counter that reports the number of times the operation has yielded to allow other operations to complete, appears to be 0 even for long running operations, which does not seem accurate.

Note: The keysExamined and docsExamined appear only in the entries logged in the mongod log and the Profiler, for the Atlas Search query. If you run an explain plan with executionStats or allPlansExecution, it will not show the aforementioned metrics.

------------------------
Reproducer
------------------------

1. I set my cluster's profiling level to capture all queries in the mongod logs.
2. Then, I ran the following query several times on the cluster:

db.hotels1234.explain("allPlansExecution").aggregate([
   {
     "$search": {
       "index": "dynamic_index",
       "wildcard": {
         "path": {"wildcard": "*"},
         "query": "Ful*",
         "allowAnalyzedField": true
       }
     }
   },
   { $match: { "name" : "Room1" }}
])

3. The above generated the following entry in the mongod log as well as the Profiler:

2021-03-22T15:55:49.338+0000 I  COMMAND  [conn331648] command fts_tests.hotels1234 appName: "MongoDB Shell" 
command: aggregate { 
aggregate: "hotels1234", 
pipeline: [ 
{ $search: { index: "dynamic_index", wildcard: { path: { wildcard: "*" }, query: "Ful*", allowAnalyzedField: true } } }, 
{ $match: { name: "Room1" } } ], 
cursor: {}, lsid: { id: UUID("78ca0a0d-56c8-4567-bc24-8bd7453dcf17") }, $clusterTime: { clusterTime: Timestamp(1616428538, 1), signature: { hash: BinData(0, 5A04F7969C5405962D0150A48F293912C65431AD), keyId: 6910165722823917570 } }, $db: "fts_tests" } mongot: { cursorid: 0, timeWaitingMillis: 11 } 
keysExamined:0 
docsExamined:0 
cursorExhausted:1 
numYields:0 
nreturned:1 
reslen:309 locks:{ ReplicationStateTransition: { acquireCount: { w: 7 } }, Global: { acquireCount: { r: 7 } }, Database: { acquireCount: { r: 7 } }, Collection: { acquireCount: { r: 7 } }, Mutex: { acquireCount: { r: 7 } } } storage:{} protocol:op_msg 15ms

4. I ran explain("executionStats") which did not have keysExamined and docsExamined:

MongoDB Enterprise > db.hotels1234.explain("executionStats").aggregate([
...    {
...      "$search": {
...        "index": "dynamic_index",
...        "wildcard": {
...          "path": {"wildcard": "*"},
...          "query": "Ful*",
...          "allowAnalyzedField": true
...        }
...      }
...    },
...    { $match: { "name" : "Room1" }}
... ])
{
  "stages" : [
    {
      "$_internalSearchBetaMongotRemote" : {
        "mongotQuery" : {
          "index" : "dynamic_index",
          "wildcard" : {
            "path" : {
              "wildcard" : "*"
            },
            "query" : "Ful*",
            "allowAnalyzedField" : true
          }
        },
        "explain" : {
          "type" : "BooleanQuery",
          "args" : {
            "must" : [ ],
            "mustNot" : [ ],
            "should" : [
              {
                "type" : "MultiTermQueryConstantScoreWrapper",
...
              {
                "type" : "MultiTermQueryConstantScoreWrapper",
                "args" : {
                  "queries" : [
                    {
                      "type" : "DefaultQuery",
                      "args" : {
                        "queryType" : "WildcardQuery"
                      },
                      "stats" : {
                        "context" : {
                          "nanosElapsed" : NumberLong(0)
                        },
                        "match" : {
                          "nanosElapsed" : NumberLong(0)
                        },
                        "score" : {
                          "nanosElapsed" : NumberLong(0)
                        }
                      }
                    }
                  ]
                },
                "stats" : {
                  "context" : {
                    "nanosElapsed" : NumberLong(208006),
                    "invocationCounts" : {
                      "createWeight" : NumberLong(1),
                      "createScorer" : NumberLong(3)
                    }
                  },
                  "match" : {
                    "nanosElapsed" : NumberLong(0)
                  },
                  "score" : {
                    "nanosElapsed" : NumberLong(1172),
                    "invocationCounts" : {
                      "score" : NumberLong(1)
                    }
                  }
                }
              },
...



 Comments   
Comment by Katya Kamenieva [ 31/Mar/21 ]

I see the same with just $search stage without following $match. Should it be different with $match and without?

{
    "t": {
        "$date": "2021-03-31T15:15:08.730+00:00"
    },
    "s": "I",
    "c": "COMMAND",
    "id": 51803,
    "ctx": "conn43774",
    "msg": "Slow query",
    "attr": {
        "type": "command",
        "ns": "sample_airbnb.listingsAndReviews",
        "appName": "MongoDB Automation Agent v10.27.2.6803 (git: d770be870a4653b32d66666945381b3d433897c9)",
        "command": {
            "aggregate": "listingsAndReviews",
            "maxTimeMS": 15000,
            "pipeline": [
                {
                    "$search": {
                        "text": {
                            "query": "mountain ",
                            "path": "name"
                        }
                    }
                },
                {
                    "$limit": 20
                },
                {
                    "$limit": 20
                }
            ],
            "cursor": {},
            "lsid": {
                "id": {
                    "$uuid": "f902a87e-d6cb-4712-9496-f6e38cb329ae"
                }
            },
            "$clusterTime": {
                "clusterTime": {
                    "$timestamp": {
                        "t": 1617203706,
                        "i": 1
                    }
                },
                "signature": {
                    "hash": {
                        "$binary": {
                            "base64": "FDPNJIFl3D+Pany3DQ70nEn9EVo=",
                            "subType": "0"
                        }
                    },
                    "keyId": 6886524444710273025
                }
            },
            "$db": "sample_airbnb",
            "$readPreference": {
                "mode": "primaryPreferred"
            }
        },
        "mongot": {
            "cursorid": 0,
            "timeWaitingMillis": 5
        },
        "keysExamined": 0,
        "docsExamined": 0,
        "cursorExhausted": true,
        "numYields": 0,
        "nreturned": 7,
        "reslen": 136169,
        "locks": {
            "ReplicationStateTransition": {
                "acquireCount": {
                    "w": 15
                }
            },
            "Global": {
                "acquireCount": {
                    "r": 15
                }
            },
            "Database": {
                "acquireCount": {
                    "r": 15
                }
            },
            "Collection": {
                "acquireCount": {
                    "r": 15
                }
            },
            "Mutex": {
                "acquireCount": {
                    "r": 15
                }
            }
        },
        "storage": {},
        "protocol": "op_msg",
        "durationMillis": 5
    }
}

Comment by Kyle Suarez [ 30/Mar/21 ]

kateryna.kamenieva and marcus.eagan, could you please comment on the relative priority of this compared to the rest of the Query Execution projects?

Comment by Marcus Eagan (Inactive) [ 22/Mar/21 ]

cc doug.tarr

Generated at Thu Feb 08 05:36:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.