Details
-
Bug
-
Resolution: Duplicate
-
Major - P3
-
None
-
None
-
None
-
None
-
Query Optimization
-
ALL
-
QO 2023-01-09, QO 2023-01-23, QO 2023-02-06
Description
I tried out a simple example of using $telemetry at version 038c67d99cda1fb242ce3b4dcaf331e459f3ff41 of the master branch. First, in order to enable workload telemetry collection in the server, I started it like so:
./mongod --setParameter internalQueryConfigureTelemetrySamplingRate=1000000
Here's a snippet from the mongo shell which reproduces the problem:
MongoDB Enterprise > db.c.find({a: {$gt: 3}})
MongoDB Enterprise > db.getSiblingDB("admin").aggregate([{$telemetry: {}}]).pretty()
{
    "key" : {
        "find" : {
            "find" : "###",
            "filter" : {
                "###" : {
                    "###" : "###"
                }
            }
        },
        "namespace" : "test.c",
        "applicationName" : "MongoDB Shell"
    },
    "metrics" : {
        "lastExecutionMicros" : NumberLong(1961),
        "execCount" : NumberLong(1),
        "queryOptMicros" : {
            "sum" : NumberLong(287),
            "max" : NumberLong(287),
            "min" : NumberLong(287),
            "sumOfSquares" : NumberLong(82369)
        },
        "queryExecMicros" : {
            "sum" : NumberLong(1961),
            "max" : NumberLong(1961),
            "min" : NumberLong(1961),
            "sumOfSquares" : NumberLong(3845521)
        },
        "docsReturned" : {
            "sum" : NumberLong(0),
            "max" : NumberLong(0),
            "min" : NumberLong(0),
            "sumOfSquares" : NumberLong(0)
        },
        "docsScanned" : {
            "sum" : NumberLong(1),
            "max" : NumberLong(1),
            "min" : NumberLong(1),
            "sumOfSquares" : NumberLong(1)
        },
        "keysScanned" : {
            "sum" : NumberLong(0),
            "max" : NumberLong(0),
            "min" : NumberLong(0),
            "sumOfSquares" : NumberLong(0)
        },
        "firstSeenTimestamp" : Timestamp(1668636882, 0)
    },
    "asOf" : Timestamp(1668636883, 0)
}
...
The bug pertains to the value of the key field. As you can see, all field names and values are redacted, including the $gt operator. We know that we need to redact constants in the query, since they may contain PII or raise data security/privacy concerns. I believe there is an active discussion about whether and how we redact or anonymize field names. But there is no doubt that we should include the $gt in the output; otherwise we know nothing about what the query actually was.
Note that the same problem occurs even if I enable internalQueryConfigureTelemetryFieldNameRedactionStrategy=sha256. In that case, the key looks like this:
"key" : {
    "find" : {
        "find" : "###",
        "filter" : {
            "ypeBEsobvcr6" : {
                "Jt74XIwL/Ngm" : "###"
            }
        }
    },
    "namespace" : "test.c",
    "applicationName" : "MongoDB Shell"
},
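To make the expected behavior concrete, here is a minimal sketch in plain JavaScript of redaction that replaces literals with "###" but leaves operator names such as $gt intact. The helper name redactFilter and its redactFieldName parameter are hypothetical and exist only for illustration; this is not the server's implementation, and it only handles plain JSON-like values (objects, arrays, primitives).

```javascript
// Hypothetical helper, not the server's code: redact literals in a filter
// while keeping operator names (keys that start with "$") unchanged.
// redactFieldName is an assumed hook for a field-name strategy; by default
// field names pass through untouched.
function redactFilter(node, redactFieldName = (name) => name) {
    if (Array.isArray(node)) {
        // Redact each array element independently, e.g. the operands of $in.
        return node.map((child) => redactFilter(child, redactFieldName));
    }
    if (node !== null && typeof node === "object") {
        const out = {};
        for (const [key, value] of Object.entries(node)) {
            // Operators are part of the query shape, so they are never redacted.
            const outKey = key.startsWith("$") ? key : redactFieldName(key);
            out[outKey] = redactFilter(value, redactFieldName);
        }
        return out;
    }
    // Any remaining primitive is a literal and may contain sensitive data.
    return "###";
}

const shape = redactFilter({a: {$gt: 3}});
console.log(JSON.stringify(shape));  // {"a":{"$gt":"###"}}
```

With a field-name strategy plugged in (for example, a function that returns "###" for every name), the same sketch would yield {"###": {"$gt": "###"}}, which still tells us the query was a range predicate.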
Another note: Are there any end-to-end tests which show that redaction is working as expected?
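As a rough idea of what such a test could assert, the sketch below checks that a $telemetry entry's key still contains the operators the original query used. It assumes the entry layout shown in the output above (key.find.filter); the function names collectOperators and keyPreservesOperators are hypothetical, and the sample entries are hand-written rather than fetched from a server.

```javascript
// Hypothetical check, assuming the $telemetry entry layout shown in this ticket.
// Recursively gather every key that looks like an operator (starts with "$").
function collectOperators(node, found = []) {
    if (Array.isArray(node)) {
        node.forEach((child) => collectOperators(child, found));
    } else if (node !== null && typeof node === "object") {
        for (const [key, value] of Object.entries(node)) {
            if (key.startsWith("$")) {
                found.push(key);
            }
            collectOperators(value, found);
        }
    }
    return found;
}

// Returns true only if every expected operator survives in the redacted filter.
function keyPreservesOperators(entry, expectedOperators) {
    const seen = collectOperators(entry.key.find.filter);
    return expectedOperators.every((op) => seen.includes(op));
}

// Key copied from the output in this ticket (the behavior being reported).
const reportedEntry = {
    key: {
        find: {find: "###", filter: {"###": {"###": "###"}}},
        namespace: "test.c",
        applicationName: "MongoDB Shell",
    },
};
// Hand-written example of the shape this ticket asks for.
const desiredEntry = {
    key: {
        find: {find: "###", filter: {"###": {"$gt": "###"}}},
        namespace: "test.c",
        applicationName: "MongoDB Shell",
    },
};

console.log(keyPreservesOperators(reportedEntry, ["$gt"]));  // false
console.log(keyPreservesOperators(desiredEntry, ["$gt"]));   // true
```

In an actual end-to-end test, the entry would presumably come from running the find and then db.getSiblingDB("admin").aggregate([{$telemetry: {}}]) as in the reproduction above, rather than from a hard-coded object.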
Issue Links
- depends on
  - SERVER-73141 Generate query shape (literal redaction) for expressions in expression_leaf.h (Closed)
- related to
  - SERVER-71427 $telemetry returns multiple entries with the same key even though the corresponding queries were distinct shapes (Closed)