Details
- Improvement
- Resolution: Fixed
- Major - P3
- None
- None
- None
- Fully Compatible
- 135
Description
Ad-hoc tests suggest that we can improve performance of last point queries by changing our query rewrite.
Consider the following user-specified query:
    db.telemetry.aggregate([
      {$sort: {"metadata.sensorId": 1, "timestamp": 1}},
      {$group: {
        _id: "$metadata.sensorId",
        ts: {$last: "$timestamp"},
        temp: {$last: "$temp"}
      }}
    ]);
which we usually would rewrite to:
    db.system.buckets.telemetry.aggregate([
      {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
      {$group: {
        _id: "$meta.sensorId",
        bucket: {$first: "$_id"},
      }},
      {$lookup: {
        from: "system.buckets.telemetry",
        foreignField: "_id",
        localField: "bucket",
        as: "bucket_data",
        pipeline: [
          {$_internalUnpackBucket: {
            timeField: "timestamp",
            metaField: "tags",
            bucketMaxSpanSeconds: NumberInt("60")
          }},
          {$sort: {"timestamp": -1}},
          {$limit: 1}
        ]
      }},
      {$unwind: "$bucket_data"},
      {$replaceWith: {
        _id: "$_id",
        ts: "$bucket_data.timestamp",
        temp: "$bucket_data.temp"
      }}
    ]);
We can get the same results with a slightly better runtime by avoiding the $lookup. The alternative rewrite below carries the whole newest bucket (control, meta, and data) through the first $group, so it can be unpacked directly:
    db.system.buckets.telemetry.aggregate([
      {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
      {$group: {
        _id: "$meta.sensorId",
        bucket: {$first: "$_id"},
        control: {$first: "$control"},
        meta: {$first: "$meta"},
        data: {$first: "$data"}
      }},
      {$_internalUnpackBucket: {
        timeField: "timestamp",
        metaField: "meta",
        bucketMaxSpanSeconds: NumberInt("60")
      }},
      {$sort: {"meta.sensorId": 1, "timestamp": -1}},
      {$group: {
        _id: "$meta.sensorId",
        ts: {$first: "$timestamp"},
        temp: {$first: "$temp"}
      }}
    ]);
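To illustrate why the $lookup-free rewrite returns the same last point, here is a rough in-memory sketch in plain JavaScript. It is not server code. The bucket layout is a simplified assumption loosely modeled on system.buckets documents, timestamps are plain numbers, and the helpers unpackBucket, newestBucketPerSensor, and lastPoints are hypothetical names introduced only for this example.

```javascript
// Simplified model of the $lookup-free rewrite. NOT the server implementation.
// Assumption: each bucket stores its measurements column-wise under `data`,
// keyed by index, and records its largest timestamp in control.max.timestamp.

// Hypothetical helper: conceptually what $_internalUnpackBucket does,
// turning one bucket into one document per measurement.
function unpackBucket(bucket) {
  return Object.keys(bucket.data.timestamp).map((i) => ({
    meta: bucket.meta,
    timestamp: bucket.data.timestamp[i],
    temp: bucket.data.temp[i],
  }));
}

// First $sort + $group: per sensor, keep only the bucket whose
// control.max.timestamp is greatest.
function newestBucketPerSensor(buckets) {
  const newest = new Map();
  for (const b of buckets) {
    const cur = newest.get(b.meta.sensorId);
    if (!cur || b.control.max.timestamp > cur.control.max.timestamp) {
      newest.set(b.meta.sensorId, b);
    }
  }
  return [...newest.values()];
}

// Unpack only the surviving buckets, then take the latest measurement
// per sensor (second $sort + $group with $first).
function lastPoints(buckets) {
  const result = new Map();
  for (const b of newestBucketPerSensor(buckets)) {
    for (const m of unpackBucket(b)) {
      const cur = result.get(m.meta.sensorId);
      if (!cur || m.timestamp > cur.ts) {
        result.set(m.meta.sensorId, {
          _id: m.meta.sensorId,
          ts: m.timestamp,
          temp: m.temp,
        });
      }
    }
  }
  return [...result.values()].sort((a, b) => a._id - b._id);
}

// Made-up sample data: two buckets for sensor 1, one for sensor 2.
// Measurements inside a bucket are not necessarily in time order.
const buckets = [
  { _id: "b1", meta: { sensorId: 1 }, control: { max: { timestamp: 20 } },
    data: { timestamp: { 0: 10, 1: 20 }, temp: { 0: 70, 1: 71 } } },
  { _id: "b2", meta: { sensorId: 1 }, control: { max: { timestamp: 40 } },
    data: { timestamp: { 0: 40, 1: 30 }, temp: { 0: 73, 1: 72 } } },
  { _id: "b3", meta: { sensorId: 2 }, control: { max: { timestamp: 15 } },
    data: { timestamp: { 0: 5, 1: 15 }, temp: { 0: 60, 1: 61 } } },
];

console.log(lastPoints(buckets));
```

In this model only one bucket per sensor is ever unpacked, which is the same saving both rewrites aim for. The difference in the alternative rewrite is that the bucket contents travel through the first $group instead of being fetched again by _id.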
This optimization was suggested by a comment from david.percy on the tech design for PM-2330.
The tweak described above improved runtime from 210ms to 140ms in my tests with a debug build on the following data set: https://gist.github.com/starzia/9d1f8a25a2e2e2124b78e2da71159602
However, genny tests showed even larger latency improvements: more than a 7x speedup.
Issue Links
- is duplicated by
  - SERVER-61659 TS Last Point opt: final test plan review (Closed)