-
Type:
Improvement
-
Resolution: Fixed
-
Priority:
Major - P3
-
Affects Version/s: None
-
Component/s: None
-
None
-
Fully Compatible
-
135
Ad-hoc tests suggest that we can improve performance of last point queries by changing our query rewrite.
Consider the following user-specified query:
db.telemetry.aggregate([
    {$sort: {"metadata.sensorId": 1, "timestamp": 1}},
    {$group: {
        _id: "$metadata.sensorId",
        ts: {$last: "$timestamp"},
        temp: {$last: "$temp"}
    }}
]);
which we would usually rewrite to:
db.system.buckets.telemetry.aggregate([
    {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
    {$group: {
        _id: "$meta.sensorId",
        bucket: {$first: "$_id"},
    }},
    {$lookup: {
        from: "system.buckets.telemetry",
        foreignField: "_id",
        localField: "bucket",
        as: "bucket_data",
        pipeline: [
            {$_internalUnpackBucket: {
                timeField: "timestamp",
                metaField: "tags",
                bucketMaxSpanSeconds: NumberInt("60")
            }},
            {$sort: {"timestamp": -1}},
            {$limit: 1}
        ]
    }},
    {$unwind: "$bucket_data"},
    {$replaceWith: {
        _id: "$_id",
        ts: "$bucket_data.timestamp",
        temp: "$bucket_data.temp"
    }}
]);
We can actually get the same results with slightly better runtime by avoiding the $lookup. Instead, the first $group carries the whole newest bucket forward, so it can be unpacked in the same pipeline:
db.system.buckets.telemetry.aggregate([
    {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
    {$group: {
        _id: "$meta.sensorId",
        bucket: {$first: "$_id"},
        control: {$first: "$control"},
        meta: {$first: "$meta"},
        data: {$first: "$data"}
    }},
    {$_internalUnpackBucket: {
        timeField: "timestamp",
        metaField: "meta",
        bucketMaxSpanSeconds: NumberInt("60")
    }},
    {$sort: {"meta.sensorId": 1, "timestamp": -1}},
    {$group: {
        _id: "$meta.sensorId",
        ts: {$first: "$timestamp"},
        temp: {$first: "$temp"}
    }}
]);
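The idea behind the $lookup-free rewrite can be sketched in plain JavaScript, without a server. This is only an illustrative model: the `buckets` array, the `data` array-of-measurements layout, and the helper names `lastPointNaive`, `lastPointViaBuckets`, and `byId` are all hypothetical and do not reflect the real bucket storage format. It assumes that `control.max.timestamp` is the true maximum timestamp in each bucket, so the newest bucket per sensor always contains that sensor's last point.

```javascript
// Hypothetical in-memory model of time-series buckets (not the real format).
const buckets = [
  {_id: 1, meta: {sensorId: "a"}, control: {max: {timestamp: 20}},
   data: [{timestamp: 10, temp: 1.0}, {timestamp: 20, temp: 2.0}]},
  {_id: 2, meta: {sensorId: "a"}, control: {max: {timestamp: 40}},
   data: [{timestamp: 40, temp: 4.0}, {timestamp: 30, temp: 3.0}]},
  {_id: 3, meta: {sensorId: "b"}, control: {max: {timestamp: 15}},
   data: [{timestamp: 15, temp: 9.0}]},
];

// Order results by sensor id so the two approaches can be compared directly.
function byId(x, y) {
  return x._id < y._id ? -1 : x._id > y._id ? 1 : 0;
}

// Baseline: unpack every measurement of every bucket, then keep the latest
// measurement per sensor (what the user-specified query asks for).
function lastPointNaive(bucketList) {
  const latest = new Map();
  for (const b of bucketList) {
    for (const m of b.data) {
      const cur = latest.get(b.meta.sensorId);
      if (cur === undefined || m.timestamp > cur.ts) {
        latest.set(b.meta.sensorId,
                   {_id: b.meta.sensorId, ts: m.timestamp, temp: m.temp});
      }
    }
  }
  return [...latest.values()].sort(byId);
}

// Rewrite: the first pass keeps the whole newest bucket per sensor (by
// control.max.timestamp); only those buckets are unpacked in the second pass.
function lastPointViaBuckets(bucketList) {
  const newest = new Map();
  for (const b of bucketList) {
    const cur = newest.get(b.meta.sensorId);
    if (cur === undefined ||
        b.control.max.timestamp > cur.control.max.timestamp) {
      newest.set(b.meta.sensorId, b);
    }
  }
  return lastPointNaive([...newest.values()]);
}

console.log(lastPointViaBuckets(buckets));
```

On this toy data both functions return the same result, but `lastPointViaBuckets` unpacks two buckets rather than three. The saving in this model grows with the number of buckets per sensor.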
This optimization was suggested by a comment from david.percy on the tech design for PM-2330.
The tweak described above improved runtime from 210ms to 140ms in my tests with a debug build on the following data set: https://gist.github.com/starzia/9d1f8a25a2e2e2124b78e2da71159602
However, genny tests showed even larger latency improvements: more than a 7x speedup.
- is duplicated by: SERVER-61659 TS Last Point opt: final test plan review (Closed)