[SERVER-65005] Avoid $lookup in time-series last point optimization Created: 28/Mar/22  Updated: 29/Oct/23  Resolved: 07/Apr/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.0.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Steve Tarzia Assignee: Alya Berciu
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-61659 TS Last Point opt: final test plan re... Closed
Problem/Incident
Backwards Compatibility: Fully Compatible
Participants:
Linked BF Score: 135

 Description   

Ad-hoc tests suggest that we can improve performance of last point queries by changing our query rewrite.

Consider the following user-specified query:

 

 db.telemetry.aggregate([
    {$sort: {"metadata.sensorId": 1, "timestamp": 1}},
    {$group: {
      _id: "$metadata.sensorId",
      ts: {$last: "$timestamp"},
      temp: {$last: "$temp"}
    }}
 ]);

which we usually would rewrite to:

 

db.system.buckets.telemetry.aggregate([
   {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
   {$group: {
      _id: "$meta.sensorId",
      bucket: {$first: "$_id"},
   }},
   {$lookup: {
      from: "system.buckets.telemetry",
      foreignField: "_id",
      localField: "bucket",
      as: "bucket_data",
      pipeline:[
         {$_internalUnpackBucket: {
            timeField:"timestamp",
            metaField:"tags",
            bucketMaxSpanSeconds:NumberInt("60")
         }},
         {$sort: {"timestamp": -1}},
         {$limit:1}
      ]
   }},
   {$unwind: "$bucket_data"},
   {$replaceWith:{
      _id: "$_id",
      ts: "$bucket_data.timestamp",
      temp: "$bucket_data.temp"
   }}
]);

We actually can get the same results with slightly better runtime by avoiding the $lookup using the following alternative rewrite:

 

 

db.system.buckets.telemetry.aggregate([
   {$sort: {"meta.sensorId": 1, "control.max.timestamp": -1}},
   {$group: {
      _id: "$meta.sensorId",
      bucket: {$first: "$_id"},
      control: {$first: "$control"},
      meta: {$first: "$meta"},
      data: {$first: "$data"} 
   }},
   {$_internalUnpackBucket: {
      timeField:"timestamp",
      metaField:"meta",
      bucketMaxSpanSeconds:NumberInt("60")
   }},
   {$sort: {"meta.sensorId": 1, "timestamp": -1}},
   {$group: {
      _id: "$meta.sensorId",
      ts: {$first: "$timestamp"},
      temp: {$first: "$temp"}
   }}
]);

 

This optimization was suggested by a comment from david.percy on the tech design for PM-2330:

https://docs.google.com/document/d/1JmOlJh8NmgUa1tDZyHElCk0YlAJSxW0gURm4qGYn5AU/edit?disco=AAAAVXqkyAo 

The tweak described above improved runtime from 210ms to 140ms in my tests with a debug build on the following data set: https://gist.github.com/starzia/9d1f8a25a2e2e2124b78e2da71159602 

However, genny tests showed every larger larger latency improvements – more than a 7x speedup



 Comments   
Comment by Githook User [ 07/Apr/22 ]

Author:

{'name': 'Alya Berciu', 'email': 'alya.berciu@mongodb.com', 'username': 'alyacb'}

Message: SERVER-65005 Use group instead of lookup for lastpoint rewrite
Branch: master
https://github.com/mongodb/mongo/commit/6270f8eab062bb441ce3acc67280227f0d390f7e

Comment by Steve Tarzia [ 30/Mar/22 ]

Actually, alya.berciu saw more than 80% improvement with this change on genny tests, so we are bumping the priority of this ticket and moving into the active Last Point project.

Generated at Thu Feb 08 06:01:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.