[SERVER-68577] Assess $group queries perf at the bucket level Created: 04/Aug/22  Updated: 29/Oct/23  Resolved: 10/Aug/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.1.0-rc0

Type: Task Priority: Major - P3
Reporter: Pawel Terlecki Assignee: Rui Liu
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Participants:

 Description   

The idea is to measure performance of the following query:

{$group: {_id: "$ticker", m: {$min: "$price"}}}

in two scenarios:

  • against a time-series (TS) collection. All buckets are unpacked before grouping.
  • against the corresponding bucket collection. Buckets are not unpacked at all, and the minimum is computed from the per-bucket minimums already stored in the buckets. This requires rewriting the query into an equivalent form that operates on buckets.
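
A minimal sketch of the rewrite described in the second scenario. It assumes `ticker` is the collection's metaField (so it is stored as `meta` on each bucket) and that each bucket keeps per-field minimums under `control.min`. The sample buckets and the helpers `unpack` and `groupMin` are hypothetical. The in-memory check only illustrates why the two forms agree (a minimum of per-bucket minimums equals the overall minimum); it is not the benchmark itself.

```javascript
// Scenario 1: pipeline run against the time-series collection
// (buckets are unpacked before the $group stage sees the documents).
const tsPipeline = [
  { $group: { _id: "$ticker", m: { $min: "$price" } } },
];

// Scenario 2: pipeline run directly against the bucket collection.
// Assumes `ticker` is the metaField, so each bucket carries it as `meta`.
const bucketPipeline = [
  { $group: { _id: "$meta", m: { $min: "$control.min.price" } } },
];

// Hypothetical sample buckets, shaped like time-series bucket documents.
const buckets = [
  { meta: "AAA", control: { min: { price: 10 } }, data: { price: { 0: 12, 1: 10, 2: 15 } } },
  { meta: "AAA", control: { min: { price: 8 } }, data: { price: { 0: 9, 1: 8 } } },
  { meta: "BBB", control: { min: { price: 40 } }, data: { price: { 0: 40, 1: 44 } } },
];

// Expand buckets into individual measurements (what scenario 1 does implicitly).
function unpack(bucketList) {
  return bucketList.flatMap((b) =>
    Object.values(b.data.price).map((price) => ({ ticker: b.meta, price }))
  );
}

// In-memory stand-in for a $group stage with a single $min accumulator.
function groupMin(docs, keyOf, valueOf) {
  const result = {};
  for (const doc of docs) {
    const key = keyOf(doc);
    const value = valueOf(doc);
    result[key] = key in result ? Math.min(result[key], value) : value;
  }
  return result;
}

// Group over unpacked measurements vs. over bucket-level minimums.
const fromMeasurements = groupMin(unpack(buckets), (d) => d.ticker, (d) => d.price);
const fromBuckets = groupMin(buckets, (b) => b.meta, (b) => b.control.min.price);
```

On a real deployment, `tsPipeline` would be passed to `aggregate` on the time-series collection and `bucketPipeline` to `aggregate` on its underlying bucket collection (typically named `system.buckets.<collection>`).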

The point of the experiment is to estimate the performance boost, so that we can extrapolate what the boost would be when a portion of the buckets needs to be unpacked, e.g. due to a filter on time. We expect a difference of orders of magnitude.
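
One possible way to do the extrapolation is a simple linear cost model, sketched below. It assumes a constant per-bucket cost in each mode and that a known fraction of buckets must be unpacked. The function name `estimatedSpeedup` and its parameters are hypothetical, and any numbers passed in are placeholders rather than measurements.

```javascript
// Estimated speedup over the fully unpacked baseline when only a fraction
// of the buckets has to be unpacked. Linear cost model: an assumption,
// not something measured in this ticket.
function estimatedSpeedup(unpackedCostPerBucket, bucketLevelCostPerBucket, fractionUnpacked) {
  if (fractionUnpacked < 0 || fractionUnpacked > 1) {
    throw new RangeError("fractionUnpacked must be between 0 and 1");
  }
  // Cost of a mixed run: some buckets unpacked, the rest answered from bucket-level mins.
  const mixedCost =
    fractionUnpacked * unpackedCostPerBucket +
    (1 - fractionUnpacked) * bucketLevelCostPerBucket;
  return unpackedCostPerBucket / mixedCost;
}
```

Under this model, with a hypothetical 100x gap between the two per-bucket costs, unpacking 10% of the buckets already limits the overall gain to roughly 9x, so the unpacked portion dominates quickly.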

At the moment we believe that queries for which some portion of the buckets conceptually does not need to be unpacked are common. We can confirm that using telemetry in the future.


Generated at Thu Feb 08 06:11:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.