
Type: Task
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.2.0-rc0
Affects Version/s: None
Component/s: None
Labels:
None

Backwards Compatibility:
Fully Compatible
Linked BF Score:
135
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

We think the estimation code for the hash_agg stage is not suitable (or at least unwieldy) for window functions, for a few reasons:

1. It defines a memory checkpoint for spilling, and the checkpoint counter is incremented once per record processed. With window functions, however, memory is split into two parts (the window buffer and the window states) that are updated at different paces. This makes a checkpoint-based memory check hard to define.

2. The checkpoint calculation references the spilling threshold. We cannot easily transfer this to our context, because the memory is divided into two parts.

3. Most importantly, in the hash_agg case, we assume the hash table size is either stable or growing linearly. We don't have this guarantee in our context. For example, for a range-based window, the window frame size might differ drastically from one record to the next. I'm not sure how to define a reasonable checkpoint in this case.

We should come up with an actual model of the memory usage, as a function of the window frame / window buffer size. It's reasonable to assume the model is linear (states with constant size are the special case where the slope is zero). We propose the following:

We estimate memory for the window buffer and the window states differently. The window buffer memory is estimated from the average size of the sampled records: we simply multiply that average by the number of records in the window buffer.
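The buffer estimate described above can be sketched as follows. This is only an illustration in Python, not the server's implementation; the class name `WindowBufferEstimator` and its methods are hypothetical, and the sample sizes are made-up numbers.

```python
class WindowBufferEstimator:
    """Hypothetical helper: running average of sampled record sizes."""

    def __init__(self) -> None:
        self._sample_count = 0
        self._sample_bytes = 0

    def add_sample(self, record_bytes: int) -> None:
        # Called only for the records that are actually sampled.
        self._sample_count += 1
        self._sample_bytes += record_bytes

    def estimate(self, records_in_buffer: int) -> float:
        # Estimated buffer memory = average sampled size * buffered records.
        if self._sample_count == 0:
            return 0.0
        average = self._sample_bytes / self._sample_count
        return average * records_in_buffer


# Usage with made-up sample sizes (in bytes).
buffer_estimator = WindowBufferEstimator()
for size in (100, 120, 140):
    buffer_estimator.add_sample(size)
buffer_estimate = buffer_estimator.estimate(1000)
```

Because only a count and a byte total are kept, the estimate can be recomputed at any time without re-measuring the records already in the buffer.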

We don't have access to the memory delta for a window state; we can only get the memory of the entire state. So instead we feed each memory sample into a linear regression model, where x is the size of the window frame and y is the memory size of the window state.
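One way to maintain such a model incrementally is ordinary least squares over running sums, sketched below. This is an assumption about how the regression could be kept, not a description of the actual code; `LinearMemoryModel` and its methods are hypothetical names, and the fallback to the mean when all samples share one frame size is a design choice made for this sketch.

```python
class LinearMemoryModel:
    """Hypothetical helper: fits y = intercept + slope * x from samples,
    where x is the window frame size and y is the state memory in bytes."""

    def __init__(self) -> None:
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def add_sample(self, frame_size: float, state_bytes: float) -> None:
        # Only running sums are stored, so each sample costs O(1).
        self.n += 1
        self.sum_x += frame_size
        self.sum_y += state_bytes
        self.sum_xx += frame_size * frame_size
        self.sum_xy += frame_size * state_bytes

    def predict(self, frame_size: float) -> float:
        if self.n == 0:
            return 0.0
        denom = self.n * self.sum_xx - self.sum_x * self.sum_x
        if denom == 0:
            # All samples share one frame size: no slope can be fitted,
            # so fall back to the mean observed memory.
            return self.sum_y / self.n
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return intercept + slope * frame_size


# Usage with made-up samples that lie on y = 64 + 8x.
state_model = LinearMemoryModel()
for x, y in ((1, 72), (2, 80), (4, 96)):
    state_model.add_sample(x, y)
predicted = state_model.predict(10)
```

A constant-size state needs no special handling here: its samples all have the same y, so the fitted slope is zero and the prediction is the constant itself.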

The memory samples are taken with exponential backoff: first at every record / frame size, then at every second one, then at every fourth, and so on, up to a certain maximum interval. This is also what the hash_agg stage does, with the maximum controlled by a configurable query knob.
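The sampling schedule can be sketched like this. It is a minimal illustration under assumptions: the function name `sample_points` and the parameter `max_interval` are hypothetical, and the exact point at which the interval doubles may differ in the real code.

```python
def sample_points(total_events: int, max_interval: int) -> list[int]:
    """Return the 1-based event indexes at which a memory sample is taken.

    The gap between samples starts at 1 and doubles after every sample
    until it reaches max_interval, after which it stays constant.
    """
    sampled = []
    interval = 1
    next_sample = 1
    for index in range(1, total_events + 1):
        if index == next_sample:
            sampled.append(index)
            # Back off: double the gap, but never exceed the maximum.
            interval = min(interval * 2, max_interval)
            next_sample = index + interval
    return sampled


# Usage: 32 events with the gap capped at 8.
points = sample_points(32, 8)
```

With these inputs the gaps between consecutive samples are 2, 4, 8, 8, 8, so sampling is dense at the start, when the model has little data, and cheap once it has stabilized.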

Assignee: Rui Liu
Reporter: Rui Liu
Participants: Githook User, Rui Liu
Votes: 0
Watchers: 3

Created: Oct 05 2023 07:09:18 PM UTC
Updated: Oct 25 2023 03:19:05 PM UTC
Resolved: Oct 25 2023 03:19:05 PM UTC
Confidence Status Last Update: 06/Oct/23 11:14 AM
