Steady state time-series workloads can experience spikes in activity caused by buckets closing at the same time


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • Storage Execution 2025-06-23, Storage Execution 2025-07-07

      Given a steady-state workload in which workers insert measurements into a number of buckets concurrently, if the buckets are created at around the same time, they will also be closed out at around the same time.

      For example, if three workers are inserting measurements into three different buckets with granularity seconds for metadata values A, B, and C, and they all start inserting measurements at around 10:00, then the first inserts after 11:00 will close out all of the buckets, because the latest measurements fall outside the bucket max span.
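The synchronized closure described above can be sketched as follows. This is a simplified model, not the server's actual bucketing code; the 3600-second max span matches the default for seconds granularity, and the open times and metadata values A, B, C are illustrative.

```python
from datetime import datetime, timedelta

BUCKET_MAX_SPAN = timedelta(seconds=3600)  # max span for "seconds" granularity

# Hypothetical open times for buckets A, B, C, all created around 10:00.
open_times = {
    "A": datetime(2025, 1, 1, 10, 0, 5),
    "B": datetime(2025, 1, 1, 10, 0, 12),
    "C": datetime(2025, 1, 1, 10, 0, 20),
}

def must_close(open_time, measurement_time):
    """A bucket closes once a new measurement falls outside its max span."""
    return measurement_time - open_time >= BUCKET_MAX_SPAN

# A measurement arriving just after 11:00 closes all three buckets at once.
t = datetime(2025, 1, 1, 11, 0, 30)
closing = [name for name, opened in open_times.items() if must_close(opened, t)]
print(closing)  # → ['A', 'B', 'C']
```

Because all three open times fall within a few seconds of each other, a single later measurement pushes every bucket past its span simultaneously, which is what produces the spike.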

      As a result of this we may see:

      • Increased bucket catalog activity when allocating a number of new buckets all at once
      • When running with TTL indexes, spikes in CPU usage from TTL monitor activity, since all of these buckets become eligible for deletion at the same time

      One way to mitigate this could be to introduce some degree of randomness into when buckets close. For example, a bucket could close out when we try to insert a measurement at time (bucket start time + (bucketMaxSpan - randomlyGeneratedOffset)).
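The jitter idea could look something like the sketch below. The `Bucket` class, the 300-second cap on the offset, and the decision to draw the offset once at bucket creation are all assumptions made for illustration, not the proposed implementation.

```python
import random
from datetime import datetime, timedelta

BUCKET_MAX_SPAN = timedelta(seconds=3600)   # max span for "seconds" granularity
MAX_JITTER = timedelta(seconds=300)         # hypothetical cap on the random offset

class Bucket:
    """Toy bucket that records a jittered effective span at creation time."""

    def __init__(self, open_time):
        self.open_time = open_time
        # Shrink the span by a random offset so buckets opened together
        # stop closing together: span = bucketMaxSpan - randomlyGeneratedOffset.
        offset = timedelta(seconds=random.uniform(0, MAX_JITTER.total_seconds()))
        self.span = BUCKET_MAX_SPAN - offset

    def must_close(self, measurement_time):
        return measurement_time - self.open_time >= self.span

# Three buckets opened at the same instant now have staggered close times.
buckets = [Bucket(datetime(2025, 1, 1, 10, 0)) for _ in range(3)]
t = datetime(2025, 1, 1, 10, 58)  # 3480s after open, inside the jitter window
print([b.must_close(t) for b in buckets])  # mix of True/False depending on jitter
```

Drawing the offset once per bucket keeps the decision deterministic for that bucket's lifetime; it spreads closures (and subsequent TTL eligibility) across the jitter window instead of concentrating them at the span boundary.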

            Assignee:
            Henrik Edin
            Reporter:
            Damian Wasilewicz