PriorityTicketHolder doesn't track operations that requeue after 500 milliseconds


    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • 3

      It could be interesting to track the number of operations that time out at 500 milliseconds in a queue, wake up, and requeue for a ticket.

      Motivation: It could provide insight into which conditions cause operations to get stuck in the queue, and into the side effects on latency and throughput when operations must wake up to requeue.

      Example: Suppose the 50th percentile latency is ~500 milliseconds. Do we see higher tail latencies than expected? Should we reconsider the 500-millisecond timeout?

      Right now, we measure the cumulative number of operations queued in the PriorityTicketHolder at the TicketHolderWithQueueingStats level. This means the count does not take into account operations that must requeue.

            Assignee:
            [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            Haley Connelly
            Votes:
            0
            Watchers:
            3
