Investigate whether we can track how often a subsystem fails

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Statistics
    • Storage Engines - Persistence

      The idea is to investigate whether we can track how often a subsystem fails.

      We do have specific stats (e.g., cache_evict_split_failed_lock, cache_eviction_blocked_multi_block_reconciliation_during_checkpoint) for when a function fails at a very specific location in the code. However, can we track how many failures occur in a subsystem as a whole? This could help us know whether a subsystem is fragile, that is, whether it hits errors often but handles them gracefully. It would also help us track down when an issue occurred and its time to detection.
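
      As a rough illustration of the idea only, the sketch below keeps one failure counter per subsystem instead of one per call site. Everything in it is hypothetical: the subsystem names, the FailureStats class, and its record_failure/failures methods are assumptions made for this example and do not correspond to existing code or statistics.

```python
from collections import Counter
from enum import Enum


class Subsystem(Enum):
    # Hypothetical subsystem names, chosen for illustration only.
    EVICTION = "eviction"
    CHECKPOINT = "checkpoint"
    RECONCILIATION = "reconciliation"


class FailureStats:
    """Counts failures per subsystem, regardless of the failing call site."""

    def __init__(self):
        self._counts = Counter()

    def record_failure(self, subsystem):
        # Called from any error path that belongs to the given subsystem.
        self._counts[subsystem] += 1

    def failures(self, subsystem):
        # A subsystem with no recorded failures reports zero.
        return self._counts[subsystem]


stats = FailureStats()
stats.record_failure(Subsystem.EVICTION)
stats.record_failure(Subsystem.EVICTION)
stats.record_failure(Subsystem.CHECKPOINT)
```

      Under this assumed design, every error path in a subsystem would feed the same counter, so a subsystem that fails often but recovers gracefully would still show up as a high count.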

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Etienne Petrel
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: