Capture Detailed Logical Initial Sync Metrics


    • Replication
    • None
    • 3
    • TBD

      There is very little visibility into the aggregate behavior and operation of logical initial syncs today. Across a large number of clusters, it is very difficult to answer questions such as:

      • What is the normalized throughput of a logical initial sync?
      • What is the success rate of logical initial syncs?
      • What is the typical duration of an initial sync?
      • In which phase of a logical initial sync do most failures occur?
      • How many logical initial syncs are there on a given day?

      To assist in answering these questions and others, we should capture relevant metrics. We can take inspiration from resharding metrics.

      When complete, it should be possible to build a funnel chart/diagram detailing clusters' progress through the end-to-end logical initial sync process, as well as charts detailing the performance of logical initial syncs.
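
      As a rough illustration of how such metrics could answer the questions above, here is a minimal Python sketch that aggregates one record per initial sync attempt. Everything in it is an assumption made for illustration: the record fields, the phase names in PHASES, and the reading of "normalized throughput" as bytes copied per second are hypothetical and do not reflect any existing server metric or schema.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional

# Hypothetical, ordered phase names; real phase names would come from the server.
PHASES = ["started", "cloning", "oplog_application", "completed"]


@dataclass
class InitialSyncRecord:
    """One logical initial sync attempt (all fields are illustrative)."""
    cluster_id: str
    started_on: str       # ISO date, e.g. "2024-01-01"
    bytes_copied: int
    duration_secs: float
    last_phase: str       # last phase the attempt reached; must be in PHASES
    succeeded: bool


def _median_or_none(values: List[float]) -> Optional[float]:
    # statistics.median raises on empty input, so guard against it.
    return median(values) if values else None


def summarize(records: List[InitialSyncRecord]) -> Dict[str, object]:
    """Aggregate attempts into the figures the ticket asks about."""
    successes = [r for r in records if r.succeeded]
    # Throughput here is bytes per second for successful attempts (an assumption).
    throughputs = [
        r.bytes_copied / r.duration_secs for r in successes if r.duration_secs > 0
    ]
    # Funnel: number of attempts that reached at least each phase.
    funnel = {
        phase: sum(1 for r in records if PHASES.index(r.last_phase) >= i)
        for i, phase in enumerate(PHASES)
    }
    return {
        "attempts": len(records),
        "success_rate": len(successes) / len(records) if records else None,
        "median_duration_secs": _median_or_none([r.duration_secs for r in successes]),
        "median_throughput_bytes_per_sec": _median_or_none(throughputs),
        "failures_by_phase": dict(
            Counter(r.last_phase for r in records if not r.succeeded)
        ),
        "attempts_per_day": dict(Counter(r.started_on for r in records)),
        "funnel": funnel,
    }
```

      The funnel counts are cumulative, so each phase's count can be plotted directly as one stage of a funnel chart, and failures_by_phase shows where attempts dropped out.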

              Assignee:
              Unassigned
              Reporter:
              Matt Panton
              Votes:
              0
              Watchers:
              5

                Created:
                Updated: