Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 8.2.0-rc0
Component/s: Replication
There is very little visibility into the aggregate behavior and operation of logical initial syncs today. Across a large number of clusters, it is very difficult to answer:
- What is the normalized throughput of a logical initial sync?
- What is the success rate of logical initial syncs?
- What is the typical duration of an initial sync?
- In what phase of a logical initial sync do most failures occur?
- How many logical initial syncs are there on a given day?
To assist in answering these questions and others we should capture relevant metrics. We can take inspiration from resharding metrics.
When complete, it should be possible to build a funnel chart/diagram detailing clusters' progress through the end-to-end logical initial sync process, along with charts detailing the performance of logical initial syncs.
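As a rough illustration of how the questions above might be answered once per-attempt metrics exist, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SyncAttempt` record, its field names, the phase names in `PHASES`, and the choice of bytes per second as the "normalized throughput" are hypothetical and do not reflect the server's actual state names or any existing metrics schema.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import median

# Hypothetical, ordered phase names used only for this sketch;
# they are NOT the server's actual initial sync state names.
PHASES = ["started", "cloning", "applying_oplog", "completed"]


@dataclass
class SyncAttempt:
    """One logical initial sync attempt (hypothetical record shape)."""
    cluster_id: str
    last_phase: str       # furthest phase the attempt reached
    succeeded: bool
    duration_secs: float
    bytes_copied: int


def summarize(attempts):
    """Aggregate attempts into success rate, duration, throughput, and funnel counts."""
    successes = [a for a in attempts if a.succeeded]

    # Funnel: how many attempts reached at least each phase.
    funnel = {
        phase: sum(1 for a in attempts if PHASES.index(a.last_phase) >= i)
        for i, phase in enumerate(PHASES)
    }

    # For failed attempts, the furthest phase reached is where the failure occurred.
    failures_by_phase = Counter(a.last_phase for a in attempts if not a.succeeded)

    total_secs = sum(a.duration_secs for a in successes)
    total_bytes = sum(a.bytes_copied for a in successes)

    return {
        "attempts": len(attempts),
        "success_rate": len(successes) / len(attempts) if attempts else None,
        "median_duration_secs": (
            median(a.duration_secs for a in successes) if successes else None
        ),
        # Assumed definition of normalized throughput: bytes copied per second.
        "throughput_bytes_per_sec": total_bytes / total_secs if total_secs else None,
        "funnel": funnel,
        "failures_by_phase": dict(failures_by_phase),
    }
```

Grouping the same records by calendar day before calling `summarize` would give the per-day counts, and the `funnel` dictionary maps directly onto the stages of a funnel chart.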