- Type: Epic
- Resolution: Done
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Checkpoints
- Storage Engines - Persistence
- 3,114.925
- SE Persistence backlog
- None
Improve checkpoint observability
The work done in WT-17280 should give us an idea of what we need to do next to better understand poorly performing checkpoints. This is a placeholder to capture the work that needs to be done. Examples of questions we should ask ourselves:
- Do we need to add more checkpoint states?
- Do we need to backport WT-12657 to older WT versions so we can remove the checkpoint cleanup stage from checkpoint?
- Are we missing logs?
- Are we handling errors correctly and printing a relevant message when hitting these paths?
- is related to
- WT-16700 Create a dashboard to track checkpoint cleanup performance (In Progress)
- WT-16788 Create dashboard(s) to find out which checkpoint states mostly impact long checkpoints (Closed)
- WT-17280 Observe checkpoint perf in the fleet (Open)
- WT-12657 Add checkpoint cleanup utility thread (Closed)