- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Checkpoints
- None
- Storage Engines - Persistence
- SE Persistence backlog
- None
Add checkpoint progress estimation information to the checkpoint progress logs. This could be emitted as a separate log message or incorporated into the existing one.
Checkpoint work happens at two levels of granularity: the connection level (the checkpoint as a whole) and the b-tree level (the checkpoint of each table).
Some useful existing metrics we could leverage are:
Connection (__wt_cache): bytes_dirty_intl, bytes_dirty_leaf, pages_dirty_intl, pages_dirty_leaf
B-tree: bytes_dirty_intl, bytes_dirty_leaf, bytes_inmem, bytes_internal, bytes_updates
These values would need to be saved at the start of the checkpoint so that they reflect the state of the system at the time of the checkpoint snapshot. We could then track the number of pages written against the total number of dirty pages at both the connection and b-tree levels.
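The snapshot-and-track idea could be sketched roughly as follows. This is only an illustration: the struct ckpt_progress and its helper functions are hypothetical names, not existing WiredTiger code, and the sketch assumes the total is the sum of pages_dirty_intl and pages_dirty_leaf captured when the checkpoint starts. The same structure could be kept once per connection and once per b-tree.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical tracker; one instance per connection or per b-tree. */
struct ckpt_progress {
    uint64_t pages_dirty_at_start; /* Snapshot taken when the checkpoint begins. */
    uint64_t pages_written;        /* Pages the checkpoint has written so far. */
};

/* Save the dirty page counts as they stand at the start of the checkpoint. */
static void
ckpt_progress_init(struct ckpt_progress *p, uint64_t pages_dirty_intl, uint64_t pages_dirty_leaf)
{
    p->pages_dirty_at_start = pages_dirty_intl + pages_dirty_leaf;
    p->pages_written = 0;
}

/* Call once for every page the checkpoint writes. */
static void
ckpt_progress_page_written(struct ckpt_progress *p)
{
    ++p->pages_written;
}

/*
 * Percentage complete in the range 0..100. The result is clamped because the
 * number of pages actually written may differ from the snapshot total.
 */
static unsigned
ckpt_progress_percent(const struct ckpt_progress *p)
{
    if (p->pages_dirty_at_start == 0 || p->pages_written >= p->pages_dirty_at_start)
        return (100);
    return ((unsigned)(p->pages_written * 100 / p->pages_dirty_at_start));
}

/* Format a progress line; "scope" is a free-form label such as a table name. */
static int
ckpt_progress_format(const struct ckpt_progress *p, const char *scope, char *buf, size_t len)
{
    return (snprintf(buf, len, "%s checkpoint progress: %" PRIu64 "/%" PRIu64 " pages (%u%%)",
      scope, p->pages_written, p->pages_dirty_at_start, ckpt_progress_percent(p)));
}
```

Counting pages rather than bytes keeps the arithmetic simple; a byte-based variant using bytes_dirty_intl and bytes_dirty_leaf would follow the same shape.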
Another suggestion is to do something similar to compact: provide an estimate once 1000 pages have been written, based on the work done for those first 1000 pages.
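A minimal sketch of that sampling approach is shown here. The function name, the millisecond units, and the linear extrapolation are assumptions made for illustration; this does not reproduce how compact actually computes its estimate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sample size suggested in this ticket. */
#define CKPT_ESTIMATE_SAMPLE_PAGES 1000

/*
 * Hypothetical helper: extrapolate the remaining checkpoint time from the
 * average cost per page observed so far. Returns false until the sample
 * size has been reached, in which case *remaining_ms is left untouched.
 */
static bool
ckpt_estimate_remaining_ms(
  uint64_t pages_written, uint64_t pages_total, uint64_t elapsed_ms, uint64_t *remaining_ms)
{
    uint64_t pages_left;

    if (pages_written < CKPT_ESTIMATE_SAMPLE_PAGES)
        return (false);

    /* Guard against writing more pages than the total saved at the start. */
    pages_left = pages_total > pages_written ? pages_total - pages_written : 0;
    *remaining_ms = pages_left * elapsed_ms / pages_written;
    return (true);
}
```

Calling this once, when the written-page counter reaches 1000, bases the estimate on exactly the first 1000 pages as described above. Calling it again later would refine the estimate using the running average instead.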