Type: Task
Resolution: Unresolved
Priority: Major - P3
None
Affects Version/s: None
Component/s: Statistics
Storage Engines, Storage Engines - Persistence
SE Persistence backlog
None
As described in WT-14659, we can return the checkpoint size as the storage size of the underlying file through stats:
The server populates the storageSize reported in collStats using WT's block_size statistic for a table.
With attached storage, this size is simply the size of the underlying file. With disaggregated storage we don't have an underlying file, so we need a different way to provide this information.
We should also continue to return this value via the block_size statistic, as this won't require changes in the server code.
Skimming the disagg code, it looks like we never set block_size in the disagg block manager, so I assume we currently return zero for this value. The work in this ticket is therefore to connect that statistic to the current checkpoint size and to update it on every checkpoint. This work doesn't depend on WT-14546, but the results won't be accurate until that ticket is completed.
Note: In the current implementation, statistics=(size) is fast-pathed to return the file size without actually opening the corresponding dhandle. My understanding is that we introduced this optimization (which kind of bypasses the block-manager abstraction) because the server requests this information frequently. We should be able to provide a similar optimization with disagg, as the size of the most recent checkpoint is in the checkpoint= entry of a table's metadata. So we can retrieve it without opening a dhandle.
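To make the fast path concrete, here is a minimal sketch of pulling the most recent checkpoint's size out of a metadata value without touching a dhandle. It is not WiredTiger code: the function name `latest_checkpoint_size`, the sample string, and the assumption that each checkpoint entry carries numeric `order=` and `size=` fields are illustrative only, and the real metadata layout should be checked against the source before relying on it.

```python
import re

# Matches one "name=(fields)" group whose fields contain no nested parentheses.
# In the assumed layout, each checkpoint entry has this shape.
_ENTRY = re.compile(r'([\w.]+)=\(([^()]*)\)')


def _field(fields, name):
    """Return the integer value of `name=<digits>` in a comma-separated field list."""
    m = re.search(r'(?:^|,)' + re.escape(name) + r'=(\d+)', fields)
    return int(m.group(1)) if m else None


def latest_checkpoint_size(metadata_value):
    """Return the size of the checkpoint with the highest order, or 0 if none exists.

    Hypothetical helper: assumes every checkpoint entry has `order=` and `size=`.
    Groups without both fields (for example other nested config) are skipped.
    """
    best_order, best_size = -1, 0
    for entry in _ENTRY.finditer(metadata_value):
        fields = entry.group(2)
        order = _field(fields, "order")
        size = _field(fields, "size")
        if order is None or size is None:
            continue
        if order > best_order:
            best_order, best_size = order, size
    return best_size


# Illustrative metadata value, not copied from a real database.
SAMPLE = (
    'encryption=(keyid=,name=),'
    'checkpoint=('
    'WiredTigerCheckpoint.1=(addr="0180",order=1,time=1700000000,size=8192,write_gen=3),'
    'WiredTigerCheckpoint.2=(addr="0181",order=2,time=1700000060,size=12288,write_gen=5))'
)

print(latest_checkpoint_size(SAMPLE))  # prints 12288
```

Returning 0 when no checkpoint entry is present mirrors the value we appear to report for disagg tables today, so callers would see no behaviour change for a table that has never been checkpointed.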
is related to: WT-14546 Account for page deltas in WT checkpoint size statistic (Open)