- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Compaction
- None
- Storage Engines
- (copied to CRM)
- 3
- c(3x10^8)-StorEng - 2023-11-14, 2023-11-28 - Anthill Tiger
Summary
Each compaction pass on a file performs three checkpoints, and currently all three have force=1 set. Only the last of the three requires it. We shouldn't force the first two checkpoints, as forcing them can create unnecessary work in the system.
Motivation
As described in session_compact.c, each compact pass on a file performs three checkpoints:
- Before compacting, we checkpoint the file to flush any dirty data. This makes any space freed by updating those blocks available for compact to use.
- After compacting, we perform a checkpoint to persist the changes made by compact.
- A final checkpoint, performed immediately after the preceding one, ensures that space no longer used at the end of the file shows up in the available extent list, allowing a subsequent file truncation.
In the 1st checkpoint, if there isn't any dirty data, there is no benefit from performing the checkpoint. The same holds for the 2nd checkpoint: if compaction did any work, the tree will be dirty and the checkpoint will happen without being forced, and if compaction did nothing, there is nothing to persist. So we should only need to force the last checkpoint.
The specific goal here is to avoid the 1st checkpoint if the tree is clean, as a forced checkpoint on a large (TB+) clean file can take a long time.
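The reasoning above can be sketched as a toy model in C. This is only an illustration under stated assumptions: `toy_file`, `toy_checkpoint`, `toy_compact` and `toy_compact_pass` are hypothetical names, not WiredTiger functions, and the model assumes a non-forced checkpoint of a clean file is skipped entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a file's state; not a WiredTiger structure. */
struct toy_file {
    bool dirty;          /* true if the tree has unflushed modifications */
    int checkpoints_run; /* number of checkpoints that actually did work */
};

/*
 * Toy checkpoint: assumed to be skipped when the file is clean and the
 * checkpoint is not forced. Returns true if the checkpoint ran.
 */
static bool
toy_checkpoint(struct toy_file *f, bool force)
{
    assert(f != NULL);
    if (!force && !f->dirty)
        return (false);
    f->dirty = false;
    f->checkpoints_run++;
    return (true);
}

/* Toy compaction: the tree only becomes dirty if blocks were moved. */
static void
toy_compact(struct toy_file *f, bool moved_blocks)
{
    assert(f != NULL);
    if (moved_blocks)
        f->dirty = true;
}

/*
 * One compact pass under the proposed policy: only the final checkpoint
 * is forced. Returns how many of the three checkpoints actually ran.
 */
static int
toy_compact_pass(struct toy_file *f, bool moved_blocks)
{
    int before;

    assert(f != NULL);
    before = f->checkpoints_run;

    (void)toy_checkpoint(f, false); /* 1st: flush dirty data, if any */
    toy_compact(f, moved_blocks);
    (void)toy_checkpoint(f, false); /* 2nd: persist compaction's changes, if any */
    (void)toy_checkpoint(f, true);  /* 3rd: forced, so freed space reaches the extent list */

    return (f->checkpoints_run - before);
}
```

In this model, a pass over a clean file where compaction moves nothing runs only the forced final checkpoint, while a dirty file that compaction also rewrites still runs all three. Whether real compaction behaves the same way is exactly what the NOTE in this ticket asks to verify.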
NOTE
The work here should also verify that the reasoning above is correct – that compact still works and that there aren't any unexpected downsides.
- related to: WT-11945 Provide option force a subset of files during system-wide checkpoint (Open)