Details
Type: Task
Resolution: Fixed
Priority: Major - P3
Description
Hi Team,
We have seen a number of questions around the behaviour of WiredTiger block management, checkpoints, and fragmentation.
From the WiredTiger documentation:
By default, when file blocks are being reused, WiredTiger attempts to avoid file fragmentation by selecting the smallest available block rather than splitting a larger available block into two. The block_allocation configuration string to WT_SESSION::create can be set to first to change the algorithm to first-fit, that is, take the first available block in the file. Applications where file size is more of an issue than file fragmentation (for example, applications with fixed-size blocks) might want to configure this way.
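For reference, this setting is passed through the WT_SESSION::create configuration string. Below is a minimal sketch using the WiredTiger Python bindings; the home directory and table names are illustrative, not taken from any customer setup.

```python
# Minimal sketch of the block_allocation setting, using the WiredTiger
# Python bindings. "WT_HOME" and the table names are hypothetical.
from wiredtiger import wiredtiger_open

conn = wiredtiger_open("WT_HOME", "create")
session = conn.open_session()

# Default behaviour: best-fit allocation, the smallest suitable free block
# is reused rather than splitting a larger one.
session.create("table:best_fit_example",
               "key_format=S,value_format=S,block_allocation=best")

# First-fit allocation: take the first available block in the file, which
# favours a smaller file size over lower fragmentation.
session.create("table:first_fit_example",
               "key_format=S,value_format=S,block_allocation=first")

conn.close()
```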
What is not clear from our documentation is that after massive deletes, the freed space may not be released from the operating system's perspective. The WiredTiger blocks need to be relocated (i.e. the customer still needs to run compact) so that enough free space sits at the end of the file for it to be truncated, which is what makes the operating system see the space as available. Until then, the space is only available internally and is reflected in the "bytes available for reuse" field within the collection stats.
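As a hypothetical illustration of where this shows up on the MongoDB side (database and collection names here are assumptions), the internally reusable space can be read from collStats, and only compact rewrites the collection so the space can be returned to the operating system:

```python
# Minimal sketch using pymongo; "mydb" and "mycoll" are illustrative names.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]

stats = db.command("collStats", "mycoll")
reusable = stats["wiredTiger"]["block-manager"]["file bytes available for reuse"]
file_size = stats["storageSize"]
print(f"storageSize: {file_size} bytes, reusable inside the file: {reusable} bytes")

# Until compact is run, the reusable bytes above are only available to
# WiredTiger itself, not to the operating system.
db.command("compact", "mycoll")
```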
This is causing confusion among customers. It would be helpful to add a clarification on this to the WT Storage FAQ. It may also be beneficial to link to this information from the WT Storage Engine page and the add shard page, as the behaviour is similar following a large-scale chunk migration.
Thanks
Barry