I received actual database backups. Running the backup analysis tool on real data revealed a few things that could be improved or fixed.
Some potential improvements:
- In terse mode, only show files with differences. Skip files with no difference.
- The calculation of percentage of blocks that differ by < 20% of the block or > 80% of the block seems suspect.
  - For most files, 100% of the differing blocks differ by < 20%, as shown in the terse output for each file.
  - But the summary by file type (collections, indexes, etc.) shows that only 15-22% of blocks differ by < 20%.
  - Either the calculation is wrong, or the output is misleading and hard to understand.
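
One possible explanation for the mismatch, not confirmed against the tool's code, is that the per-file figure and the per-type summary use different denominators: differing blocks in one case, all blocks in the other. The sketch below shows how that alone could produce numbers like the ones above. `FileStats`, its fields, and the sample counts are hypothetical and are not taken from the tool or from the real backups.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    # Hypothetical per-file counters; the real tool's names are unknown.
    total_blocks: int       # all blocks in the file
    differing_blocks: int   # blocks that differ at all
    small_diff_blocks: int  # differing blocks that differ by < 20% of the block


def pct(numerator: int, denominator: int) -> float:
    """Percentage helper that returns 0.0 for an empty denominator."""
    return 100.0 * numerator / denominator if denominator else 0.0


def per_file_pct(f: FileStats) -> float:
    # Share of this file's DIFFERING blocks that differ by < 20%.
    return pct(f.small_diff_blocks, f.differing_blocks)


def summary_pct_of_differing(files: List[FileStats]) -> float:
    # Summary with the same denominator as the per-file figure.
    return pct(sum(f.small_diff_blocks for f in files),
               sum(f.differing_blocks for f in files))


def summary_pct_of_total(files: List[FileStats]) -> float:
    # Summary with ALL blocks as the denominator (assumed source of the mismatch).
    return pct(sum(f.small_diff_blocks for f in files),
               sum(f.total_blocks for f in files))


# Made-up sample: every differing block differs by < 20%.
files = [FileStats(100, 20, 20), FileStats(200, 30, 30)]

for f in files:
    print(f"per file: {per_file_pct(f):.1f}%")
print(f"summary (of differing blocks): {summary_pct_of_differing(files):.1f}%")
print(f"summary (of all blocks): {summary_pct_of_total(files):.1f}%")
```

With this sample, each file reports 100% while the all-blocks summary reports about 16.7%. If the tool does something like this, labeling the denominator in the output (or using the same one in both places) would remove the confusion. If it does not, the summary calculation itself needs a closer look.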