See the PM-2524 technical design doc. This is similar to the work done in WT-9910. Convert test/format to do predictable replay, using and enhancing the common support in test_util. Add an Evergreen job for it. Add a temporary bug to test predictability, as in WT-9910.
As part of this work, have the checkpoint thread add "flush_tier=(enabled)" to its checkpoint configuration at configurable intervals. Whether this is refactored into test_util calls that run a checkpoint thread will be determined as part of this effort.
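
A minimal sketch of the interval logic, for discussion only. The helper names (ckpt_wants_flush, ckpt_config), the 1-based checkpoint counter, and the "interval of 0 disables flushing" convention are all assumptions made for illustration; none of them exist in test/format or test_util today.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of test/format or test_util): decide whether
 * the Nth checkpoint (counting from 1) should also flush to tiered storage.
 * An interval of 0 is assumed to mean "never flush".
 */
static bool
ckpt_wants_flush(uint64_t ckpt_count, uint64_t flush_interval)
{
    return (flush_interval != 0 && ckpt_count % flush_interval == 0);
}

/*
 * Hypothetical helper: build the configuration string for the Nth checkpoint.
 * Returns buf, holding either "flush_tier=(enabled)" or an empty string.
 */
static const char *
ckpt_config(uint64_t ckpt_count, uint64_t flush_interval, char *buf, size_t len)
{
    snprintf(buf, len, "%s",
      ckpt_wants_flush(ckpt_count, flush_interval) ? "flush_tier=(enabled)" : "");
    return (buf);
}
```

The checkpoint thread would increment its counter before each checkpoint and pass the returned string as the checkpoint configuration. Deriving the decision from the counter alone, rather than from wall-clock time, keeps the flush schedule the same across runs, which fits the predictable replay goal.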