We are seeing several different failure modes in stress testing, all related to the cache being pinned full, which leaves format in a state where it can't make forward progress. The issue shows up as one of two symptoms:
eviction-server: Cache stuck for too long, giving up: Connection timed out
format run exceeded 15 minutes past the maximum time, aborting the process.
The cache can be stuck full of either clean or dirty pages. We should characterise the different failures currently being seen, then put a plan in place to fix the bugs and otherwise stop our stress testing from encountering stuck-cache conditions.
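
As a starting point for characterising the failures, a small script along these lines could tally how often each symptom appears in collected stress-test logs. This is only a sketch: the labels `eviction_stuck` and `format_timeout` and the helper names are hypothetical, and the patterns are based solely on the two messages quoted above. It separates failures by symptom only; it cannot tell whether the cache was full of clean or dirty pages, which would need cache statistics from the failing run.

```python
import re
from collections import Counter

# Patterns derived from the two messages quoted in this ticket.
# The labels are hypothetical names chosen for this sketch.
SYMPTOMS = {
    "eviction_stuck": re.compile(r"Cache stuck for too long, giving up"),
    "format_timeout": re.compile(
        r"format run exceeded \d+ minutes past the maximum time"
    ),
}


def classify_line(line):
    """Return the symptom label for a log line, or None if it matches neither."""
    for label, pattern in SYMPTOMS.items():
        if pattern.search(line):
            return label
    return None


def tally(lines):
    """Count how many log lines show each known symptom."""
    counts = Counter()
    for line in lines:
        label = classify_line(line)
        if label is not None:
            counts[label] += 1
    return counts
```

Passing the lines of each failed run's log to `tally` gives a per-symptom count that can be grouped by test configuration to see which setups hit which failure.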