Investigate tests that inconsistently succeed and therefore cause build instability
The motivation is twofold:
- Increase team velocity by reducing the development tax imposed by inconsistently succeeding builds. Although these tests succeed in most variants most of the time, we almost never have a green build (waterfall or patch), which forces developers to investigate nearly every build. Because developers don't do that consistently, new failures often creep in and go unnoticed for days or weeks. Fixing flaky tests will also allow us to get more value out of continuous matrix testing, as that project relies on consistently green builds to be effective.
- Root out any race conditions in the driver that underlie these failures. We investigated one failure in the previous quarter, and its root cause turned out to be a race condition in the driver itself. We're concerned that other failing tests are caused not just by races in the tests but also by races in the driver, which would result in difficult-to-diagnose failures for users in production environments.