- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Build
- Fully Compatible
Many points in a Bazel build can be affected by transient network failures. To avoid interrupting developers during the initial migration, add retries around all Bazel invocations in the SCons integration layer. This should keep users happy while also reducing interrupt work for us in the short term.
The drawbacks should be minimal. If a build fails for a non-transient reason, the retry will very rarely take long to fail again, since the build up to the point of failure will be cached. The error log will be slightly longer, but the final failure will still be at the bottom.
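As a rough illustration of the retry behavior described above, here is a minimal sketch in Python. It is not the actual integration-layer code: the function name `run_with_retries`, the default of three attempts, and the `delay_seconds` and `run` parameters are all assumptions made for this example.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, max_attempts=3, delay_seconds=0.0, run=subprocess.run):
    """Run cmd, retrying on a nonzero exit code (hypothetical helper).

    Returns the CompletedProcess of the last attempt, so the caller still
    sees the final failure if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        # Output is not captured, so each attempt's log is printed in order
        # and the final failure ends up at the bottom.
        result = run(cmd)
        if result.returncode == 0:
            return result
        print(
            f"attempt {attempt}/{max_attempts} failed "
            f"with exit code {result.returncode}",
            file=sys.stderr,
        )
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return result
```

A caller could wrap an invocation as `run_with_retries(["bazel", "build", "//some:target"])`, where the target label is a placeholder. Because the work done before the failure is cached, a retry of a genuinely broken build should fail again quickly.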
Longer term, we may want to revisit this once further transient-failure mitigations are in place. That said, because a transient failure that fails a build is costly and retrying is cheap, we may want to keep this approach indefinitely.
This also sets us up to enable https://jira.mongodb.org/browse/SERVER-87423, since it establishes the retry logic in the integration layer.