- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Fully Compatible
Normally Bazel attempts the remote asset API first, which routes the request through EngFlow's (expensive) elastic load balancer (ELB). With --experimental_remote_downloader_local_fallback (a standard Bazel flag), Bazel falls back to the backing HTTP URL if the remote asset API call fails. However, this doesn't reduce cost, since the remote asset API is still hit first and contributes to ELB traffic.
This PR adds an option to reverse the order of the attempts, such that the backing URL is attempted first, and the remote asset URL is attempted second.
Originally I thought this wouldn't be possible since the remote asset API is a write-through cache, so if the remote asset API is only hit on failures, the cache won't be warmed up frequently enough to actually provide redundancy.
The hack here is to keep the remote asset API's cache populated using only a fraction of the traffic we currently send it: a small percentage of requests are routed to the remote asset API on the first attempt, and the rest go to the backing URL by default.
This will keep the cache populated while dramatically reducing traffic transmitted through the ELB.
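As a rough illustration of the ordering logic described above, here is a minimal Python sketch. It is not the actual Bazel patch (Bazel is written in Java). The function names, the `remote_first_fraction` parameter, and the "remote"/"direct" labels are all hypothetical, and the sampling percentage is an assumption since none is stated here.

```python
import random
from typing import Callable, Dict, List

# A downloader takes a URL and returns the bytes, or raises OSError on failure.
Downloader = Callable[[str], bytes]


def attempt_order(reverse: bool, remote_first_fraction: float,
                  rng: random.Random) -> List[str]:
    """Return the order in which the two download paths are tried.

    reverse: whether the reversed order is enabled (hypothetical stand-in
        for REVERSE_REMOTE_API_ATTEMPT_ORDER=1).
    remote_first_fraction: assumed share of requests that still go to the
        remote asset API first, to keep its cache warm.
    """
    if not reverse:
        return ["remote", "direct"]
    # Send a small sample through the remote asset API first so its
    # write-through cache stays populated.
    if rng.random() < remote_first_fraction:
        return ["remote", "direct"]
    return ["direct", "remote"]


def fetch(url: str, downloaders: Dict[str, Downloader],
          order: List[str]) -> bytes:
    """Try each download path in order and return the first success."""
    last_error = None
    for name in order:
        try:
            return downloaders[name](url)
        except OSError as err:
            last_error = err  # remember the failure and try the next path
    if last_error is None:
        raise ValueError("no download paths were given")
    raise last_error
```

With `remote_first_fraction` set to 0.05, for example, roughly one request in twenty would still hit the remote asset API first, while the rest would only reach it when the direct download fails.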
Test with a broken http_file link:
> Default order — remote downloader first:
> WARNING: Remote Cache: NOT_FOUND: Fetching URL ... with curl failed
> WARNING: Download from https://... failed: 404 Not Found
>
> Reversed order — direct URL first:
> WARNING: Local download failed: ... trying remote downloader
> WARNING: Remote Cache: NOT_FOUND: ...
>
> The "trying remote downloader" message only appears with REVERSE_REMOTE_API_ATTEMPT_ORDER=1, confirming that the custom Bazel patch recognizes this env var and flips the attempt order.