[SERVER-63333] Attach retryable error label to TemporarilyUnavailable error code in a Serverless environment Created: 07/Feb/22 Updated: 23/May/23 Resolved: 23/May/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Storage Execution
|
||||||||
| Participants: | |||||||||
| Description |
|
When running in a Serverless environment, we want to dynamically attach a RetriableError label to errors with the TemporarilyUnavailable error code, under the assumption that the higher layers will throttle themselves. |
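To make the proposal concrete, here is a minimal sketch of the labeling rule, not the server's implementation. It assumes error responses are plain dicts with `codeName` and `errorLabels` fields, and it uses the label name RetryableWriteError mentioned in the comments on this ticket. The function name and the `is_serverless` flag are hypothetical.

```python
from typing import Any, Dict

# Assumed label name, taken from the discussion in this ticket's comments.
RETRYABLE_LABEL = "RetryableWriteError"


def attach_retryable_label(error: Dict[str, Any], is_serverless: bool) -> Dict[str, Any]:
    """Return a copy of `error`, labeled as retryable when the rule applies.

    Hypothetical helper: the label is added only in a Serverless
    environment and only for the TemporarilyUnavailable error code.
    """
    result = dict(error)
    labels = list(error.get("errorLabels", []))
    if (
        is_serverless
        and error.get("codeName") == "TemporarilyUnavailable"
        and RETRYABLE_LABEL not in labels
    ):
        labels.append(RETRYABLE_LABEL)
    if labels:
        result["errorLabels"] = labels
    return result
```

Outside a Serverless environment, or for any other error code, the response is returned unchanged, so existing client behavior is unaffected.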
| Comments |
| Comment by Louis Williams [ 23/May/23 ] |
|
Closing in favor of implementing a more comprehensive load-shedding strategy in the server. |
| Comment by Louis Williams [ 03/Mar/22 ] |
|
Pausing work on this until we determine whether |
| Comment by Esha Maharishi (Inactive) [ 24/Feb/22 ] |
|
louis.williams, this matches my understanding. We acknowledged that by attaching RetryableWriteError, drivers will only be able to retry retryable writes, not regular writes, in two places:
From the Slack conversation a while ago, where we agreed to do (1) for now, then likely eventually (2):
From the Alternative to transactions larger than cache doc:
matt.broadstone, thanks for documenting the options for improving driver retries. Just a note: I think there was general interest in having Atlas Proxy/mongos/mongoq do the retries in the long term, so that we don't have to update all drivers (and users don't have to upgrade their drivers). |