[SERVER-56440] Invert tenant migration donor logic for retrying recipient commands Created: 28/Apr/21  Updated: 12/May/21  Resolved: 12/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Jack Mulrow Assignee: Jason Zhang
Resolution: Won't Fix Votes: 0
Labels: pm-1791_non-cloud-blocking, pm-1791_other_required
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Sprint: Sharding 2021-05-17
Participants:

 Description   

Currently, tenant migration donor instances retry commands that fail against the recipient only if they failed with certain error codes, so an unexpected code leads the donor to stop retrying. This should be safe, because the command either fails before the donor has made the decision to commit, in which case the donor can safely abort, or fails during the forget migration logic, which leads the proxy to enter the CLEANUP_FAILED state and notify an operator. For an innocuous error (which I assume an unexpected code is likely to be), however, this can throw away a lot of work or unnecessarily require manual intervention. Instead, the donor should invert its retry logic to always retry, except for certain error codes known to be non-retryable.
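
The difference between the two policies can be sketched roughly as follows. This is only an illustration, not the server's implementation: the error-code names in both sets, the CommandError class, and run_with_retry are all hypothetical, and the attempt cap exists only so the sketch terminates.

```python
from typing import Callable, FrozenSet, TypeVar

T = TypeVar("T")

# Hypothetical code names for illustration only; the ticket does not say
# which codes are currently retried or which would be non-retryable.
RETRYABLE_CODES: FrozenSet[str] = frozenset({"HostUnreachable", "NetworkTimeout"})
NON_RETRYABLE_CODES: FrozenSet[str] = frozenset({"Unauthorized", "BadValue"})


class CommandError(Exception):
    """Hypothetical error raised when a command sent to the recipient fails."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def should_retry_allowlist(code: str) -> bool:
    # Current behaviour as described: retry only on known retryable codes,
    # so any unexpected code stops the retries.
    return code in RETRYABLE_CODES


def should_retry_denylist(code: str) -> bool:
    # Proposed inversion: retry on everything except codes known to be
    # non-retryable, so an unexpected code keeps retrying.
    return code not in NON_RETRYABLE_CODES


def run_with_retry(
    send: Callable[[], T],
    should_retry: Callable[[str], bool],
    max_attempts: int = 5,  # assumption: cap added only to bound the sketch
) -> T:
    """Call send() until it succeeds, the policy refuses, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except CommandError as err:
            if attempt == max_attempts or not should_retry(err.code):
                raise
    raise RuntimeError("unreachable: max_attempts must be at least 1")
```

With a command that fails twice with an unlisted code and then succeeds, should_retry_denylist lets the third attempt go through, whereas should_retry_allowlist gives up after the first failure.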

