[SERVER-64023] Update JS tests' retry logic for Shard Merge Created: 26/Feb/22 Updated: 03/Mar/23 Resolved: 03/Mar/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | [DO NOT USE] Backlog - Server Serverless (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Serverless
|
||||||||
| Participants: | |||||||||
| Description |
|
In tenant_migration_util.js, runTenantMigrationCommand with retryOnRetryableErrors=true retries "not primary" errors on the donor. Shard Merge, however, isn't resilient to donor failover, so retrying after a failover only delays the failure. Let's examine all retry logic in tenant_migration_util.js and tenant_migration_test.js and adapt it for Shard Merge. This should make build failures (BFs) easier to diagnose: for example, if a Shard Merge fails because of donor failover, the test should fail right away and log a "not primary" error instead of timing out after many minutes. |
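
The fail-fast behavior described above could look roughly like the sketch below. It is not the code in tenant_migration_util.js: runWithRetry, isShardMerge, maxAttempts, and the shape of the command result are all hypothetical names chosen for illustration, and the set of "not primary" error codes is an assumed, incomplete list.

```javascript
// Hypothetical sketch only; none of these names come from tenant_migration_util.js.
// Assumed subset of "not primary" error codes (NotWritablePrimary, NotPrimaryNoSecondaryOk).
const NOT_PRIMARY_CODES = new Set([10107, 13435]);

// runCommand is any function returning a result shaped like {ok: 1} or {ok: 0, code: <number>}.
function runWithRetry(
    runCommand,
    {retryOnRetryableErrors = false, isShardMerge = false, maxAttempts = 5} = {}) {
    for (let attempt = 1;; attempt++) {
        const res = runCommand();
        if (res.ok === 1) {
            return res;
        }

        const notPrimary = NOT_PRIMARY_CODES.has(res.code);

        // Shard Merge cannot survive donor failover, so surface the error immediately
        // instead of retrying until the test times out.
        if (notPrimary && isShardMerge) {
            throw new Error(`not primary (code ${res.code}) during Shard Merge; not retrying`);
        }

        // Give up on non-retryable errors, when retries are disabled, or when exhausted.
        if (!(retryOnRetryableErrors && notPrimary) || attempt >= maxAttempts) {
            return res;
        }
    }
}
```

With this shape, a test that opts into retries keeps the existing behavior for other migration protocols, while a Shard Merge test fails on the first "not primary" response with a message that names the cause.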