[SERVER-75074] Reduce the number of retries when reopening buckets Created: 20/Mar/23 Updated: 14/Apr/23 Resolved: 14/Apr/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Fausto Leyva (Inactive) | Assignee: | Fausto Leyva (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: | Storage Execution |
||||||||
| Sprint: | Execution Team 2023-05-01 | ||||||||
| Participants: | |||||||||
| Description |
|
In the time-series (TS) scalability project, we introduced bucket reopening. If we fetch a bucket that might be stale, we re-fetch it, and there is currently no bound on how many times we do so. This is especially dangerous if we continuously receive WriteConflict errors. See [write_ops_exec.cpp:insertIntoBucketCatalog].
My suggestion is to cap the number of retries at a small value such as 3. The value is somewhat arbitrary, but I think retrying a few times before inserting into a new bucket could still be beneficial. Otherwise, we should consider not returning WriteConflicts during the reopening process and instead inserting into a new bucket after a single reopening failure.
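A rough sketch of what a bounded retry could look like is below. It is not the actual code in write_ops_exec.cpp: `ReopenResult`, `InsertTarget`, `kMaxReopenAttempts`, and `chooseInsertTarget` are hypothetical names made up for illustration, and the reopening attempt is modeled as an opaque callback. It assumes a WriteConflict is the only outcome worth retrying.

```cpp
#include <functional>

// Hypothetical outcome of a single reopening attempt (not the server's real types).
enum class ReopenResult { kSuccess, kWriteConflict, kNotFound };

// Where the measurement ends up being inserted.
enum class InsertTarget { kReopenedBucket, kNewBucket };

// The cap proposed in this ticket; the value 3 is admittedly arbitrary.
constexpr int kMaxReopenAttempts = 3;

// Attempts to reopen a bucket at most `maxAttempts` times. Only a write
// conflict triggers another attempt; once the cap is reached, or if no
// candidate bucket exists, the caller falls back to a new bucket.
InsertTarget chooseInsertTarget(const std::function<ReopenResult()>& tryReopen,
                                int maxAttempts = kMaxReopenAttempts) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        switch (tryReopen()) {
            case ReopenResult::kSuccess:
                return InsertTarget::kReopenedBucket;
            case ReopenResult::kNotFound:
                return InsertTarget::kNewBucket;
            case ReopenResult::kWriteConflict:
                break;  // Possibly stale bucket: try again, up to the cap.
        }
    }
    // Retries exhausted: stop reopening and insert into a new bucket.
    return InsertTarget::kNewBucket;
}
```

Passing `maxAttempts = 1` models the alternative above, where a single reopening failure leads straight to a new bucket.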
For context, this was surfaced through |