[SERVER-67938] Consider delaying retries or reducing logging on SnapshotUnavailable in dbcheck | Created: 11/Jul/22 | Updated: 24/Jul/23 |
|
| Status: | Open |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Yujin Kang Park | Assignee: | Backlog - Replication Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Replication
|
||||||||
| Participants: | |||||||||
| Linked BF Score: | 19 | ||||||||
| Description |
|
If dbcheck encounters a SnapshotUnavailable error, it retries immediately and keeps retrying indefinitely, logging each retry in the healthlog. We have observed tests that produced more than 10k healthlog entries because of this. We should consider adding a delay between retries, logging less frequently, or both. |
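
To make the two options concrete, here is a minimal sketch of what a delayed, less chatty retry loop could look like. It is written in Python for illustration only and is not the server's dbcheck code: `SnapshotUnavailable`, `run_with_backoff`, and every parameter and default below are hypothetical stand-ins. The loop still retries without an upper bound, as dbcheck does today, but it sleeps with capped exponential backoff between attempts and logs only the first failure and every `log_every`-th failure after it.

```python
class SnapshotUnavailable(Exception):
    """Hypothetical stand-in for the server's SnapshotUnavailable error."""


def run_with_backoff(op, log, sleep, base_delay=0.01, max_delay=1.0, log_every=100):
    """Retry op until it stops raising SnapshotUnavailable.

    All names and defaults are illustrative assumptions, not server settings.
    Returns a tuple of (result of op, number of failed attempts).
    """
    failures = 0
    delay = base_delay
    while True:
        try:
            return op(), failures
        except SnapshotUnavailable:
            failures += 1
            # Log the first failure, then only every log_every-th one,
            # so a long retry storm yields a handful of entries.
            if failures == 1 or failures % log_every == 0:
                log(f"SnapshotUnavailable, retrying (failures so far: {failures})")
            # Wait before the next attempt; double the delay up to max_delay.
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

In real use `sleep` would be something like `time.sleep` and `log` would write to the healthlog; both are passed in here so the behaviour can be exercised without actually waiting. With these assumed defaults, 250 consecutive failures produce 3 log entries instead of 250.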