[SERVER-67938] Consider delaying retries or reducing logging on SnapshotUnavailable in dbcheck Created: 11/Jul/22  Updated: 24/Jul/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Yujin Kang Park Assignee: Backlog - Replication Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
Assigned Teams:
Replication
Participants:
Linked BF Score: 19

Description

If dbcheck encounters a SnapshotUnavailable error, it retries immediately and keeps retrying indefinitely, logging each retry in the healthlog. We have observed tests that produced more than 10k healthlog entries because of this. We should consider adding a delay between retries or logging less frequently.
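
A rough sketch of what the two options could look like together, written in Python purely for illustration (it is not the server's code). Every name here is hypothetical: `SnapshotUnavailable` is a stand-in exception, and `run_with_backoff`, `base_delay`, `max_delay`, and `log_every` are invented for this example. It assumes the retry stays unbounded, as it is today, and only adds a capped exponential delay and logs the first retry plus every `log_every`-th one.

```python
import time


class SnapshotUnavailable(Exception):
    """Hypothetical stand-in for the server's SnapshotUnavailable error."""


def run_with_backoff(operation, log, *, base_delay=0.01, max_delay=1.0,
                     log_every=100, sleep=time.sleep):
    """Retry `operation` until it stops raising SnapshotUnavailable.

    Sleeps between attempts with capped exponential backoff, and logs only
    the first retry and every `log_every`-th retry after that.
    Returns a (result, retries) tuple.
    """
    retries = 0
    delay = base_delay
    while True:
        try:
            return operation(), retries
        except SnapshotUnavailable:
            retries += 1
            # Rate-limit logging instead of writing one entry per retry.
            if retries == 1 or retries % log_every == 0:
                log(f"SnapshotUnavailable: retry #{retries}, "
                    f"sleeping {delay:.3f}s")
            # Wait before retrying rather than retrying immediately.
            sleep(delay)
            # Double the delay each time, but never exceed max_delay.
            delay = min(delay * 2, max_delay)
```

With the assumed default of `log_every=100`, a run that hits 10k retries would write about a hundred entries instead of 10k. The cap on the delay keeps the check responsive once a snapshot becomes available again.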

