- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- None
- Networking & Observability
After a large network disruption, the RSM attempts to rediscover hosts whose monitoring connections were severed. In large clusters, this can produce a burst of monitoring connection establishments, which may cause network congestion, DNS server overload, or contention on the RSM's reactor thread. Adding a randomized delay when scheduling the first monitoring request after a previously monitored server is marked as Unknown could help mitigate these issues.
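
A rough sketch of the proposed randomized delay, for illustration only. This is not the RSM's actual scheduling code: the function names, the `max_jitter_ms` parameter, and the 500 ms bound in the usage note are all hypothetical. It assumes a uniform delay in `[0, max_jitter_ms]` is picked independently for each host that has just been marked Unknown.

```python
import random
from typing import Dict, Iterable, Optional


def first_check_delay_ms(max_jitter_ms: int,
                         rng: Optional[random.Random] = None) -> int:
    """Pick a uniform random delay in [0, max_jitter_ms] milliseconds.

    Hypothetical helper: the delay would be applied only to the first
    monitoring request after a server is marked Unknown.
    """
    if max_jitter_ms < 0:
        raise ValueError("max_jitter_ms must be non-negative")
    rng = rng if rng is not None else random.Random()
    return rng.randint(0, max_jitter_ms)


def schedule_first_checks(hosts: Iterable[str],
                          max_jitter_ms: int,
                          rng: Optional[random.Random] = None) -> Dict[str, int]:
    """Map each newly Unknown host to its own independent delay.

    Spreading the first checks over the jitter window avoids opening
    every monitoring connection at the same instant.
    """
    return {host: first_check_delay_ms(max_jitter_ms, rng) for host in hosts}
```

For example, `schedule_first_checks(hosts, 500)` would spread the first checks for all hosts across a half-second window rather than issuing them simultaneously. A suitable bound would presumably depend on cluster size and on how much added rediscovery latency is acceptable.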