[SERVER-54738] Calls to ServerDiscoveryMonitor::requestImmediateCheck should be throttled Created: 23/Feb/21  Updated: 11/May/23

Status: Backlog
Project: Core Server
Component/s: Internal Code, Networking
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Matthew Saltz (Inactive) Assignee: Lamont Nelson
Resolution: Unresolved Votes: 0
Labels: sharding-nyc-subteam2
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-54739 Race in ServerDiscoveryMonitor::reque... Closed
Assigned Teams:
Sharding NYC
Operating System: ALL
Participants:
Case:
Story Points: 6

 Description   

ServerDiscoveryMonitor::requestImmediateCheck can be called so frequently that each new request cancels the previous one before it has a chance to run, so none of them ever succeeds.

This flag is supposed to short-circuit rescheduling when there is already an outstanding 'hello' request. However, the flag is not set until the request is actually scheduled, which can happen some time after requestImmediateCheck is called, so it does not help in this case.
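A minimal sketch of the starvation pattern described above, for illustration only. It is not the real ServerDiscoveryMonitor code: the class CancelOnRequestMonitor, the helper checksRunUnderLoad, and the use of plain integer milliseconds as virtual time are all assumptions made to keep the example deterministic.

```cpp
#include <cstdint>

// Hypothetical model (NOT the real ServerDiscoveryMonitor): time is a plain
// integer in milliseconds and a "check" is just a scheduled fire time.
class CancelOnRequestMonitor {
public:
    explicit CancelOnRequestMonitor(std::int64_t scheduleDelayMs)
        : _scheduleDelayMs(scheduleDelayMs) {}

    // Every request replaces (i.e. cancels) any pending check and schedules
    // a new one that fires scheduleDelayMs after the request.
    void requestImmediateCheck(std::int64_t nowMs) {
        advanceTo(nowMs);
        _hasPending = true;
        _pendingFireTimeMs = nowMs + _scheduleDelayMs;
    }

    // Runs the pending check if its fire time has been reached.
    void advanceTo(std::int64_t nowMs) {
        if (_hasPending && _pendingFireTimeMs <= nowMs) {
            ++_checksRun;
            _hasPending = false;
        }
    }

    int checksRun() const { return _checksRun; }

private:
    std::int64_t _scheduleDelayMs;
    std::int64_t _pendingFireTimeMs = 0;
    bool _hasPending = false;
    int _checksRun = 0;
};

// Issues `count` requests spaced `intervalMs` apart and returns how many
// checks had actually run by the time of the last request.
inline int checksRunUnderLoad(std::int64_t delayMs, std::int64_t intervalMs, int count) {
    CancelOnRequestMonitor monitor(delayMs);
    for (int i = 0; i < count; ++i) {
        monitor.requestImmediateCheck(i * intervalMs);
    }
    return monitor.checksRun();
}
```

In this model, requests that arrive more often than the scheduling delay keep replacing the pending check, so checksRunUnderLoad(10, 5, 1000) reports that no check ever ran, while requests spaced further apart than the delay do make progress.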

Note that this applies to both 4.4 and master, so we should make sure any fix is backportable.

 

Acceptance criteria: 

Add a unit test that demonstrates the problem, then add a throttle so that the test passes.
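One possible shape for such a throttle, sketched under assumptions rather than taken from the server code: mark a check as pending at the moment it is requested, and treat further requests as no-ops until that check has run. The class CoalescingMonitor, the helper throttledChecksRunUnderLoad, and the integer virtual clock are hypothetical names introduced for this example only.

```cpp
#include <cstdint>

// Hypothetical model (NOT the real ServerDiscoveryMonitor): the pending
// marker is set at request time, so later requests cannot cancel the
// check that is already on its way.
class CoalescingMonitor {
public:
    explicit CoalescingMonitor(std::int64_t scheduleDelayMs)
        : _scheduleDelayMs(scheduleDelayMs) {}

    // Returns true if this call scheduled a new check, or false if it was
    // coalesced into a check that is already pending.
    bool requestImmediateCheck(std::int64_t nowMs) {
        advanceTo(nowMs);
        if (_hasPending) {
            return false;  // throttle: keep the existing check
        }
        _hasPending = true;
        _pendingFireTimeMs = nowMs + _scheduleDelayMs;
        return true;
    }

    // Runs the pending check if its fire time has been reached.
    void advanceTo(std::int64_t nowMs) {
        if (_hasPending && _pendingFireTimeMs <= nowMs) {
            ++_checksRun;
            _hasPending = false;
        }
    }

    int checksRun() const { return _checksRun; }

private:
    std::int64_t _scheduleDelayMs;
    std::int64_t _pendingFireTimeMs = 0;
    bool _hasPending = false;
    int _checksRun = 0;
};

// Issues `count` requests spaced `intervalMs` apart and returns how many
// checks had actually run by the time of the last request.
inline int throttledChecksRunUnderLoad(std::int64_t delayMs, std::int64_t intervalMs, int count) {
    CoalescingMonitor monitor(delayMs);
    for (int i = 0; i < count; ++i) {
        monitor.requestImmediateCheck(i * intervalMs);
    }
    return monitor.checksRun();
}
```

A unit test along these lines could assert that throttledChecksRunUnderLoad(10, 5, 1000) is greater than zero, which fails for a monitor that cancels on every request. Whether the real fix should coalesce requests like this or enforce a minimum interval between checks is left open here.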



 Comments   
Comment by Matthew Saltz (Inactive) [ 01/Jun/21 ]

lamont.nelson Assigning to Sharding NYC since it's RSM related

Generated at Thu Feb 08 05:34:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.