[SERVER-54738] Calls to ServerDiscoveryMonitor::requestImmediateCheck should be throttled Created: 23/Feb/21 Updated: 11/May/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Internal Code, Networking |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Saltz (Inactive) | Assignee: | Lamont Nelson |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | sharding-nyc-subteam2 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: | Sharding NYC |
||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Case: | (copied to CRM) | ||||||||
| Story Points: | 6 | ||||||||
| Description |
|
ServerDiscoveryMonitor::requestImmediateCheck can be called so frequently that each new request cancels the previous one before it has a chance to run, so none of them ever succeeds. A flag is supposed to short-circuit rescheduling when there is already an outstanding 'hello' request, but it is not set until the request is actually scheduled, which can happen some time after requestImmediateCheck is called, so it does not help in this case. Note that this applies to both 4.4 and master, so any fix should be backportable.
Acceptance criteria: Add a unit test that demonstrates the problem, then add a throttle so that the test passes. |
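A minimal sketch of the failure mode and one possible mitigation, written as a standalone model rather than against the real ServerDiscoveryMonitor code. CheckScheduler, advanceTo, simulateBurst, the 5 ms scheduling delay and the coalesce switch are all hypothetical names and values chosen for illustration. The mitigation shown is coalescing (a new request is ignored while a check is already pending), which is only one way the throttle could be done and may not match the eventual fix.

```cpp
#include <chrono>
#include <optional>

using Millis = std::chrono::milliseconds;

// Hypothetical model of the scheduling logic; NOT the real ServerDiscoveryMonitor.
class CheckScheduler {
public:
    CheckScheduler(Millis scheduleDelay, bool coalesce)
        : _scheduleDelay(scheduleDelay), _coalesce(coalesce) {}

    // Models requestImmediateCheck. Without coalescing, every call replaces
    // (i.e. cancels) the pending check and pushes its due time further out.
    void requestImmediateCheck(Millis now) {
        if (_coalesce && _dueAt) {
            return;  // a check is already pending; keep it instead of rescheduling
        }
        _dueAt = now + _scheduleDelay;
    }

    // Models the executor: runs the pending check once its due time is reached.
    void advanceTo(Millis now) {
        if (_dueAt && now >= *_dueAt) {
            _dueAt.reset();
            ++_checksRun;
        }
    }

    int checksRun() const {
        return _checksRun;
    }

private:
    Millis _scheduleDelay;
    bool _coalesce;
    std::optional<Millis> _dueAt;  // due time of the pending check, if any
    int _checksRun = 0;
};

// Issues one request per simulated millisecond for 100 ms, with an assumed
// 5 ms delay between requesting a check and the check actually running.
int simulateBurst(bool coalesce) {
    CheckScheduler scheduler(Millis(5), coalesce);
    for (int t = 0; t < 100; ++t) {
        scheduler.advanceTo(Millis(t));
        scheduler.requestImmediateCheck(Millis(t));
    }
    return scheduler.checksRun();
}
```

In this model, simulateBurst(false) never runs a check because each request arrives before the previous one comes due, while simulateBurst(true) lets the pending check run. A unit test for the real fix would presumably need a mock executor and clock in place of the integer loop used here.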
| Comments |
| Comment by Matthew Saltz (Inactive) [ 01/Jun/21 ] |
|
lamont.nelson Assigning to Sharding NYC since it's RSM-related |