[SERVER-14552] Move TopologyCoordinatorImpl::prepareHeartbeatResponse into the ReplicationCoordinator
Created: 14/Jul/14  Updated: 10/Dec/14  Resolved: 28/Aug/14

Status: Closed
Project: Core Server
Component/s: Internal Code, Replication
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Spencer Brody (Inactive)
Assignee: Unassigned
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Currently, in the replication rewrite, we generate responses to heartbeat messages by scheduling a callback in the TopologyCoordinator. This requires two context switches to specific threads before we can reply to a heartbeat. I am concerned that in overloaded systems this could cause heartbeats to time out more readily. Ideally the ReplicationCoordinator would know everything it needs to build the response itself, and could then schedule a callback on the TopologyCoordinator to update whatever state is needed to track the fact that a heartbeat was received, without blocking the thread that received the heartbeat. This would allow heartbeats to be responded to as soon as possible.
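
A rough sketch of the proposed flow, to make the idea concrete. This is not the actual server code: TopologyWorker, Coordinator, HeartbeatResponse, and the cached field values are all hypothetical stand-ins. The receiving thread builds the reply from a mutex-protected snapshot and returns immediately, and the bookkeeping is queued for the topology thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the single thread that owns topology state.
class TopologyWorker {
public:
    TopologyWorker() : _thread([this] { run(); }) {}
    ~TopologyWorker() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _shutdown = true;
        }
        _cv.notify_one();
        _thread.join();  // run() drains queued callbacks before exiting
    }
    // Enqueue a callback; the caller does not wait for it to run.
    void schedule(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _queue.push(std::move(fn));
        }
        _cv.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(_mutex);
        for (;;) {
            _cv.wait(lk, [this] { return _shutdown || !_queue.empty(); });
            if (_queue.empty()) return;  // shutdown requested and queue drained
            std::function<void()> fn = std::move(_queue.front());
            _queue.pop();
            lk.unlock();  // run the callback without holding the queue lock
            fn();
            lk.lock();
        }
    }
    std::mutex _mutex;
    std::condition_variable _cv;
    std::queue<std::function<void()>> _queue;
    bool _shutdown = false;
    std::thread _thread;  // declared last so the members above exist before it starts
};

// Hypothetical response fields; the real heartbeat response carries more.
struct HeartbeatResponse {
    std::string setName;
    std::string syncSource;
    int configVersion;
};

// Hypothetical coordinator: replies from a cached snapshot, updates state later.
class Coordinator {
public:
    HeartbeatResponse processHeartbeat(const std::string& sender) {
        HeartbeatResponse resp;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            resp = _cached;  // everything needed for the reply is already here
        }
        // Record the heartbeat asynchronously on the topology thread.
        _worker.schedule([this, sender] {
            std::lock_guard<std::mutex> lk(_mutex);
            ++_heartbeatsSeen;
            _lastSender = sender;
        });
        return resp;  // returned without waiting for the callback
    }
    int heartbeatsSeen() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _heartbeatsSeen;
    }
    std::string lastSender() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _lastSender;
    }

private:
    std::mutex _mutex;
    HeartbeatResponse _cached{"rs0", "host2:27017", 3};  // illustrative values
    int _heartbeatsSeen = 0;
    std::string _lastSender;
    TopologyWorker _worker;  // declared last: destroyed first, while _mutex is still valid
};
```

The sketch still takes a mutex to read the snapshot, so it does not remove all synchronization. It only takes the thread hand-off out of the reply path.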



 Comments   
Comment by Eric Milkie [ 28/Aug/14 ]

You still need some way of getting the information you need out of the TopoCoord (like the sync source), so there will always be some concurrency considerations here.
Also, if the machine is overloaded, technically it should be treated as DOWN, no?
