[SERVER-29097] Nodes should update liveness info when they receive a heartbeat response without newer optimes Created: 05/May/17  Updated: 06/Dec/22  Resolved: 19/May/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Judah Schvimer Assignee: Backlog - Replication Team
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
duplicates SERVER-26990 Unify tracking of secondary state bet... Closed
Assigned Teams:
Replication
Backwards Compatibility: Fully Compatible
Participants:
Linked BF Score: 0

 Description   

When a node receives a heartbeat response, it marks the responding node as alive at the same time as it updates its view of that node's optimes. This occurs here. However, we only update the node's optimes if they have moved forward, so a response that carries no newer optimes does not refresh the liveness information either. A node that hasn't moved forward should still be considered alive. Unnecessarily throwing away this liveness information can lead to spurious elections.

There is a mismatch here between heartbeat request and response processing. In heartbeat requests, we do always update liveness information.
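
The difference between the two behaviors can be sketched roughly as follows. This is a minimal, self-contained illustration and not the server's code: MemberInfo, updateCoupled, updateDecoupled and isStale are hypothetical names, and the optime is reduced to a single integer for brevity.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical, simplified stand-ins; these are NOT the server's actual types.
using Clock = std::chrono::steady_clock;

struct MemberInfo {
    std::int64_t appliedOpTime = 0;  // simplified optime: a single counter
    Clock::time_point lastUpdate{};  // last time liveness was refreshed
    bool down = true;                // whether we currently consider the member down
};

// Behavior described in this ticket: liveness is refreshed only as a side
// effect of the optime moving forward. Returns true if the optime advanced.
inline bool updateCoupled(MemberInfo& m, std::int64_t opTime, Clock::time_point now) {
    if (opTime <= m.appliedOpTime) {
        return false;  // early return: the liveness information is thrown away
    }
    m.appliedOpTime = opTime;
    m.lastUpdate = now;
    m.down = false;
    return true;
}

// Requested behavior: always refresh liveness, and advance the optime only
// when the response carries a newer one. Returns true if the optime advanced.
inline bool updateDecoupled(MemberInfo& m, std::int64_t opTime, Clock::time_point now) {
    m.lastUpdate = now;
    m.down = false;
    if (opTime <= m.appliedOpTime) {
        return false;
    }
    m.appliedOpTime = opTime;
    return true;
}

// A member is treated as stale once nothing has refreshed its liveness
// for longer than the given timeout.
inline bool isStale(const MemberInfo& m, Clock::time_point now, Clock::duration timeout) {
    return now - m.lastUpdate > timeout;
}
```

In this sketch, an idle member that keeps answering heartbeats with an unchanged optime eventually looks stale under updateCoupled but stays live under updateDecoupled.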



 Comments   
Comment by Matthew Russotto [ 19/May/17 ]

The merge of slaveinfo with heartbeat data in SERVER-26990 solves this; liveness is now always updated with heartbeats.

Comment by Judah Schvimer [ 05/May/17 ]

This is called here. I'm also not sure whether the if statement around _updateOpTimesFromHeartbeat_inlock is correct:

    if (action.getAction() == HeartbeatResponseAction::NoAction && hbStatusResponse.isOK() &&
        targetIndex >= 0 && hbStatusResponse.getValue().hasState() &&
        hbStatusResponse.getValue().getState() != MemberState::RS_PRIMARY) {
        ReplSetHeartbeatResponse hbResp = hbStatusResponse.getValue();
        if (hbResp.hasAppliedOpTime()) {
            if (hbResp.getConfigVersion() == _rsConfig.getConfigVersion()) {
                _updateOpTimesFromHeartbeat_inlock(
                    targetIndex,
                    hbResp.hasDurableOpTime() ? hbResp.getDurableOpTime() : OpTime(),
                    hbResp.getAppliedOpTime());
            }
        }
    }
