[SERVER-23706] replication electable condition Created: 14/Apr/16  Updated: 15/Nov/21  Resolved: 15/Apr/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.5
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Zhang Youdong Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Sprint: Repl 13 (04/22/16)
Participants:

 Description   

In the official documentation: https://docs.mongodb.org/manual/core/replica-set-elections/

 A member will veto an election:
 
If the member seeking an election is not a member of the voter’s set.
If the current primary has more recent operations (i.e. a higher optime) than the member seeking election, from the perspective of another voting member.
If the current primary has the same or more recent operations (i.e. a higher or equal optime) than the member seeking election.

But the source code in src/mongo/db/repl/topology_coordinator_impl.cpp contains the following function, which seems to mean that a node with an older oplog (within 10 seconds of the newest) may be elected as primary:

bool TopologyCoordinatorImpl::_isOpTimeCloseEnoughToLatestToElect(
    const OpTime& otherOpTime, const OpTime& ourLastOpApplied) const {
    const OpTime latestKnownOpTime = _latestKnownOpTime(ourLastOpApplied);
    // Use addition instead of subtraction to avoid overflow.
    return otherOpTime.getSecs() + 10 >= (latestKnownOpTime.getSecs());
}

I want to know whether this is a bug or whether the documentation is wrong.
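
To make the comparison quoted above easier to reason about, here is a minimal standalone sketch of the same arithmetic. SimpleOpTime, kCloseEnoughSecs, isCloseEnoughToLatest, naiveSubtractionCheck and runExamples are hypothetical names used only for illustration and are not part of the MongoDB source; the widening to 64 bits is also an assumption added here, not something the quoted code does. The sketch only shows what the 10-second window accepts and why addition is safer than subtraction on unsigned seconds. It says nothing about where in the election process the check is applied.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the OpTime class in the quoted code (assumption):
// only the seconds component of the timestamp is modelled.
struct SimpleOpTime {
    std::uint32_t secs;
    std::uint32_t getSecs() const { return secs; }
};

// Hypothetical named constant; the quoted code hard-codes the value 10.
constexpr std::uint32_t kCloseEnoughSecs = 10;

// Same comparison as the quoted function: "other" passes if it is at most
// 10 seconds behind the latest known optime. The left side is widened to
// 64 bits (an addition in this sketch) so the sum itself cannot wrap.
bool isCloseEnoughToLatest(const SimpleOpTime& other, const SimpleOpTime& latestKnown) {
    return static_cast<std::uint64_t>(other.getSecs()) + kCloseEnoughSecs >=
           latestKnown.getSecs();
}

// Subtraction-based variant, shown only to illustrate the pitfall the
// original comment warns about: unsigned subtraction wraps around when
// "other" is ahead of "latestKnown".
bool naiveSubtractionCheck(const SimpleOpTime& other, const SimpleOpTime& latestKnown) {
    const std::uint32_t lag =
        static_cast<std::uint32_t>(latestKnown.getSecs() - other.getSecs());
    return lag <= kCloseEnoughSecs;
}

// A few worked cases for the two variants.
void runExamples() {
    const SimpleOpTime latest{100};

    assert(isCloseEnoughToLatest(SimpleOpTime{90}, latest));   // exactly 10s behind
    assert(!isCloseEnoughToLatest(SimpleOpTime{89}, latest));  // 11s behind
    assert(isCloseEnoughToLatest(SimpleOpTime{105}, latest));  // ahead of latest

    // The subtraction variant wraps and wrongly rejects a node that is ahead.
    assert(!naiveSubtractionCheck(SimpleOpTime{105}, latest));
}
```

Calling runExamples() from a small main() exercises the cases; a node 10 seconds behind passes the window, while one 11 seconds behind does not.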



 Comments   
Comment by Eric Milkie [ 15/Apr/16 ]

No problem; thanks for digging in to the source code!

Comment by Zhang Youdong [ 15/Apr/16 ]

@Eric Milkie

Thank you for your reply. I reviewed the code logic and see your point. Thank you very much.

Comment by Eric Milkie [ 15/Apr/16 ]

Again, that code you have referenced does not involve electing primaries with "older oplogs". It does, however, provide some of the logic for implementing priority takeover in protocol version 0.
The logic for aborting elections of nodes that are not freshest is contained in the prepareFreshResponse() function, which is called before the code you are quoting above.

Comment by Zhang Youdong [ 15/Apr/16 ]

CmdReplSetElect::run()
    getGlobalReplicationCoordinator()->processReplSetElect()
        ReplicationCoordinatorImpl::_processReplSetElect_finish()
            TopologyCoordinatorImpl::prepareElectResponse()
                TopologyCoordinatorImpl::_getHighestPriorityElectableIndex()
                    TopologyCoordinatorImpl::_getUnelectableReason()
                        bool TopologyCoordinatorImpl::_isOpTimeCloseEnoughToLatestToElect(
                            const OpTime& otherOpTime, const OpTime& ourLastOpApplied) const {
                            const OpTime latestKnownOpTime = _latestKnownOpTime(ourLastOpApplied);
                            // Use addition instead of subtraction to avoid overflow.
                            return otherOpTime.getSecs() + 10 >= (latestKnownOpTime.getSecs());
                        }

Comment by Eric Milkie [ 14/Apr/16 ]

Please note that the documentation and the code you referenced only apply to protocol version 0, which is not the default for new replica sets in version 3.2.

I believe the answer to your question is that your description of the function you have referenced is incorrect. That function is used for things like priority takeover qualification, but not for actually voting in a given election.
