[SERVER-34102] Under PV1, ReplicationCoordinatorImpl::_handleTimePassing for a single node RS should start an election instead of auto-winning. Created: 23/Mar/18  Updated: 29/Oct/23  Resolved: 24/May/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.6.6, 4.0.0-rc1, 4.1.1

Type: Bug Priority: Major - P3
Reporter: Matthew Russotto Assignee: Suganthi Mani
Resolution: Fixed Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Repl 2018-05-07, Repl 2018-05-21, Repl 2018-06-04
Participants:
Linked BF Score: 51

 Description   

Running _performPostMemberStateUpdateAction(kActionWinElection) races with the election timeout in PV1 and can lead to an invariant failure. If we use kActionStartSingleNodeElection instead, the race becomes benign and simply results in a log message: "Not standing for election again; already candidate".
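
To illustrate why the second action makes the race harmless, here is a minimal toy model. It is not MongoDB code: `Role`, `ToyNode`, `autoWin`, `startElection`, and `invariantViolated` are hypothetical names, and the ticket does not say which invariant actually fires. The sketch only assumes that an "auto-win" path takes for granted that no other election is in flight, while a "start election" path checks first and logs if one is.

```cpp
#include <cassert>
#include <string>

// Toy model only. None of these names are taken from the MongoDB source.
enum class Role { kFollower, kCandidate, kLeader };

struct ToyNode {
    Role role = Role::kFollower;
    std::string lastLog;
    bool invariantViolated = false;  // stands in for a failed invariant()

    // "Auto-win" style action: assumes nobody else has started an election.
    void autoWin() {
        if (role != Role::kFollower) {
            // Another path (for example the election timeout) got here first.
            invariantViolated = true;
            return;
        }
        role = Role::kLeader;
    }

    // "Start election" style action: checks for an in-flight election and
    // backs off with a log message instead of asserting.
    bool startElection() {
        if (role != Role::kFollower) {
            lastLog = "Not standing for election again; already candidate";
            return false;
        }
        role = Role::kCandidate;
        return true;
    }
};
```

In this model, calling `startElection()` twice leaves the node a candidate and records the log line, whereas calling `startElection()` followed by `autoWin()` trips the simulated invariant.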



 Comments   
Comment by Githook User [ 25/May/18 ]

Author:

{'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'}

Message: SERVER-34102 Fix to prevent race between _handleTimePassing (stepdown timeout) and _startElectSelfIfEligibleV1 (election timeout) for pv1 single node replica set case.

(cherry picked from commit 678947e0836ccf6ebb0e9397e56ada985541bf14)
Branch: v3.6
https://github.com/mongodb/mongo/commit/c15b815a6cb1edc8fb2d5555e57fae7e118664b1

Comment by Githook User [ 24/May/18 ]

Author:

{'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'}

Message: SERVER-34102 Fix to prevent race between _handleTimePassing (stepdown timeout) and _startElectSelfIfEligibleV1 (election timeout) for pv1 single node replica set case.

(cherry picked from commit 678947e0836ccf6ebb0e9397e56ada985541bf14)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6fed52516bdc33c9e19c9daa40b01bc12e6519bc

Comment by Githook User [ 24/May/18 ]

Author:

{'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'}

Message: SERVER-34102 Fix to prevent race between _handleTimePassing (stepdown timeout) and _startElectSelfIfEligibleV1 (election timeout) for pv1 single node replica set case.
Branch: master
https://github.com/mongodb/mongo/commit/678947e0836ccf6ebb0e9397e56ada985541bf14

Comment by Suganthi Mani [ 11/May/18 ]

The fix mentioned in the description is not sufficient to prevent the race between _handleTimePassing (stepdown timeout) and _startElectSelfIfEligibleV1 (election timeout) in the PV1 single-node replica set case.

ReplicationCoordinatorImpl::_handleTimePassing, which calls TopologyCoordinator::becomeCandidateIfStepdownPeriodOverAndSingleNodeSet, should return false if _role != follower.
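
A rough sketch of the kind of role guard this comment proposes, written as a standalone toy rather than the actual patch. `MemberRole`, `ToyTopology`, `singleNodeSet`, and `stepdownUntil` are hypothetical stand-ins, and the placement of the check inside the topology-level function is an assumption made for illustration. Only the method name is borrowed from the comment.

```cpp
#include <cassert>
#include <chrono>

// Toy stand-ins; the real TopologyCoordinator has different members.
enum class MemberRole { kFollower, kCandidate, kLeader };
using TimePoint = std::chrono::steady_clock::time_point;

struct ToyTopology {
    MemberRole role = MemberRole::kFollower;
    bool singleNodeSet = true;
    TimePoint stepdownUntil{};  // stepdown period ends at this time

    // Returns true only when this call moved the node from follower to
    // candidate. A node that is already candidate or leader is left alone.
    bool becomeCandidateIfStepdownPeriodOverAndSingleNodeSet(TimePoint now) {
        if (role != MemberRole::kFollower) {
            return false;  // the guard: another path already changed the role
        }
        if (!singleNodeSet) {
            return false;
        }
        if (now < stepdownUntil) {
            return false;  // stepdown period has not elapsed yet
        }
        role = MemberRole::kCandidate;
        return true;
    }
};
```

With the guard in place, a second call after the role has already changed returns false, so the caller has no reason to proceed to a win-election step.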

 

Comment by Spencer Brody (Inactive) [ 20/Apr/18 ]

This should be testable with a C++ unit test
