[SERVER-34102] Under PV1, ReplicationCoordinatorImpl::_handleTimePassing for a single node RS should start an election instead of auto-winning. Created: 23/Mar/18 Updated: 29/Oct/23 Resolved: 24/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.6, 4.0.0-rc1, 4.1.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Suganthi Mani |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: |
v4.0
|
||||||||
| Sprint: | Repl 2018-05-07, Repl 2018-05-21, Repl 2018-06-04 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 51 | ||||||||
| Description |
|
Running _performPostMemberStateUpdateAction(kActionWinElection) races with the election timeout in PV1 and can lead to an invariant failure. If we use kActionStartSingleNodeElection instead, the race will be benign and simply result in a log message about "Not standing for election again; already candidate". |
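To illustrate why starting an election is more forgiving than auto-winning, here is a minimal single-threaded sketch. It is a toy model, not the MongoDB source: `ToyElectionState` and its members are hypothetical names, and the only detail taken from the ticket is the log message. The point is that a second request to stand for election becomes a logged no-op rather than a failed invariant.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only (hypothetical names); this is not the MongoDB implementation.
class ToyElectionState {
public:
    // Models the "start an election" path. Calling it twice is harmless:
    // the second caller only records a log line and backs off.
    // Returns true if this call actually started the election.
    bool startSingleNodeElection() {
        if (_candidate) {
            _log.push_back("Not standing for election again; already candidate");
            return false;
        }
        _candidate = true;
        return true;
    }

    const std::vector<std::string>& log() const { return _log; }

private:
    bool _candidate = false;
    std::vector<std::string> _log;
};

// Simulates both racing paths (stepdown timeout and election timeout)
// requesting an election, in either order.
inline bool raceIsBenign() {
    ToyElectionState state;
    bool first = state.startSingleNodeElection();   // whichever path arrives first
    bool second = state.startSingleNodeElection();  // the loser of the race
    return first && !second && state.log().size() == 1;
}
```

Whichever path arrives second sees the existing candidacy and returns without touching state, which is the benign outcome the description is aiming for.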
| Comments |
| Comment by Githook User [ 25/May/18 ] |
|
Author: {'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'} Message: (cherry picked from commit 678947e0836ccf6ebb0e9397e56ada985541bf14) |
| Comment by Githook User [ 24/May/18 ] |
|
Author: {'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'} Message: (cherry picked from commit 678947e0836ccf6ebb0e9397e56ada985541bf14) |
| Comment by Githook User [ 24/May/18 ] |
|
Author: {'username': 'smani87', 'name': 'Suganthi Mani', 'email': 'suganthi.mani@mongodb.com'} Message: |
| Comment by Suganthi Mani [ 11/May/18 ] |
|
The fix mentioned in the description is not sufficient to prevent the race between _handleTimePassing (stepdown timeout) and _startElectSelfIfEligibleV1 (election timeout) in the PV1 single-node replica set case. TopologyCoordinator::becomeCandidateIfStepdownPeriodOverAndSingleNodeSet, which ReplicationCoordinatorImpl::_handleTimePassing calls, should return false if _role != follower.
|
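The guard proposed in the comment above can be sketched as follows. This is a simplified, single-threaded toy model under stated assumptions, not the actual patch: `ToyTopologyCoordinator`, `becomeCandidateFromElectionTimeout`, and the `Role` enum are hypothetical stand-ins, and only the method name `becomeCandidateIfStepdownPeriodOverAndSingleNodeSet` and the `_role` check come from the ticket.

```cpp
#include <cassert>
#include <chrono>

// Toy model only (hypothetical types); this is not the MongoDB implementation.
enum class Role { kFollower, kCandidate, kLeader };

using TimePoint = std::chrono::steady_clock::time_point;

class ToyTopologyCoordinator {
public:
    explicit ToyTopologyCoordinator(TimePoint stepDownUntil)
        : _stepDownUntil(stepDownUntil) {}

    // Stepdown-timeout path. Returns true only if this call moved the node
    // from follower to candidate.
    bool becomeCandidateIfStepdownPeriodOverAndSingleNodeSet(TimePoint now) {
        if (_role != Role::kFollower) {
            return false;  // guard from the comment: another path already changed our role
        }
        if (now < _stepDownUntil) {
            return false;  // stepdown period has not expired yet
        }
        _role = Role::kCandidate;
        return true;
    }

    // Stand-in for the election-timeout path winning the race.
    void becomeCandidateFromElectionTimeout() {
        if (_role == Role::kFollower) {
            _role = Role::kCandidate;
        }
    }

    Role role() const { return _role; }

private:
    Role _role = Role::kFollower;
    TimePoint _stepDownUntil;
};
```

With the role check first, a caller in the position of _handleTimePassing that arrives after the election timeout has already fired gets `false` back and takes no further election action, which also makes the behaviour straightforward to cover in a C++ unit test.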
| Comment by Spencer Brody (Inactive) [ 20/Apr/18 ] |
|
This should be testable with a C++ unit test |