[SERVER-30852] Force reconfig that makes current primary unelectable can result in stepdown without taking the global lock Created: 25/Aug/17  Updated: 27/Oct/23  Resolved: 05/Oct/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Backlog - Replication Team
Resolution: Gone away Votes: 0
Labels: todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-28544 Stepdown command must take global loc... Closed
related to SERVER-31431 Ensure that all state transitions in ... Closed
related to SERVER-42553 Complete TODO listed in SERVER-30852 Backlog
related to SERVER-27892 Clarify locking rules for _canAcceptN... Closed
related to SERVER-43452 Complete TODO listed in SERVER-30852 Closed
related to SERVER-44205 Complete TODO listed in SERVER-30852 Closed
related to SERVER-44291 Complete TODO listed in SERVER-30852 Closed
Assigned Teams:
Replication
Operating System: ALL
Participants:
Linked BF Score: 0

 Description   

If a primary learns, via a heartbeat, of a new config that was installed by a force reconfig and in which it is no longer electable, it transitions itself to SECONDARY. It does so without holding the global exclusive lock, which is illegal.



 Comments   
Comment by Spencer Brody (Inactive) [ 05/Oct/17 ]

Reading the code again, it looks like we already handle this properly. There are two possible reconfig paths: the reconfig command, and learning of a new config via heartbeats. The reconfig command takes the global X lock for every force reconfig. Heartbeat reconfigs take the global X lock whenever the node is currently primary. So I don't think there's anything to do here.
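
The locking rule described in this comment can be summarized as a small decision function. This is only an illustrative sketch, not the server's code: `ReconfigPath` and `takesGlobalXLock` are hypothetical names invented for the example, and cases the comment does not cover are reported as unknown rather than guessed. It assumes C++17 for `std::optional`.

```cpp
#include <optional>

// Hypothetical labels for the two reconfig paths named in the comment.
enum class ReconfigPath { kReconfigCommand, kHeartbeat };

// Returns true where the comment states that the global X lock is taken,
// and std::nullopt where the comment does not say either way.
std::optional<bool> takesGlobalXLock(ReconfigPath path, bool isForce, bool isPrimary) {
    switch (path) {
        case ReconfigPath::kReconfigCommand:
            // "The reconfig command takes the global X lock for every force reconfig."
            if (isForce) {
                return true;
            }
            return std::nullopt;  // non-force command reconfigs are not discussed
        case ReconfigPath::kHeartbeat:
            // "Heartbeat reconfigs take the global X lock whenever the node is primary."
            if (isPrimary) {
                return true;
            }
            return std::nullopt;  // non-primary heartbeat reconfigs are not discussed
    }
    return std::nullopt;
}
```

The scenario in the description (a primary learning of a forced config via heartbeat) corresponds to `takesGlobalXLock(ReconfigPath::kHeartbeat, true, true)`, which falls under the second rule.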

Comment by Spencer Brody (Inactive) [ 25/Aug/17 ]

When we fix this, we should update the invariants in TopologyCoordinatorImpl::_setLeaderMode to prevent transitions from LeaderMode::kMaster to LeaderMode::kNotLeader without first going through kSteppingDown or kAttemptingStepDown.
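
A minimal sketch of the kind of invariant suggested here, written as a standalone toy rather than the actual `TopologyCoordinatorImpl` code. The enum lists only the four modes named in this ticket (the real enum may have more), and `isAllowedTransition` and `LeaderModeTracker` are hypothetical names used for illustration. Plain `assert` stands in for the server's invariant macro.

```cpp
#include <cassert>

// Only the leader modes mentioned in this ticket; the real enum may differ.
enum class LeaderMode {
    kNotLeader,
    kMaster,
    kSteppingDown,
    kAttemptingStepDown,
};

// Rejects the direct kMaster -> kNotLeader transition. Every other
// transition is left unrestricted in this sketch.
bool isAllowedTransition(LeaderMode from, LeaderMode to) {
    if (from == LeaderMode::kMaster && to == LeaderMode::kNotLeader) {
        return false;
    }
    return true;
}

// Toy stand-in for the state held by the topology coordinator.
class LeaderModeTracker {
public:
    LeaderMode mode() const {
        return _mode;
    }

    // Hypothetical analogue of _setLeaderMode with the proposed check.
    void setLeaderMode(LeaderMode newMode) {
        assert(isAllowedTransition(_mode, newMode));
        _mode = newMode;
    }

private:
    LeaderMode _mode = LeaderMode::kNotLeader;
};
```

With this check in place, a primary would have to move through `kSteppingDown` or `kAttemptingStepDown` before reaching `kNotLeader`; a direct jump would trip the assertion in debug builds.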
