[SERVER-36746] A failed step down attempt shouldn't unconditionally reset LeaderMode to kMaster Created: 17/Aug/18  Updated: 29/Oct/23  Resolved: 30/Aug/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.6.9, 4.0.3, 4.1.3

Type: Bug Priority: Major - P3
Reporter: William Schultz (Inactive) Assignee: Spencer Brody (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File draining_primary_accepts_writes.js     File draining_primary_accepts_writes_fassert.js    
Issue Links:
Backports
Depends
Related
related to SERVER-31330 Stepping down while exiting drain mod... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Sprint: Repl 2018-09-10
Participants:
Linked BF Score: 0

 Description   

When a node initiates a step down attempt, it will execute the ReplicationCoordinatorImpl::stepDown method. This method executes a number of pre-condition checks and waits to acquire the global lock. Then, it will prepare for the step down attempt and set the current leader mode to kAttemptingStepDown. Next, it sets up logic to run if the stepdown attempt fails. If the attempt fails, we will trigger this block and set our current leader mode to kMaster. After doing this, we will call updateMemberStateFromTopologyCoordinator_inlock, which, if our leader mode is currently kMaster will set the canAcceptNonLocalWrites flag to true. This is incorrect, though, since when we started our step down attempt, there is no guarantee that we were actually in leader mode kMaster. We check that our current member state is PRIMARY, but we may have still been in drain mode, and therefore unable to accept writes, meaning our leader mode would have still been in kLeaderElect. If we fail the step down attempt and don't actually step down, we shouldn't automatically set the leader mode to kMaster. We should leave ourselves in kLeaderElect if that's the state we were in when we started the stepdown attempt.



 Comments   
Comment by Githook User [ 24/Sep/18 ]

Author:

{'name': 'Spencer T Brody', 'email': 'spencer@mongodb.com', 'username': 'stbrody'}

Message: SERVER-36746 Don't erroneously skip drain mode if a stepdown attempt fails.

(cherry picked from commit 0db56f8e29330894a7b24ee2b3f68f8f79e14848)
Branch: v3.6
https://github.com/mongodb/mongo/commit/2cce89ba16cae743675c258ecafa890debea4af4

Comment by Githook User [ 17/Sep/18 ]

Author:

{'name': 'Spencer T Brody', 'email': 'spencer@mongodb.com', 'username': 'stbrody'}

Message: SERVER-36746 Don't erroneously skip drain mode if a stepdown attempt fails.

(cherry picked from commit 0db56f8e29330894a7b24ee2b3f68f8f79e14848)
Branch: v4.0
https://github.com/mongodb/mongo/commit/03fbfd263535c5a2ebe6cf0d0dce3c50195fc43d

Comment by Githook User [ 30/Aug/18 ]

Author:

{'name': 'Spencer T Brody', 'email': 'spencer@mongodb.com', 'username': 'stbrody'}

Message: SERVER-36746 Don't erroneously skip drain mode if a stepdown attempt fails.
Branch: master
https://github.com/mongodb/mongo/commit/0db56f8e29330894a7b24ee2b3f68f8f79e14848

Comment by William Schultz (Inactive) [ 17/Aug/18 ]

It can also cause an fassert, if you actually write to the primary while it's in drain mode (which is allowed), and then let it drain the rest of its ops:

d31021| 2018-08-17T17:01:40.140-0400 F -        [rsSync-0] Fatal assertion 34361 OplogOutOfOrder: Attempted to apply an oplog entry ({ ts: Timestamp(1534539682, 1), t: 1 }) which is not greater than our last applied OpTime ({ ts: Timestamp(1534539700, 1), t: 2 }). at src/mongo/db/repl/sync_tail.cpp 811
d31021| 2018-08-17T17:01:40.140-0400 F -        [rsSync-0]
d31021|
d31021| ***aborting after fassert() failure

See other repro: draining_primary_accepts_writes_fassert.js .

Comment by William Schultz (Inactive) [ 17/Aug/18 ]

Actually, I realized that this can cause an invariant failure on master, since it allows us to erroneously set _canAcceptNonLocalWrites=true when we are in drain mode. When we try to exit drain mode, the invariant fails. See the attached repro draining_primary_accepts_writes.js .

Comment by William Schultz (Inactive) [ 17/Aug/18 ]

On 3.6, this can lead to an invariant failure (see BF-10264), because we are allowed to exit drain mode even if we are currently in leader mode kMaster. On 4.0 and later we don't allow this, so this invariant failure should be avoided. It is still incorrect to be setting canAcceptNonLocalWrites to true, though, if we haven't actually completed draining.

Generated at Thu Feb 08 04:43:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.