[SERVER-36856] The continuous stepdown thread may execute stepdown after it was paused Created: 24/Aug/18  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Yves Duhem Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Unresolved Votes: 0
Labels: tig-bfday-eligible, tig-resmoke
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Assigned Teams:
Server Tooling & Methods
Operating System: ALL
Backport Requested:
v4.0, v3.6
Participants:
Linked BF Score: 15
Story Points: 3

 Description   

The stepdown thread can still execute a stepdown after being paused if the hook clears the resumed event just after the thread has returned from wait() on it:

  1. The stepdown thread checks if it has to wait: checks _is_resumed_evt which is set, so it continues
  2. The hook calls pause()
    1. In pause(): the _is_resumed_evt is cleared
    2. In pause(): the hook waits on _is_idle_evt, which is already set, so it returns immediately
  3. The stepdown thread runs _step_down_all()
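
The interleaving above can be reproduced deterministically with a simplified model. This is only a sketch, not the actual resmoke hook code: the class names, the after_wait callback, the _paused flag, and the stepdowns_while_paused counter are hypothetical and exist purely to force and observe the ordering. The second class shows one possible mitigation, assuming a lock is acceptable here: the thread re-checks _is_resumed_evt and clears _is_idle_evt under the same lock that pause() takes when clearing _is_resumed_evt.

```python
import threading


class StepdownThreadSketch:
    """Hypothetical, simplified model of the pause logic described above."""

    def __init__(self):
        self._is_resumed_evt = threading.Event()
        self._is_resumed_evt.set()
        self._is_idle_evt = threading.Event()
        self._is_idle_evt.set()
        self._paused = False             # bookkeeping for this demo only
        self.stepdowns_while_paused = 0  # counts stepdowns run after pause() returned

    def pause(self):
        self._is_resumed_evt.clear()     # step 2.1
        self._is_idle_evt.wait()         # step 2.2: already set, returns at once
        self._paused = True

    def run_once(self, after_wait=lambda: None):
        self._is_resumed_evt.wait()      # step 1: event is set, so continue
        after_wait()                     # injection point: the hook calls pause() here
        self._is_idle_evt.clear()
        self._step_down_all()            # step 3: runs although pause() has returned
        self._is_idle_evt.set()

    def _step_down_all(self):
        # Stand-in for the real stepdown; only records whether it ran while paused.
        if self._paused:
            self.stepdowns_while_paused += 1


class LockedStepdownThreadSketch(StepdownThreadSketch):
    """One possible mitigation (an assumption, not the committed fix)."""

    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def pause(self):
        with self._lock:
            self._is_resumed_evt.clear()
        self._is_idle_evt.wait()
        self._paused = True

    def run_once(self, after_wait=lambda: None):
        self._is_resumed_evt.wait()
        after_wait()
        with self._lock:
            # Re-check under the lock: if pause() won the race, skip this round.
            if not self._is_resumed_evt.is_set():
                return
            self._is_idle_evt.clear()
        self._step_down_all()
        self._is_idle_evt.set()


racy = StepdownThreadSketch()
racy.run_once(after_wait=racy.pause)
print(racy.stepdowns_while_paused)   # 1: stepdown ran after pause() returned

fixed = LockedStepdownThreadSketch()
fixed.run_once(after_wait=fixed.pause)
print(fixed.stepdowns_while_paused)  # 0: the re-check skipped the stepdown
```

With the lock, either the thread marks itself non-idle before pause() clears the resumed event (so pause() blocks on _is_idle_evt until the stepdown finishes), or pause() clears the event first and the thread skips the stepdown.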


 Comments   
Comment by Steven Vannelli [ 10/May/22 ]

Moving this ticket to the Backlog and removing the "Backlog" fixVersion as per our latest policy for using fixVersions.

Comment by Max Hirschhorn [ 30/Aug/18 ]

We should move the MongoDB-specific logic of running the replSetStepDown command, etc. out to separate class/methods to make the lifecycle of the thread more testable.
