[SERVER-39359] Stepdown thread should error if mongod process exits abnormally on shutdown Created: 02/Feb/19  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Max Hirschhorn Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Unresolved Votes: 0
Labels: tig-resmoke
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Server Tooling & Methods
Participants:
Linked BF Score: 16

 Description   

The Python version of the stepdown thread in resmoke.py has a mode where it'll send a SIGTERM to the current primary in order to exercise the clean shutdown and startup recovery codepaths of replication. There's already logic to have the stepdown thread exit (which later causes the test to be marked as failed) if a mongod process within the replica set has already exited abnormally by the time the thread tries to step down the current primary.

However, the return code from calling primary.mongod.wait() after signaling the current primary to exit is ignored. We should instead raise an errors.ServerFailure exception if the mongod process exits with a non-zero return code after being sent a SIGTERM. Note that we'll still want to ignore the non-zero return code that's expected after sending a SIGKILL to the mongod process (as happens in the replica_sets_kill_primary_jscore_passthrough.yml test suite).
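
A rough sketch of the check described above, not the actual resmoke.py code. It assumes the process object behaves like subprocess.Popen (terminate(), kill(), and a wait() that returns the exit code). The ServerFailure class here is a local stand-in for errors.ServerFailure, and stop_and_check and use_sigkill are hypothetical names chosen for illustration.

```python
class ServerFailure(Exception):
    """Local stand-in for resmoke's errors.ServerFailure (assumption)."""


def stop_and_check(process, use_sigkill=False):
    """Signal `process` to exit, wait for it, and check its return code.

    `process` is assumed to expose terminate(), kill(), and wait() like
    subprocess.Popen. `use_sigkill` is a hypothetical flag modelling the
    kill-primary suites, where a non-zero return code is expected.
    """
    if use_sigkill:
        process.kill()       # SIGKILL on POSIX
    else:
        process.terminate()  # SIGTERM on POSIX

    return_code = process.wait()

    # Only a SIGTERM-initiated shutdown is expected to exit with code 0.
    if not use_sigkill and return_code != 0:
        raise ServerFailure(
            "process exited with code %d after SIGTERM" % return_code)
    return return_code
```

In the stepdown thread this would presumably replace the bare primary.mongod.wait() call, with the SIGKILL case decided by whatever option the thread already uses to choose between terminating and killing the primary.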



 Comments   
Comment by Steven Vannelli [ 10/May/22 ]

Moving this ticket to the Backlog and removing the "Backlog" fixVersion as per our latest policy for using fixVersions.

Generated at Thu Feb 08 04:51:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.