[SERVER-39882] Powercycle doesn't handle node transitioning from STARTUP to REMOVED state Created: 28/Feb/19  Updated: 06/Dec/22  Resolved: 05/Nov/21

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Won't Fix Votes: 0
Labels: tig-powercycle
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Assigned Teams:
Server Tooling & Methods
Operating System: ALL
Participants:
Linked BF Score: 46

 Description   

The mongod process closes its connections when it transitions from STARTUP to REMOVED. Powercycle must either retry the commands it runs on pymongo.errors.AutoReconnect exceptions until it finishes reconfiguring the replica set, or reconfigure the replica set by first bringing the node up as a stand-alone mongod.

mongo = pymongo.MongoClient(**mongo_client_opts)
LOGGER.info("Server buildinfo: %s", mongo.admin.command("buildinfo"))
LOGGER.info("Server serverStatus: %s", mongo.admin.command("serverStatus"))
if options.repl_set:
    ret = mongo_reconfig_replication(mongo, host_port, options.repl_set)

# TODO: Rework reconfig logic as follows:
# 1. Start up mongod in standalone
# 2. Delete the config doc
# 3. Stop mongod
# 4. Start mongod
# When reconfiguring the replica set, due to a switch in ports
# it can only be done using force=True, as the node will not come up as Primary.
# The side effect of using force=True is large jumps in the config
# version, which after many reconfigs may exceed the 'int' value.
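
A minimal sketch of the first option (retrying on connection loss) might look like the following. It is not the retry mechanism that powercycle actually uses; call_with_retry and its parameters are hypothetical names introduced here for illustration. The exception type is passed in, so the helper itself does not depend on pymongo.

```python
import time


def call_with_retry(func, retry_on, attempts=5, delay_secs=1.0, sleep=time.sleep):
    """Call func() and retry when it raises one of the retry_on exceptions.

    Hypothetical helper for illustration; not part of powercycle.
    retry_on is an exception class or a tuple of exception classes.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Give the node time to finish its state transition.
            sleep(delay_secs)
```

Under these assumptions, the serverStatus call above could be wrapped as call_with_retry(lambda: mongo.admin.command("serverStatus"), retry_on=pymongo.errors.AutoReconnect). Any other exception still propagates immediately, so only the connection drop caused by the STARTUP to REMOVED transition is retried.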



 Comments   
Comment by Brooke Miller [ 05/Nov/21 ]

We recently added a retry mechanism and have not observed this issue.

Generated at Thu Feb 08 04:53:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.