[SERVER-52680] Removed node on startup stuck in STARTUP2 after being re-added into the replica set
Created: 06/Nov/20  Updated: 29/Oct/23  Resolved: 19/Nov/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 4.0.22, 3.6.22, 4.4.3, 4.2.12, 5.0.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Jason Chan
Assignee: A. Jesse Jiryu Davis
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-53026 Secondary cannot restart replication Closed
is related to SERVER-33747 Arbiter tries to start data replicati... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.8, v4.7, v4.4, v4.2, v4.0, v3.6
Sprint: Repl 2020-11-30
Participants:
Linked BF Score: 50

 Description   

SERVER-33747 introduced a change to not start data replication on startup if the node is REMOVED.

However, our power cycle tests do the following:
1. Start a single-node replica set on port 20000.
2. Restart the node on port 20001.
3. The node cannot find itself in the config on startup and enters REMOVED.
4. A reconfig is performed to update the hostname in the rsConfig.
5. The node finds itself in the config and transitions to STARTUP2, but because data replication was never started, it has no means to leave that state.
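
The sequence above can be sketched as a toy state model. This is only an illustration under stated assumptions, not MongoDB's actual implementation: the `Node` class, its methods, the `start_replication_on_leaving_removed` switch, and the `localhost` host strings are all hypothetical names introduced here. With the switch off, the model reproduces the reported hang; with it on, it follows the behavior named in the fix's commit message ("Start replication when leaving REMOVED state").

```python
from enum import Enum


class MemberState(Enum):
    STARTUP = "STARTUP"
    STARTUP2 = "STARTUP2"
    SECONDARY = "SECONDARY"
    REMOVED = "REMOVED"


class Node:
    """Toy model of one replica set member (hypothetical, for illustration)."""

    def __init__(self, host, start_replication_on_leaving_removed):
        self.host = host
        self.state = MemberState.STARTUP
        self.replication_started = False
        # Hypothetical switch: False models the reported behavior,
        # True models the behavior described by the fix's commit message.
        self.start_replication_on_leaving_removed = (
            start_replication_on_leaving_removed
        )

    def startup(self, config_hosts):
        if self.host in config_hosts:
            self.state = MemberState.STARTUP2
            self.replication_started = True
        else:
            # SERVER-33747: a node that is REMOVED on startup
            # does not start data replication.
            self.state = MemberState.REMOVED

    def apply_reconfig(self, config_hosts):
        if self.host not in config_hosts:
            self.state = MemberState.REMOVED
            return
        if self.state is MemberState.REMOVED:
            # The node finds itself in the new config.
            self.state = MemberState.STARTUP2
            if self.start_replication_on_leaving_removed:
                self.replication_started = True

    def step(self):
        # Modeling assumption: only a running replication subsystem
        # can move the node out of STARTUP2.
        if self.state is MemberState.STARTUP2 and self.replication_started:
            self.state = MemberState.SECONDARY


def run_power_cycle(fixed):
    """Replay steps 2 to 5 and return the state the node ends up in."""
    node = Node("localhost:20001", fixed)      # restarted on the new port
    node.startup(["localhost:20000"])          # config still lists the old port
    node.apply_reconfig(["localhost:20001"])   # reconfig updates the hostname
    node.step()
    return node.state
```

In this model, `run_power_cycle(False)` leaves the node in STARTUP2 indefinitely, while `run_power_cycle(True)` lets it progress to SECONDARY.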



 Comments   
Comment by Githook User [ 20/Nov/20 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (username: ajdavis)

Message: SERVER-52680 Start replication when leaving REMOVED state

(cherry picked from commit 73ab98a9094de18b82e596e8d1d0bf311858548b)
Branch: v4.2
https://github.com/mongodb/mongo/commit/f66e8f7ad200d98fcf6b32f4330c7227af7cb517

Comment by Githook User [ 20/Nov/20 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (username: ajdavis)

Message: SERVER-52680 Start replication when leaving REMOVED state

(cherry picked from commit 73ab98a9094de18b82e596e8d1d0bf311858548b)
Branch: v4.4
https://github.com/mongodb/mongo/commit/5b7d39dd67900c9716997c3f9ae13867baffee9d

Comment by Githook User [ 20/Nov/20 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (username: ajdavis)

Message: SERVER-52680 Start replication when leaving REMOVED state

(cherry picked from commit 73ab98a9094de18b82e596e8d1d0bf311858548b)
Branch: v3.6
https://github.com/mongodb/mongo/commit/8bc84de690e1de3cf2755032ac165fc4a3211441

Comment by Githook User [ 20/Nov/20 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (username: ajdavis)

Message: SERVER-52680 Start replication when leaving REMOVED state

(cherry picked from commit 73ab98a9094de18b82e596e8d1d0bf311858548b)
Branch: v4.0
https://github.com/mongodb/mongo/commit/a2d3dc6dc5b878f6806eaeb6810ba121fe24727a

Comment by A. Jesse Jiryu Davis [ 19/Nov/20 ]

This must be backported everywhere that SERVER-33747 has been backported.

Comment by Githook User [ 19/Nov/20 ]

Author: A. Jesse Jiryu Davis <jesse@mongodb.com> (username: ajdavis)

Message: SERVER-52680 Start replication when leaving REMOVED state
Branch: master
https://github.com/mongodb/mongo/commit/73ab98a9094de18b82e596e8d1d0bf311858548b
