[SERVER-44307] Add server test to ensure running a replica set node on a port not specified by the rsconfig document does not fire a no-op on step up
Created: 29/Oct/19  Updated: 10/Mar/20  Resolved: 10/Mar/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Steven Connors (Inactive)
Assignee: Judah Schvimer
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-42200 Update the automated backup restore t... Closed
Sprint: Repl 2019-12-30, Repl 2020-01-13, Repl 2020-01-27, Repl 2020-02-10, Repl 2020-02-24, Repl 2020-03-09, Repl 2020-03-23
Participants:

 Description   

As part of supporting Point in Time restores for 4.2 deployments on Cloud backup, there is an intermediary step that restarts a node as a single-node replica set on an ephemeral port, one that is not included in the replica set config document. The node is started on an ephemeral port to avoid the no-op that occurs on step up during the election. This ticket was created to ensure that a server change does not break this behavior, and potentially to discuss alternative solutions.
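A toy sketch of why the ephemeral port matters, assuming the node looks for its own host:port among the config members and only runs for election if it finds itself. The function name, the replica set name, and the host:port values are hypothetical, and the server's real self-detection is more involved than a string comparison.

```python
def find_self_index(config, own_host_port):
    """Return the index of the member whose host matches this node, or None.

    Simplified model: compares host strings only (an assumption for illustration).
    """
    for index, member in enumerate(config["members"]):
        if member["host"] == own_host_port:
            return index
    return None


# Hypothetical single-node config as restored from a backup.
config = {
    "_id": "rs0",
    "version": 1,
    "members": [{"_id": 0, "host": "localhost:27017"}],
}

# Started on the configured port: the node finds itself and may run for election.
on_configured_port = find_self_index(config, "localhost:27017")

# Started on an ephemeral port: no member matches, so in this model the node
# never stands for election and never steps up.
on_ephemeral_port = find_self_index(config, "localhost:27999")
```

A test for this ticket would assert the second case on a real node: no step up, and therefore no new no-op in the oplog.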

 

Text below taken from comment thread with Siyuan.

 
Some other ideas:
1) Update its host:port so that it cannot talk to itself. The node will be in REMOVED state after recovery.
2) Update its config to include another, non-existent node, so it will run for election indefinitely but never succeed.
3) Make its priority 0. Not sure if this works, but it is the most elegant way without significant change.
4) Set a very high minValid that it cannot reach in recovery, so that this node will stay in RECOVERING state.
5) Add a new command to truncate the oplog in standalone mode, but backward compatibility will be an issue.
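Ideas 2 and 3 amount to small edits of the config document. The sketch below is only an illustration of those edits and of the majority arithmetic behind idea 2; the helper names and host values are hypothetical, and it does not settle whether either idea works in practice.

```python
import copy


def voting_members(config):
    """Count members that vote (votes defaults to 1 when the field is absent)."""
    return sum(1 for m in config["members"] if m.get("votes", 1) > 0)


def majority(voters):
    """Votes needed to win an election among `voters` voting members."""
    return voters // 2 + 1


def with_phantom_member(config, phantom_host):
    """Idea 2: add a member that does not exist, so a majority is unreachable."""
    new_config = copy.deepcopy(config)
    next_id = max(m["_id"] for m in new_config["members"]) + 1
    new_config["members"].append({"_id": next_id, "host": phantom_host})
    new_config["version"] += 1
    return new_config


def with_priority_zero(config):
    """Idea 3: set every member's priority to 0."""
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        member["priority"] = 0
    new_config["version"] += 1
    return new_config


# Hypothetical single-node config.
config = {
    "_id": "rs0",
    "version": 1,
    "members": [{"_id": 0, "host": "localhost:27017"}],
}

phantom = with_phantom_member(config, "nonexistent.invalid:27017")
needed = majority(voting_members(phantom))  # 2 votes needed, only 1 node is alive
zeroed = with_priority_zero(config)
```

With one live node and two voting members, the live node can only ever collect its own vote, which is below the majority of two.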



 Comments   
Comment by Judah Schvimer [ 10/Dec/19 ]

We expect this ticket to eventually be closed as a duplicate of SERVER-42200. I'm leaving this open and we'll double check that this is all covered when that is complete.

Comment by Daniel Gottlieb (Inactive) [ 30/Oct/19 ]

If the end goal is to lock in the temporary port behavior, that test should also confirm that any existing oplogTruncateAfterPoint document is processed.

Comment by Carl Champain (Inactive) [ 30/Oct/19 ]

Hi steven.connors,

Passing this ticket along to the Replication team.
