[SERVER-13355] Error if replica set member is started standalone without special flag Created: 26/Mar/14  Updated: 06/Dec/22  Resolved: 05/Oct/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Aaron Westendorf Assignee: Backlog - Replication Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-3747 Maintenance mode writability for repl... Backlog
Assigned Teams:
Replication
Participants:

 Description   

Require a config option called "replication.maintenance", or a command line argument called "maintenance", to be used in place of "replSet[Name]" when starting a replica set member standalone. If neither a maintenance option nor "replSet[Name]" is supplied, the member fails to start with the following error:

18806 Cannot start because replication has been configured but is not currently enabled. Please enable replication and restart
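
As a rough illustration of the proposed check (not mongod's actual startup code), the decision could be sketched as below. The function name, parameters, return values, and exception class are hypothetical; only the error code and message come from this ticket. The sketch also assumes that "replSet[Name]" takes precedence if both options are given, which the ticket does not specify.

```python
from typing import Optional

# Error code and message quoted from the ticket description.
ERROR_CODE = 18806
ERROR_MESSAGE = (
    "Cannot start because replication has been configured but is not "
    "currently enabled. Please enable replication and restart"
)


class StartupRefused(Exception):
    """Hypothetical exception carrying the error code from the ticket."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code} {message}")
        self.code = code


def check_startup(has_stored_replset_config: bool,
                  repl_set_name: Optional[str],
                  maintenance: bool) -> str:
    """Return the mode to start in, or raise StartupRefused.

    has_stored_replset_config: whether a replica set configuration is
        already stored in the local database (hypothetical input).
    repl_set_name: value of the replSet[Name] option, if any.
    maintenance: whether the proposed maintenance option was supplied.
    """
    if repl_set_name:
        # Assumption: replSet[Name] wins if both options are present.
        return "replica-set-member"
    if not has_stored_replset_config:
        # Never part of a replica set, so plain standalone is safe.
        return "standalone"
    if maintenance:
        # Operator explicitly asked to run a set member standalone.
        return "maintenance"
    raise StartupRefused(ERROR_CODE, ERROR_MESSAGE)
```

Under these assumptions, `check_startup(True, None, False)` raises `StartupRefused`, while `check_startup(True, None, True)` returns `"maintenance"`.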

Orig Request
Replica set configurations are stored in the local database, but are honored only if the --replSet command line argument is supplied. In an age of automated configuration management, this means that any bug, incident, human error or cosmic ray can cause a mongod restart that takes the node out of the replica set while still allowing traffic, resulting in data partitioning. Worse still, hours, days or weeks may separate the time at which the error was written to the mongod configuration and the time at which the process restarts. As even a few seconds in this state is disastrous, this behavior must be changed.

Any of the following behaviors would be an improvement:

Refuse to start
Without a special command line argument, mongod simply refuses to start if the configuration file or command line arguments differ from the local database. As this may cause problems when trying to perform maintenance that requires mongod to be running, add a --standalone command line argument that will let the process start, but not allow any connections aside from localhost. It is far better to have a dead mongod and let standard replica set algorithms handle failover than to blindly partition the data set.

Honor the local database
Log a warning but always honor the replica set configuration. As not only the set but the hosts are configured in the local database, it is superfluous that the replica set configuration is also partially determined through configuration files or command line arguments, and absurd that said sources trump the local database.

Start in a different state
There already exist states for "not PRIMARY or SECONDARY", and these ensure that no one can use a replica set member until it is ready. Add an "INVALIDCONFIG" state so that mongod starts, but rejects replication and client connections until the configuration is fixed.
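
A minimal sketch of how such a state might gate traffic is shown below. INVALIDCONFIG is the state proposed in this request and does not exist in MongoDB; the enum, both function names, and the comparison of set names are assumptions made for illustration only.

```python
from enum import Enum
from typing import Optional


class MemberState(Enum):
    # A subset of existing replica set member states.
    STARTUP = "STARTUP"
    PRIMARY = "PRIMARY"
    SECONDARY = "SECONDARY"
    RECOVERING = "RECOVERING"
    # Proposed in this ticket; not a real MongoDB state.
    INVALIDCONFIG = "INVALIDCONFIG"


def initial_state(stored_set_name: Optional[str],
                  configured_set_name: Optional[str]) -> Optional[MemberState]:
    """Pick the state to start in (hypothetical helper).

    stored_set_name: set name found in the local database, if any.
    configured_set_name: set name from the config file or command line.
    Returns None for a node that is not part of any replica set.
    """
    if stored_set_name is None and configured_set_name is None:
        return None  # plain standalone, no member state applies
    if stored_set_name is not None and stored_set_name != configured_set_name:
        # Stored config disagrees with how the process was started.
        return MemberState.INVALIDCONFIG
    return MemberState.STARTUP


def accepts_client_traffic(state: MemberState) -> bool:
    """Only PRIMARY and SECONDARY members serve clients in this sketch."""
    return state in (MemberState.PRIMARY, MemberState.SECONDARY)
```

With this shape, a member restarted without its set name lands in INVALIDCONFIG and is treated like any other not-ready member, so the remaining members can handle failover as usual.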

Anything Else
Any other change that makes it harder to partition a replica set would be acceptable. It is essential that a replica set not be partitioned; forcibly partitioning the data should require extraordinary measures.



 Comments   
Comment by Gregory McKeon (Inactive) [ 05/Oct/18 ]

We won't fix this due to how backwards-incompatible it is.
