[SERVER-64858] Unable to remove unreachable members from the replica set. Created: 24/Mar/22  Updated: 16/May/22  Resolved: 16/May/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Aasawari Sahasrabuddhe Assignee: Eric Sedor
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:
  1. Log in to any MongoDB pod in a Kubernetes environment:
    kubectl exec -it <pod-name> --namespace mongo
  2. Create a replica set with a primary member.
  3. Add two unreachable nodes:
    rs.add("mongo-1-mongo:27017")
    rs.add("mongo-2-mongo:27017")

[
    {
        "_id" : 7,
        "name" : "mongo-1-mongo:27017",
        "health" : 0,
        "state" : 8,
        "stateStr" : "(not reachable/healthy)",
        "uptime" : 0,
        "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
        "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
        "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
        "lastAppliedWallTime" : ISODate("1970-01-01T00:00:00Z"),
        "lastDurableWallTime" : ISODate("1970-01-01T00:00:00Z"),
        "lastHeartbeat" : ISODate("2022-03-24T10:04:04.439Z"),
        "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
        "pingMs" : NumberLong(0),
        "lastHeartbeatMessage" : "Error connecting to mongo-1-mongo:27017 :: caused by :: Could not find address for mongo-1-mongo:27017: SocketException: Host not found (authoritative)",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "infoMessage" : "",
        "configVersion" : -1,
        "configTerm" : -1
    },
    {
        "_id" : 8,
        "name" : "mongo-2-mongo:27017",
        "health" : 0,
        "state" : 8,
        "stateStr" : "(not reachable/healthy)",
        "uptime" : 0,
        "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
        "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
        "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
        "lastAppliedWallTime" : ISODate("1970-01-01T00:00:00Z"),
        "lastDurableWallTime" : ISODate("1970-01-01T00:00:00Z"),
        "lastHeartbeat" : ISODate("2022-03-24T10:04:04.433Z"),
        "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
        "pingMs" : NumberLong(0),
        "lastHeartbeatMessage" : "Error connecting to mongo-2-mongo:27017 :: caused by :: Could not find address for mongo-2-mongo:27017: SocketException: Host not found (authoritative)",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "infoMessage" : "",
        "configVersion" : -1,
        "configTerm" : -1
    }
],

  4. With both nodes unreachable, try to remove them:
    rs.remove("mongo-1-mongo:27017")
    rs.remove("mongo-2-mongo:27017")

 
rs0:PRIMARY> rs.remove("mongo-2-mongo:27017")
{
    "ok" : 0,
    "errmsg" : "Cannot provide newlyAdded field to member config during reconfig.",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig",
    "$clusterTime" : {
        "clusterTime" : Timestamp(1648116296, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1648116296, 1)
}

rs0:PRIMARY> rs.remove("mongo-1-mongo:27017")
{
    "ok" : 0,
    "errmsg" : "Cannot provide newlyAdded field to member config during reconfig.",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig",
    "$clusterTime" : {
        "clusterTime" : Timestamp(1648116306, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1648116306, 1)
}
 
rs0:PRIMARY>

Participants:

 Description   

A replica set was deployed in a Kubernetes environment. Two unreachable nodes were added to it successfully, but rs.remove() fails to remove them when run from the mongo shell inside the pod.

 

 

conf = {
    _id: "replset",
    members: [
        {_id: 0, host: "localhost:27017"}
    ]
}
rs.initiate(conf)

// Adding and removing one nonexistent secondary should work
assert.commandWorked(rs.add("localhost:28018"))
assert.commandWorked(rs.remove("localhost:28018"))

// Adding and removing two nonexistent secondaries does not work
assert.commandWorked(rs.add("localhost:28018"))
assert.commandWorked(rs.add("localhost:29019"))
assert.commandWorked(rs.remove("localhost:28018"))
assert.commandWorked(rs.remove("localhost:29019"))

Note: This does not depend on the Kubernetes environment. The issue seems to affect only the mongo shell, not mongosh.

I tested this on MongoDB v5.0.7.
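
To confirm which members the primary currently reports as unreachable before attempting the removal, the rs.status() output can be filtered on the health field. The following is a minimal sketch only: unreachableMembers is a hypothetical helper name, and it assumes a status document shaped like the rs.status() output shown in the steps to reproduce.

```javascript
// Hypothetical helper (not part of the shell): list the names of members
// that a status document reports with health 0.
function unreachableMembers(status) {
    return status.members
        .filter(function (m) { return m.health === 0; })
        .map(function (m) { return m.name; });
}

// Assumed usage in the shell, connected to the primary:
//   printjson(unreachableMembers(rs.status()))
```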



 Comments   
Comment by Eric Sedor [ 16/May/22 ]

Thanks aasawari.sahasrabuddhe@mongodb.com. Since the mongo shell is deprecated and this works in mongosh, I'm going to close out this ticket. We really recommend that users move to mongosh wherever possible.

Comment by Aasawari Sahasrabuddhe [ 16/May/22 ]

Hi Eric,

Apologies for the delay. I have uploaded the requested files.

The nodes are unreachable because I created a new replica set and added two nodes that do not exist.

Please note that this seems to affect only the mongo shell and does not seem to affect mongosh.

Comment by Eric Sedor [ 11/May/22 ]

Hi aasawari.sahasrabuddhe@mongodb.com,

I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time.

Can you perform a reproduction of this and then, for each node in the replica set, would you please archive (tar or zip) and upload to that link:

  • the mongod logs
  • the $dbpath/diagnostic.data directory (the contents are described here)

Can you also clarify how the nodes have been made to be "unreachable"?

Comment by Aasawari Sahasrabuddhe [ 24/Mar/22 ]

rs.remove() does work when logged in to the pod using mongosh
kubectl exec -it -n namespace mongosh
 but doesn't seem to work in the mongo shell.
Secondly, in the mongo shell, splicing the members out of the config and calling rs.reconfig() worked in my case.
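
The splice and rs.reconfig() workaround mentioned above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact commands used in this ticket: withoutHosts is a hypothetical helper name, the config document is assumed to come from rs.conf(), and the result is assumed to be applied with rs.reconfig() on the primary.

```javascript
// Hypothetical helper (not part of the shell): return a copy of a replica
// set config document with the listed hosts dropped from its members array.
// The input document is left unmodified.
function withoutHosts(cfg, hosts) {
    const drop = new Set(hosts);
    return Object.assign({}, cfg, {
        members: cfg.members.filter(function (m) { return !drop.has(m.host); })
    });
}

// Assumed usage in the mongo shell, connected to the primary:
//   rs.reconfig(withoutHosts(rs.conf(), ["mongo-1-mongo:27017"]))
//   rs.reconfig(withoutHosts(rs.conf(), ["mongo-2-mongo:27017"]))
```

The usage removes one host per call because MongoDB 4.4 and later allow a non-forced reconfig to add or remove at most one voting member at a time.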

Generated at Thu Feb 08 06:01:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.