[SERVER-3954] remove and add of replication node gives errors Created: 26/Sep/11  Updated: 29/Feb/12  Resolved: 16/Dec/11

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 1.8.1
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: H Kaur Assignee: Kristina Chodorow (Inactive)
Resolution: Incomplete Votes: 0
Labels: replicaset
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

db version v1.8.1, pdfile version 4.5
sys info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
uptime: 7088271 seconds

New DBA doing testing on TEST


Participants:

 Description   

While testing the removal and re-adding of nodes in a replica set, I get the following errors:

Set name: cheggtest
Majority up: yes
Member | id | Up | cctime | Last heartbeat | Votes | Priority | State | Messages | optime | skew
mongo-db01.test.cloud.cheggnet.com:27017 (me) | 0 | 1 | 2e+03 hrs | | 1 | 1 | PRIMARY | | 4e80dd7a:1 |
mongo-db02.test.cloud.cheggnet.com | 1 | 1 | 26 mins | 1 sec ago | 1 | 1 | SECONDARY | | 4e80dd7a:1 | 1
mongo-db03.test.cloud.cheggnet.com | 2 | 1 | 26 mins | 1 sec ago | 1 | 1 | SECONDARY | | 4e80dd7a:1 | 1
mongo-db01.test.cloud.cheggnet.com:27117 | 5 | 0 | | 1 sec ago | 1 | 1 | | still initializing | 0:0 |

Removed and re-added mongo-db01.test.cloud.cheggnet.com:27117. Status: "still initializing" for the past few hours.
{
"_id" : 5,
"name" : "mongo-db01.test.cloud.cheggnet.com:27117",
"health" : 0,
"state" : 6,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : { "t" : 0, "i" : 0 },
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2011-09-26T19:24:45Z"),
"errmsg" : "still initializing"
}

Removed and re-added mongo-db02.test.cloud.cheggnet.com:27217. Status: received the following error.
cheggtest:PRIMARY> rs.add("mongo-db02.test.cloud.cheggnet.com:27217");
{
"assertion" : "need most members up to reconfigure, not ok : mongo-db02.test.cloud.cheggnet.com:27217",
"assertionCode" : 13144,
"errmsg" : "db assertion failure",
"ok" : 0
}
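
The assertion above names the member that could not be reached during reconfiguration. As a rough pre-check, member documents shaped like the one shown earlier (with `name`, `health`, and `stateStr`, as returned by `rs.status()`) can be scanned for unhealthy entries before calling `rs.add()`. The sketch below is plain JavaScript. `unhealthyMembers` and `majorityUp` are hypothetical helper names, not part of the mongo shell, and the simple head-count majority is an assumption about what "most members up" means, not the server's actual check.

```javascript
// Hypothetical helpers (not part of the mongo shell API).
// `status` is assumed to look like an rs.status() result:
// { members: [ { name: "...", health: 0 or 1, stateStr: "..." }, ... ] }

// Return the names of members that are not reported healthy.
// Assumption: the entry flagged `self: true` is treated as up.
function unhealthyMembers(status) {
  return status.members
    .filter(function (m) { return !(m.self === true || m.health === 1); })
    .map(function (m) { return m.name; });
}

// Assumption: "majority" is a plain head count and ignores per-member votes.
function majorityUp(status) {
  var total = status.members.length;
  var up = total - unhealthyMembers(status).length;
  return up > total / 2;
}

// Sample data modeled on the status output in this ticket.
var sample = {
  members: [
    { name: "mongo-db01.test.cloud.cheggnet.com:27017", health: 1, stateStr: "PRIMARY", self: true },
    { name: "mongo-db02.test.cloud.cheggnet.com", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-db03.test.cloud.cheggnet.com", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-db01.test.cloud.cheggnet.com:27117", health: 0, stateStr: "(not reachable/healthy)" }
  ]
};

console.log(unhealthyMembers(sample));
console.log(majorityUp(sample));
```

In the mongo shell, the same functions could be applied to the result of `rs.status()` instead of the sample object.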



 Comments   
Comment by H Kaur [ 16/Dec/11 ]

This issue was resolved. I have a new issue in production and am opening a new discussion in the Google user group.

Comment by Kristina Chodorow (Inactive) [ 16/Dec/11 ]

Can you attach the log from the primary and the member you were trying to add?

Comment by H Kaur [ 16/Dec/11 ]

Kristina,

The issue was resolved on test. However, on production, I received the following error when I tried to add a member.

We have a 3-node replica set in East Amazon EC2.

When trying to add an additional node in West Amazon EC2, I keep getting the following error. I confirmed that ping from the primary to the target server works.

prod:PRIMARY> rs.add("mongo-dbbk.prod2.cloud.domain.com")
{
"assertion" : "can't find self in new replset config",
"assertionCode" : 13433,
"errmsg" : "db assertion failure",
"ok" : 0
}
prod:PRIMARY> rs.add("mongo-dbbk.prod2.cloud.domain.com:27017")
{
"assertion" : "can't find self in new replset config",
"assertionCode" : 13433,
"errmsg" : "db assertion failure",
"ok" : 0
}

Thanks,
Harpreet
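
The 13433 assertion above says that the node applying the new configuration could not identify itself among the configured hosts. One thing that can be checked by hand is whether the primary's own `host:port` string appears in the member list once the default port is filled in. The sketch below is plain JavaScript. `normalizeHost` and `findSelf` are hypothetical helpers, and plain string comparison is an assumption made for illustration: the server's own check may also involve hostname resolution, which this does not model.

```javascript
// Hypothetical helpers (not part of the mongo shell API).
var DEFAULT_PORT = 27017; // MongoDB's default mongod port

// Append the default port when the host string has none.
// Limitation: IPv6 literals are not handled.
function normalizeHost(host) {
  return host.indexOf(":") === -1 ? host + ":" + DEFAULT_PORT : host;
}

// Return the config member whose host matches `selfHost`, or null.
function findSelf(config, selfHost) {
  var target = normalizeHost(selfHost).toLowerCase();
  for (var i = 0; i < config.members.length; i++) {
    if (normalizeHost(config.members[i].host).toLowerCase() === target) {
      return config.members[i];
    }
  }
  return null;
}

// Sample config modeled on the replSetReconfig document logged
// for the test set in this ticket (not the production set).
var config = {
  _id: "cheggtest",
  version: 8,
  members: [
    { _id: 0, host: "mongo-db01.test.cloud.cheggnet.com:27017" },
    { _id: 1, host: "mongo-db02.test.cloud.cheggnet.com" },
    { _id: 2, host: "mongo-db03.test.cloud.cheggnet.com" },
    { _id: 5, host: "mongo-db01.test.cloud.cheggnet.com:27117" }
  ]
};

console.log(findSelf(config, "mongo-db02.test.cloud.cheggnet.com:27017"));
console.log(findSelf(config, "localhost"));
```

In the mongo shell, `config` could be replaced with the result of `rs.conf()`. A `null` result for the name the primary knows itself by would be one hint to look at hostnames before retrying `rs.add()`.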

Comment by Kristina Chodorow (Inactive) [ 21/Nov/11 ]

Can you send the log from the primary and the member you were trying to add?

Comment by H Kaur [ 26/Sep/11 ]

Here are the commands used to remove and add the 5th node. I am not sure why it is complaining "need most members up to reconfigure".

rs.remove("mongo-db02.test.cloud.cheggnet.com:27217")
Mon Sep 26 12:53:36 DBClientCursor::init call() failed
Mon Sep 26 12:53:36 query failed : admin.$cmd { replSetReconfig: { _id: "cheggtest", version: 8, members: [
{ _id: 0, host: "mongo-db01.test.cloud.cheggnet.com:27017" },
{ _id: 1, host: "mongo-db02.test.cloud.cheggnet.com" },
{ _id: 2, host: "mongo-db03.test.cloud.cheggnet.com" },
{ _id: 5, host: "mongo-db01.test.cloud.cheggnet.com:27117" }
] } } to: 127.0.0.1
Mon Sep 26 12:53:36 Error: error doing query: failed shell/collection.js:150
Mon Sep 26 12:53:36 trying reconnect to 127.0.0.1
Mon Sep 26 12:53:36 reconnect 127.0.0.1 ok
cheggtest:PRIMARY> rs.add("mongo-db02.test.cloud.cheggnet.com:27217");
{
"assertion" : "need most members up to reconfigure, not ok : mongo-db02.test.cloud.cheggnet.com:27217",
"assertionCode" : 13144,
"errmsg" : "db assertion failure",
"ok" : 0
}
