[SERVER-10776] All members in replica set went to STARTUP2 state and remain there. Created: 16/Sep/13  Updated: 10/Dec/14  Resolved: 17/Sep/13

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.1
Fix Version/s: None

Type: Improvement Priority: Critical - P2
Reporter: Krishnachaitanya Thummuru Assignee: Unassigned
Resolution: Done Votes: 0
Labels: replication
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux (CentOs 5.8)


Attachments: File Replica_set startup2 member logs.rar    
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Hi,
I am facing an issue with the members of a replica set I created in my environment.
The replica set has 5 members; one of them is an arbiter and two of the members have priority 2. Each member runs on its own Linux machine.
All members were running well before I shut them down. After I brought the members back up one by one, the arbiter came up properly in the ARBITER state, but the remaining members of the replica set stay in the STARTUP2 state.

How can I bring the members that are stuck in STARTUP2 back to running states such as PRIMARY and SECONDARY?

Could you please suggest?

Note: All members are running with --nojournal.
I have attached the mongod logs for your reference.

Thanks & Regards
Krishnachaitanya
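
For reference, the state of each member can be inspected from the mongo shell on any reachable member; a minimal sketch (host and port taken from the attached logs, otherwise illustrative):

    # Print each member's current replica set state (PRIMARY, SECONDARY, STARTUP2, ...)
    mongo --host sessionmgr01 --port 27717 --eval '
      rs.status().members.forEach(function (m) {
        print(m.name + " : " + m.stateStr);
      })'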



 Comments   
Comment by Krishnachaitanya Thummuru [ 18/Sep/13 ]

Thanks for your support!

Comment by Daniel Pasette (Inactive) [ 17/Sep/13 ]

If one member has data, the rest of the nodes will sync their data from it. If there is a majority of configured nodes up, then one will become primary and the rest will become secondaries.
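Both conditions can be checked from the mongo shell; a sketch under the assumption that at least one member is reachable (database names and sizes will of course vary):

    // Run on any reachable member: count the healthy members visible to it
    var s = rs.status();
    var healthy = s.members.filter(function (m) { return m.health === 1; });
    print("healthy: " + healthy.length + " of " + s.members.length + " configured members");

    // Run on each data-bearing member: list its databases to see whether it still holds data
    db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
      print(d.name + " : " + d.sizeOnDisk + " bytes on disk");
    });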

Comment by Krishnachaitanya Thummuru [ 17/Sep/13 ]

Hi Dan,

Thanks for the update; I think this information is helpful.

If any one of the members has sufficient data and is active, will the rest of the members come back to a normal state, or does the set also need at least one arbiter and a primary/secondary?

Please advise.

Comment by Daniel Pasette (Inactive) [ 17/Sep/13 ]

Looks like there are two things going on here.

1. Neither sessionmgr01 nor sessionmgr02 is able to reach sessionmgr04 or sessionmgr03

From sessionmgr01

Fri Sep 13 03:46:06.263 [rsHealthPoll] couldn't connect to sessionmgr04:27717: couldn't connect to server sessionmgr04:27717
Fri Sep 13 03:46:06.263 [rsHealthPoll] couldn't connect to sessionmgr03:27717: couldn't connect to server sessionmgr03:27717

From sessionmgr02

Fri Sep 13 03:50:02.988 [rsHealthPoll] couldn't connect to sessionmgr03:27717: couldn't connect to server sessionmgr03:27717
Fri Sep 13 03:50:02.988 [rsHealthPoll] couldn't connect to sessionmgr04:27717: couldn't connect to server sessionmgr04:27717

2. It appears that you have deleted all the data from all the regular replica set members, except the arbiter. You can see in the logs that they are allocating their local database at startup. Because the arbiter still has its local database with the replica set configuration intact, but no member has any data, no member can become primary.

The easiest thing for you to do would be to remove the data in the dbpath of all your members, including the arbiter, and re-initialize the set from scratch. You can follow the instructions listed here in the documentation: http://docs.mongodb.org/manual/tutorial/deploy-replica-set/#to-deploy-a-production-replica-set
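
Regarding point 1, before re-initializing it is worth confirming that each host can actually reach the others on port 27717; a sketch using the hostnames from the logs above (repeat from each host):

    # Can mongod on the peers be reached at all?
    mongo --host sessionmgr03 --port 27717 --eval 'db.runCommand({ ping: 1 })'
    mongo --host sessionmgr04 --port 27717 --eval 'db.runCommand({ ping: 1 })'
    # If the mongo shell cannot connect, check the network layer directly
    nc -zv sessionmgr03 27717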
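
A minimal sketch of that re-initialization, assuming a dbpath of /data/db and the member hostnames above (the arbiter's hostname is illustrative; adjust ports, paths, and startup options such as --nojournal to match your deployment):

    # On every member, arbiter included: stop mongod, then clear its dbpath
    rm -rf /data/db/*
    # Restart each mongod with its usual --replSet option, then connect to one member:
    mongo --host sessionmgr01 --port 27717

    // In the shell: initiate the set and add the remaining members
    rs.initiate()
    rs.add("sessionmgr02:27717")
    rs.add("sessionmgr03:27717")
    rs.add("sessionmgr04:27717")
    rs.addArb("arbiterhost:27717")
    // The two priority-2 members from the original setup can then be restored with:
    // var cfg = rs.conf(); cfg.members[n].priority = 2; rs.reconfig(cfg)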
