[SERVER-28858] Automatic status change of the replica Created: 19/Apr/17  Updated: 19/Apr/17  Resolved: 19/Apr/17

Status: Closed
Project: Core Server
Component/s: Networking, Replication
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Arrate [X] Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Hello,

we have a MongoDB replica set running on three nodes, and we see the following messages in the error logs. This is one example; the messages appear at the same time on the different nodes.

Mar 27 10:31:49

NODE 1
Mar 27 10:31:49 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet info ulpmon03.osasunet:27017 is down (or slow to respond):
Mar 27 10:31:49 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state DOWN
Mar 27 10:31:53 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is up
Mar 27 10:31:53 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state SECONDARY

NODE 2
Mar 27 10:31:43 ulpmon02 mongod.27017[1438]: [rsHealthPoll] DBClientCursor::init call() failed
Mar 27 10:31:43 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet info ulpmon03.osasunet:27017 is down (or slow to respond):
Mar 27 10:31:43 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state DOWN
Mar 27 10:31:50 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replset info ulpmon03.osasunet:27017 heartbeat failed, retrying
Mar 27 10:31:53 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is up
Mar 27 10:31:53 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state SECONDARY

NODE 3
Mar 27 10:31:53 ulpmon03 mongod.27017[1442]: [rsHealthPoll] replset info ulpmon01.osasunet:27017 thinks that we are down
Mar 27 10:31:53 ulpmon03 mongod.27017[1442]: [rsHealthPoll] replset info ulpmon02.osasunet:27017 thinks that we are down

Regards,

Arrate



 Comments   
Comment by Kelsey Schubert [ 19/Apr/17 ]

Hi Arrate,

Thanks for your report. This behavior suggests that some sort of network failure occurred; in response, MongoDB executed a failover to maintain availability.

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion, please post on the mongodb-user group or on Stack Overflow with the mongodb tag. A question like this, which involves more discussion, would be best posted on the mongodb-user group. Additionally, I would recommend upgrading to a more recent version, as MongoDB 2.6 has reached end of life.

Kind regards,
Thomas

Comment by Arrate [X] [ 19/Apr/17 ]

We have these errors too:

NODE 1
Apr 2 00:11:29 ulpmon01 mongod.27017[1464]: [conn491389] Unauthorized not authorized on admin to execute command { replSetHeartbeat: "movdb_rs", v: 3, pv: 1, checkEmpty: false, from: "ulpmon02.osasunet:27017", fromId: 1 }

NODE 2
Apr 2 00:11:24 ulpmon02 mongod.27017[1438]: [rsHealthPoll] couldn't connect to ulpmon01.osasunet:27017: couldn't connect to server ulpmon01.osasunet:27017 (10.70.10.123), connection attempt failed
Apr 2 00:11:29 ulpmon02 mongod.27017[1438]: [rsHealthPoll] couldn't connect to ulpmon01.osasunet:27017: couldn't connect to server ulpmon01.osasunet:27017 (10.70.10.123) failed, connection attempt failed
Apr 2 00:11:29 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replset info ulpmon01.osasunet:27017 heartbeat failed, retrying

NODE 3 - There are no errors.
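
To compare how long each node considered a peer unreachable, the "is now in state" lines from the description can be summarized with a small script. The following is only a minimal sketch, not an official MongoDB tool: the function names `parse_transitions` and `down_intervals` are hypothetical, the sample lines are copied from the NODE 1 and NODE 2 excerpts above, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Sample lines copied from the NODE 1 and NODE 2 excerpts in the description.
LOG_LINES = [
    "Mar 27 10:31:49 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state DOWN",
    "Mar 27 10:31:53 ulpmon01 mongod.27017[1464]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state SECONDARY",
    "Mar 27 10:31:43 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state DOWN",
    "Mar 27 10:31:53 ulpmon02 mongod.27017[1438]: [rsHealthPoll] replSet member ulpmon03.osasunet:27017 is now in state SECONDARY",
]

# Matches only the state-transition lines; other rsHealthPoll messages are skipped.
STATE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<observer>\S+) .*"
    r"replSet member (?P<member>\S+) is now in state (?P<state>\w+)$"
)


def parse_transitions(lines, year=2017):
    """Return (timestamp, observer, member, state) tuples.

    The year is an assumption: syslog timestamps omit it.
    """
    transitions = []
    for line in lines:
        match = STATE_RE.match(line)
        if match is None:
            continue
        ts = datetime.strptime(f"{year} {match['ts']}", "%Y %b %d %H:%M:%S")
        transitions.append((ts, match["observer"], match["member"], match["state"]))
    return transitions


def down_intervals(transitions):
    """Pair each DOWN with the next non-DOWN state seen by the same observer."""
    down_since = {}
    intervals = []
    for ts, observer, member, state in sorted(transitions):
        key = (observer, member)
        if state == "DOWN":
            down_since.setdefault(key, ts)
        elif key in down_since:
            start = down_since.pop(key)
            intervals.append((observer, member, (ts - start).total_seconds()))
    return intervals


if __name__ == "__main__":
    for observer, member, seconds in down_intervals(parse_transitions(LOG_LINES)):
        print(f"{observer} saw {member} DOWN for {seconds:.0f} s")
```

For the sample lines, ulpmon01 reports ulpmon03 as DOWN for 4 seconds and ulpmon02 reports it as DOWN for 10 seconds. Both observers see the member return at 10:31:53, which fits a short interruption affecting ulpmon03 rather than a problem local to a single observer.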
