-
Type: Bug
-
Resolution: Done
-
Priority: Critical - P2
-
None
-
Affects Version/s: 2.4.6
-
Component/s: Replication
-
None
-
Environment: CentOS Linux
-
We have a three-member replica set (primary, secondary, and arbiter) running on separate virtual machines.
To reproduce the issue, we power off the primary DB. We have observed that the failover time is more than 10 seconds.
What is the optimal failover time? Is there a tuning parameter to reduce it?
Thu Dec 19 02:15:55.994 [rsSyncNotifier] replset setting oplog notifier to sessionmgr01:27717
Thu Dec 19 02:19:09.121 [rsHealthPoll] DBClientCursor::init call() failed
Thu Dec 19 02:19:09.121 [rsHealthPoll] replSet info sessionmgr01:27717 is down (or slow to respond):
Thu Dec 19 02:19:09.121 [rsHealthPoll] replSet member sessionmgr01:27717 is now in state DOWN
Thu Dec 19 02:19:09.122 [rsMgr] replSet info electSelf 2
Thu Dec 19 02:19:17.124 [rsHealthPoll] replset info sessionmgr01:27717 heartbeat failed, retrying
Thu Dec 19 02:19:28.071 [rsBackgroundSync] Socket recv() timeout 192.168.92.59:27717
Thu Dec 19 02:19:28.071 [rsBackgroundSync] SocketException: remote: 192.168.92.59:27717 error: 9001 socket exception [RECV_TIMEOUT] server [192.168.92.59:27717]
Thu Dec 19 02:19:28.072 [rsBackgroundSync] replSet sync source problem: 10278 dbclient error communicating with server: sessionmgr01:27717
Thu Dec 19 02:19:28.072 [rsSyncNotifier] Socket recv() timeout 192.168.92.59:27717
Thu Dec 19 02:19:28.072 [rsSyncNotifier] SocketException: remote: 192.168.92.59:27717 error: 9001 socket exception [RECV_TIMEOUT] server [192.168.92.59:27717]
Thu Dec 19 02:19:28.072 [rsSyncNotifier] replset tracking exception: exception: 10278 dbclient error communicating with server: sessionmgr01:27717
Thu Dec 19 02:19:28.072 [rsMgr] replSet PRIMARY
Thu Dec 19 02:19:29.127 [rsHealthPoll] replset info sessionmgr01:27717 heartbeat failed, retrying
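To put a number on the failover time seen in the log above, the gap between the "is now in state DOWN" line and the "replSet PRIMARY" line can be computed directly. The following is a minimal Python sketch, not part of any MongoDB tooling: the helper names `event_time` and `failover_seconds` are hypothetical, it assumes the timestamp layout shown in this log, and it assumes both events fall on the same day. It measures only from the moment the member detected the primary as down, so the real outage (from power-off) is longer.

```python
from datetime import datetime


def event_time(line: str) -> datetime:
    # Assumed layout: "Thu Dec 19 02:19:09.121 [thread] message".
    # Field 3 is the time of day; the date is ignored (same-day assumption).
    return datetime.strptime(line.split()[3], "%H:%M:%S.%f")


def failover_seconds(lines):
    """Seconds between the first DOWN detection and the next PRIMARY transition."""
    down = primary = None
    for line in lines:
        if down is None and "is now in state DOWN" in line:
            down = event_time(line)
        elif down is not None and line.rstrip().endswith("replSet PRIMARY"):
            primary = event_time(line)
            break
    if down is None or primary is None:
        return None  # one of the two events is missing from the input
    return (primary - down).total_seconds()


# The two relevant lines, copied verbatim from the log in this report.
LOG = [
    "Thu Dec 19 02:19:09.121 [rsHealthPoll] replSet member sessionmgr01:27717 is now in state DOWN",
    "Thu Dec 19 02:19:28.072 [rsMgr] replSet PRIMARY",
]

if __name__ == "__main__":
    print(failover_seconds(LOG))
```

For this log the result is roughly 19 seconds between detection and the new primary, which matches the "more than 10 secs" observation.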
- duplicates SERVER-10225 Replica set failover speed improvement (Closed)