|
Hi Thomas.
1. Would you please upload the complete logs for the primary and secondary? Please be sure to include when this issue occurred, and the subsequent startup logs of the secondary.
The secondary log is in the description.
The primary log is:
2016-08-08T16:43:55.606+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40147 #4934 (51 connections now open)
2016-08-08T16:43:55.607+0900 I NETWORK [conn4934] end connection 127.0.0.1:40147 (50 connections now open)
2016-08-08T16:43:57.915+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40148 #4935 (51 connections now open)
2016-08-08T16:43:57.916+0900 I NETWORK [conn4935] end connection 127.0.0.1:40148 (50 connections now open)
2016-08-08T16:43:58.223+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40149 #4936 (51 connections now open)
2016-08-08T16:43:58.224+0900 I NETWORK [conn4936] end connection 127.0.0.1:40149 (50 connections now open)
2016-08-08T16:43:58.225+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40150 #4937 (51 connections now open)
2016-08-08T16:43:58.225+0900 I NETWORK [conn4937] end connection 127.0.0.1:40150 (50 connections now open)
2016-08-08T16:43:58.237+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40151 #4938 (51 connections now open)
2016-08-08T16:43:58.237+0900 I NETWORK [conn4938] end connection 127.0.0.1:40151 (50 connections now open)
2016-08-08T16:43:58.820+0900 I NETWORK [conn4889] end connection 127.0.0.1:40099 (49 connections now open)
2016-08-08T16:43:58.826+0900 I NETWORK [conn4884] end connection 127.0.0.1:40094 (48 connections now open)
2016-08-08T16:43:58.840+0900 I NETWORK [conn4919] end connection 127.0.0.1:40131 (47 connections now open)
2016-08-08T16:43:59.852+0900 I REPL [ReplicationExecutor] Member APSEO-SHPS-01:42002 is now in state SECONDARY
2016-08-08T16:44:02.402+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40152 #4939 (48 connections now open)
2016-08-08T16:44:02.402+0900 I NETWORK [conn4939] end connection 127.0.0.1:40152 (47 connections now open)
2016-08-08T16:44:02.403+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40153 #4940 (48 connections now open)
2016-08-08T16:44:23.424+0900 I SHARDING [LockPinger] cluster APSEO-SHPS-01:41001,APSEO-SHPS-01:41002,APSEO-SHPS-01:41003 pinged successfully at 2016-08-08T16:44:23.260+0900 by distributed lock pinger 'APSEO-SHPS-01:41001,APSEO-SHPS-01:41002,APSEO-SHPS-01:41003/APSEO-SHPS-01:42001:1467691717:561486104', sleeping for 30000ms
2016-08-08T16:44:25.400+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40154 #4941 (49 connections now open)
2016-08-08T16:44:35.514+0900 I NETWORK [initandlisten] connection accepted from 127.0.0.1:40165 #4942 (50 connections now open)
2016-08-08T16:44:53.542+0900 I NETWORK [ReplicaSetMonitorWatcher] Detected bad connection created at 1470641453420515 microSec, clearing pool for APSEO-SHPS-01:42002 of 0 connections
2016-08-08T16:44:53.588+0900 I SHARDING [LockPinger] cluster APSEO-SHPS-01:41001,APSEO-SHPS-01:41002,APSEO-SHPS-01:41003 pinged successfully at 2016-08-08T16:44:53.424+0900 by distributed lock pinger 'APSEO-SHPS-01:41001,APSEO-SHPS-01:41002,APSEO-SHPS-01:41003/APSEO-SHPS-01:42001:1467691717:561486104', sleeping for 30000ms
2016-08-08T16:44:53.878+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: End of file
2016-08-08T16:44:54.119+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection reset by peer
2016-08-08T16:44:54.120+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:54.120+0900 I NETWORK [conn4940] end connection 127.0.0.1:40153 (49 connections now open)
2016-08-08T16:44:54.120+0900 I NETWORK [conn4942] end connection 127.0.0.1:40165 (48 connections now open)
2016-08-08T16:44:54.955+0900 I NETWORK [conn4920] end connection 127.0.0.1:40132 (47 connections now open)
2016-08-08T16:44:54.955+0900 I NETWORK [conn4883] end connection 127.0.0.1:40090 (47 connections now open)
2016-08-08T16:44:56.120+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:56.121+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:56.121+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:58.122+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:58.123+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:58.123+0900 I REPL [ReplicationExecutor] Error in heartbeat request to APSEO-SHPS-01:42002; HostUnreachable: Connection refused
2016-08-08T16:44:59.004+0900 I NETWORK [conn4879] end connection 127.0.0.1:39989 (45 connections now open)
2. Are you able to consistently reproduce this issue?
Yes, the issue lasted until it was corrected.
|
|
Hi jipark@ea.com,
Thanks for providing additional details. To continue investigating, please answer the following questions:
- Would you please upload the complete logs for the primary and secondary? Please be sure to include when this issue occurred, and the subsequent startup logs of the secondary.
- Are you able to consistently reproduce this issue?
Thank you,
Thomas
|