|
################################################
mongostat messages:
[root@younimsg-mongodb-89 ~]# mongostat --host=10.136.24.88:27032
connected to: 10.136.24.88:27032
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
0 0 1965 0 0 2 0 20g 20.5g 17.6g 0 95.5 0 1|63 1|64 432k 1k 69 younimsgset M 17:12:09
0 0 1954 0 0 3 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 430k 1k 69 younimsgset M 17:12:10
0 0 1979 0 0 2 0 20g 20.5g 17.6g 0 95.5 0 0|63 1|64 435k 1k 69 younimsgset M 17:12:11
0 0 1958 0 0 2 0 20g 20.5g 17.6g 0 95.7 0 0|63 1|64 430k 1k 69 younimsgset M 17:12:12
0 0 1970 0 0 2 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 433k 1k 69 younimsgset M 17:12:13
0 0 1963 0 0 2 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 432k 1k 69 younimsgset M 17:12:14
0 0 1970 0 0 3 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 433k 1k 69 younimsgset M 17:12:15
0 0 1969 0 0 2 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 432k 1k 69 younimsgset M 17:12:16
0 0 1962 0 0 2 0 20g 20.5g 17.6g 0 95.6 0 0|63 1|64 431k 1k 69 younimsgset M 17:12:17
################################################
master node messages:
PRIMARY> rs.status()
{
"set" : "younimsgset",
"date" : ISODate("2012-08-28T09:12:52Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "10.136.24.88:27032",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" :
{
"t" : 1346145172000,
"i" : 1230
}
,
"optimeDate" : ISODate("2012-08-28T09:12:52Z"),
"self" : true
},
{
"_id" : 2,
"name" : "10.136.24.89:37032",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 88389,
"optime" :
{
"t" : 0,
"i" : 0
}
,
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2012-08-28T09:12:51Z"),
"pingMs" : 0
},
{
"_id" : 3,
"name" : "10.136.24.89:27032",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 88301,
"optime" :
{
"t" : 1346139793000,
"i" : 970
}
,
"optimeDate" : ISODate("2012-08-28T07:43:13Z"),
"lastHeartbeat" : ISODate("2012-08-28T09:12:51Z"),
"pingMs" : 0
}
],
"ok" : 1
}
PRIMARY> db.printReplicationInfo()
configured oplog size: 10000MB
log length start to end: 23900secs (6.64hrs)
oplog first event time: Tue Aug 28 2012 10:34:40 GMT+0800 (CST)
oplog last event time: Tue Aug 28 2012 17:13:00 GMT+0800 (CST)
now: Tue Aug 28 2012 17:13:00 GMT+0800 (CST)
PRIMARY> db.printSlaveReplicationInfo()
source: 10.136.24.89:37032
syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
= 1346145186 secs ago (373929.22hrs)
source: 10.136.24.89:27032
syncedTo: Tue Aug 28 2012 15:43:13 GMT+0800 (CST)
= 5393 secs ago (1.5hrs)
################################################
secondary node messages:
SECONDARY> rs.status()
{
"set" : "younimsgset",
"date" : ISODate("2012-08-28T09:14:17Z"),
"myState" : 2,
"syncingTo" : "10.136.24.88:27032",
"members" : [
{
"_id" : 0,
"name" : "10.136.24.88:27032",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 88376,
"optime" :
{
"t" : 1346145255000,
"i" : 1245
}
,
"optimeDate" : ISODate("2012-08-28T09:14:15Z"),
"lastHeartbeat" : ISODate("2012-08-28T09:14:15Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "10.136.24.89:37032",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 88376,
"optime" :
{
"t" : 0,
"i" : 0
}
,
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2012-08-28T09:14:15Z"),
"pingMs" : 0
},
{
"_id" : 3,
"name" : "10.136.24.89:27032",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"optime" :
{
"t" : 1346139793000,
"i" : 970
}
,
"optimeDate" : ISODate("2012-08-28T07:43:13Z"),
"self" : true
}
],
"ok" : 1
}
SECONDARY> db.printReplicationInfo()
configured oplog size: 10000MB
log length start to end: 24108secs (6.7hrs)
oplog first event time: Tue Aug 28 2012 09:01:25 GMT+0800 (CST)
oplog last event time: Tue Aug 28 2012 15:43:13 GMT+0800 (CST)
now: Tue Aug 28 2012 17:14:20 GMT+0800 (CST)
SECONDARY> db.printSlaveReplicationInfo()
source: 10.136.24.89:37032
syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
= 1346145262 secs ago (373929.24hrs)
source: 10.136.24.89:27032
syncedTo: Tue Aug 28 2012 15:43:13 GMT+0800 (CST)
= 5469 secs ago (1.52hrs)
################################################
maseter node log:
Tue Aug 28 17:11:29 [initandlisten] connection accepted from 10.136.24.89:40425 #208968
Tue Aug 28 17:11:30 [conn204217] getmore local.oplog.rs query: { ts:
{ $gte: new Date(5781371733873262939) }
} cursorid:879668613180316522 nreturned:17974 re
slen:4194382 679383ms
Tue Aug 28 17:11:30 [conn204217] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.89:31540]
Tue Aug 28 17:11:47 [initandlisten] connection accepted from 10.136.24.89:40426 #208969
################################################
secondary node log:
Tue Aug 28 17:10:06 [conn6250] end connection 127.0.0.1:8040
Tue Aug 28 17:10:11 [rsSync] Socket recv() timeout 10.136.24.88:27032
Tue Aug 28 17:10:11 [rsSync] SocketException: remote: 10.136.24.88:27032 error: 9001 socket exception [3] server [10.136.24.88:27032]
Tue Aug 28 17:10:11 [rsSync] replSet syncThread: 10278 dbclient error communicating with server: 10.136.24.88:27032
Tue Aug 28 17:10:20 [conn6248] end connection 10.136.24.89:32296
################################################
Thanks for your reply! Please check the above messages that i posted. Any suggestions is welcome.
|
|
the "9001 socket exception" error is reporting one time per ten minutes on secondary node.
Mon Aug 27 13:40:40 [rsSync] SocketException: remote: 10.136.24.88:27032 error: 9001 socket exception [3] server [10.136.24.88:27032]
Mon Aug 27 13:40:40 [rsSync] DBClientCursor::init call() failed
Mon Aug 27 13:40:41 [rsSync] replSet syncing to: 10.136.24.88:27032
Mon Aug 27 13:41:01 [conn3980] end connection 10.136.24.88:15884
............
Mon Aug 27 13:50:41 [rsSync] SocketException: remote: 10.136.24.88:27032 error: 9001 socket exception [3] server [10.136.24.88:27032]
Mon Aug 27 13:50:41 [rsSync] DBClientCursor::init call() failed
Mon Aug 27 13:50:42 [rsSync] replSet syncing to: 10.136.24.88:27032
Mon Aug 27 13:51:01 [conn4020] end connection 10.136.24.88:15926
Master node error:
Mon Aug 27 15:00:04 [conn201151] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} nscanned:102 nreturned:101 reslen:23531 11376977ms
Mon Aug 27 15:00:04 [conn201151] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:21728]
Mon Aug 27 15:00:07 [conn201194] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} nscanned:102 nreturned:101 reslen:23531 10778593ms
Mon Aug 27 15:00:07 [conn201194] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:21769]
Mon Aug 27 15:00:07 [conn201237] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} nscanned:102 nreturned:101 reslen:23531 10177833ms
Mon Aug 27 15:00:07 [conn201237] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:21810]
Mon Aug 27 15:00:08 [conn201280] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} nscanned:102 nreturned:101 reslen:23531 9577827ms
Mon Aug 27 15:00:08 [conn201280] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:21851]
.......................
Mon Aug 27 15:56:32 [conn202100] Assertion: 15889:FindingStartCursor::recoverFromYield() failed to recover
Mon Aug 27 15:56:32 [conn202100] assertion 15889 FindingStartCursor::recoverFromYield() failed to recover ns:local.oplog.rs query:{ ts:
{ $gte: new Date(5781165171716130065) }
}
Mon Aug 27 15:56:32 [conn202100] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} exception: FindingStartCursor::recoverFromYield() failed to recover code:15889 reslen:102 1543311ms
Mon Aug 27 15:56:32 [conn202100] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:22645]
Mon Aug 27 15:56:36 [conn202211] Assertion: 15889:FindingStartCursor::recoverFromYield() failed to recover
Mon Aug 27 15:56:36 [conn202211] assertion 15889 FindingStartCursor::recoverFromYield() failed to recover ns:local.oplog.rs query:{ ts:
{ $gte: new Date(5781165171716130065) }
}
Mon Aug 27 15:56:36 [conn202211] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} exception: FindingStartCursor::recoverFromYield() failed to recover code:15889 reslen:102 10149ms
Mon Aug 27 15:56:36 [conn202211] end connection 10.136.24.100:22752
Mon Aug 27 15:56:36 [conn202210] end connection 10.136.24.89:27979
Mon Aug 27 15:56:36 [initandlisten] connection accepted from 10.136.24.89:27981 #202213
Mon Aug 27 15:56:41 [conn202057] Assertion: 15889:FindingStartCursor::recoverFromYield() failed to recover
Mon Aug 27 15:56:41 [conn202057] assertion 15889 FindingStartCursor::recoverFromYield() failed to recover ns:local.oplog.rs query:{ ts:
{ $gte: new Date(5781165171716130065) }
}
Mon Aug 27 15:56:41 [conn202057] query local.oplog.rs query: { ts:
{ $gte: new Date(5781165171716130065) }
} exception: FindingStartCursor::recoverFromYield() failed to recover code:15889 reslen:102 2153040ms
Mon Aug 27 15:56:41 [conn202057] SocketException handling request, closing client connection: 9001 socket exception [2] server [10.136.24.100:22602]
Mon Aug 27 15:56:47 [initandlisten] connection accepted from 10.136.24.100:22755 #202214
|