Details
Question
Resolution: Done
Trivial - P5
None
2.6.8
None
Description
I am using this check https://github.com/mzupan/nagios-plugin-mongodb/blob/master/check_mongodb.py to monitor replication lag between nodes in a replica set.
From time to time I observe large lags between nodes (> 1000 s).
I suspected a bug in the check, so I ran a second script that collects information from the replica set whenever the check fails. It shows this:
MongoDB shell version: 2.6.8
connecting to: test
{
    "set" : "cyclops",
    "date" : ISODate("2016-06-02T09:12:38Z"),
    "myState" : 2,
    "syncingTo" : "NODE11:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "NODE10:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251817,
            "optime" : Timestamp(1464858757, 28),
            "optimeDate" : ISODate("2016-06-02T09:12:37Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "NODE11:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 26096062,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1438762714, 1),
            "electionDate" : ISODate("2015-08-05T08:18:34Z")
        },
        {
            "_id" : 2,
            "name" : "NODE12:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251812,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 1,
            "syncingTo" : "NODE011:27017"
        }
    ],
    "ok" : 1
}
The node executing the check:
"optimeDate" : ISODate("2016-06-02T09:12:37Z"),
Primary node:
"optimeDate" : ISODate("2016-06-02T09:05:06Z"),
So the lag is -451 s: the secondary running the check reports an optime 451 s ahead of the primary's!
Both nodes are NTP-synced.
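
To make the arithmetic behind the -451 s figure explicit, here is a minimal Python sketch that subtracts the two optimeDate values quoted above. The helper name optime_lag_seconds is hypothetical and this is not the plugin's actual code; it only assumes that lag is computed as the primary's optime minus the member's optime.

```python
from datetime import datetime, timezone


def optime_lag_seconds(primary_optime: datetime, member_optime: datetime) -> float:
    """Primary optime minus member optime, in seconds.

    Hypothetical helper for illustration. A negative result means the
    member reports a newer optime than the primary does.
    """
    return (primary_optime - member_optime).total_seconds()


# optimeDate values copied from the status output above
primary = datetime(2016, 6, 2, 9, 5, 6, tzinfo=timezone.utc)      # NODE11 (PRIMARY)
secondary = datetime(2016, 6, 2, 9, 12, 37, tzinfo=timezone.utc)  # NODE10 (self)

lag = optime_lag_seconds(primary, secondary)
print(lag)  # -451.0
```

The same difference follows from the seconds part of the raw optime values: 1464858306 - 1464858757 = -451.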