- Type: Question
- Resolution: Done
- Priority: Trivial - P5
- Affects Version/s: 2.6.8
- Component/s: Replication
I am using this check https://github.com/mzupan/nagios-plugin-mongodb/blob/master/check_mongodb.py to monitor replication lag between nodes in a replica set.
From time to time I see large lags between nodes (> 1000 s).
I suspected a bug in the check, so I ran a second script that dumps the replica set status whenever the check fails. It shows this:
MongoDB shell version: 2.6.8
connecting to: test
{
    "set" : "cyclops",
    "date" : ISODate("2016-06-02T09:12:38Z"),
    "myState" : 2,
    "syncingTo" : "NODE11:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "NODE10:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251817,
            "optime" : Timestamp(1464858757, 28),
            "optimeDate" : ISODate("2016-06-02T09:12:37Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "NODE11:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 26096062,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1438762714, 1),
            "electionDate" : ISODate("2015-08-05T08:18:34Z")
        },
        {
            "_id" : 2,
            "name" : "NODE12:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251812,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 1,
            "syncingTo" : "NODE011:27017"
        }
    ],
    "ok" : 1
}
The node executing the check:
"optimeDate" : ISODate("2016-06-02T09:12:37Z"),
Primary node:
"optimeDate" : ISODate("2016-06-02T09:05:06Z"),
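So the computed lag (primary optime minus this node's optime) is -451 s: the secondary appears to be ahead of the primary!
Both nodes are NTP-synced.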
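To make the arithmetic reproducible, here is a minimal Python sketch that computes the lag of each secondary from a status document shaped like the output above. It is not the plugin's actual code: the function name `replication_lag_seconds` is hypothetical, and the sketch assumes `optimeDate` is available as a `datetime` (the values below are copied by hand from the output above, trimmed to the fields used).

```python
from datetime import datetime

def replication_lag_seconds(status):
    """Return {secondary name: primary optimeDate - member optimeDate, in seconds}.

    Hypothetical helper for illustration; a negative value means the
    secondary's optime is newer than the primary's optime in this document.
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }

# Status values copied from the output above (only the fields needed here).
status = {
    "set": "cyclops",
    "members": [
        {"name": "NODE10:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2016, 6, 2, 9, 12, 37)},
        {"name": "NODE11:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2016, 6, 2, 9, 5, 6)},
        {"name": "NODE12:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2016, 6, 2, 9, 12, 37).replace(hour=9, minute=5, second=6)},
    ],
}

lags = replication_lag_seconds(status)
print(lags)  # {'NODE10:27017': -451.0, 'NODE12:27017': 0.0}
```

The -451.0 for NODE10 matches the difference between the two `optime` timestamps as well (1464858306 - 1464858757 = -451).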