Type: Question
Resolution: Done
Priority: Trivial - P5
Affects Version/s: 2.6.8
Component/s: Replication
I am using this check, https://github.com/mzupan/nagios-plugin-mongodb/blob/master/check_mongodb.py, to monitor replication lag between the nodes of a replica set.
From time to time I have observed large lags between nodes (> 1000s).
I suspected a bug in the check, so I ran a second script that collects the replica set status whenever the check fails. It shows this:
MongoDB shell version: 2.6.8
connecting to: test
{
    "set" : "cyclops",
    "date" : ISODate("2016-06-02T09:12:38Z"),
    "myState" : 2,
    "syncingTo" : "NODE11:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "NODE10:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251817,
            "optime" : Timestamp(1464858757, 28),
            "optimeDate" : ISODate("2016-06-02T09:12:37Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "NODE11:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 26096062,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1438762714, 1),
            "electionDate" : ISODate("2015-08-05T08:18:34Z")
        },
        {
            "_id" : 2,
            "name" : "NODE12:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 26251812,
            "optime" : Timestamp(1464858306, 3),
            "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
            "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
            "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
            "pingMs" : 1,
            "syncingTo" : "NODE011:27017"
        }
    ],
    "ok" : 1
}
The node executing the check (NODE10, a secondary):
"optimeDate" : ISODate("2016-06-02T09:12:37Z"),
Primary node (NODE11):
"optimeDate" : ISODate("2016-06-02T09:05:06Z"),
So the lag is -451s: the secondary's optime is ahead of the primary's!
Both nodes are NTP-synced.
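
To make the arithmetic above reproducible, here is a minimal Python sketch that computes the lag of each secondary from the optimeDate values in the status output. It is not the logic used by check_mongodb.py; the function name replication_lag_seconds and the helper utc are hypothetical, and the input is a hand-built dict that copies only the fields needed from the output above.

```python
from datetime import datetime, timezone


def replication_lag_seconds(status):
    """Return {member name: seconds behind the primary} for each secondary.

    `status` is assumed to have the shape of the replica set status shown
    above. A negative value means the member's optimeDate is ahead of the
    primary's optimeDate.
    """
    primary = next(m for m in status["members"] if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in status["members"]
        if m["stateStr"] == "SECONDARY"
    }


def utc(hour, minute, second):
    # Hypothetical helper: all optimeDate values above fall on 2016-06-02 (UTC).
    return datetime(2016, 6, 2, hour, minute, second, tzinfo=timezone.utc)


# Only the fields needed for the calculation, copied from the output above.
status = {
    "members": [
        {"name": "NODE10:27017", "stateStr": "SECONDARY", "optimeDate": utc(9, 12, 37)},
        {"name": "NODE11:27017", "stateStr": "PRIMARY", "optimeDate": utc(9, 5, 6)},
        {"name": "NODE12:27017", "stateStr": "SECONDARY", "optimeDate": utc(9, 5, 6)},
    ]
}

lags = replication_lag_seconds(status)
print(lags)  # {'NODE10:27017': -451.0, 'NODE12:27017': 0.0}
```

The same 451 seconds falls out of the raw optime values, since the first component of each Timestamp is seconds since the Unix epoch: 1464858757 - 1464858306 = 451.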