Type: Question
Resolution: Done
Priority: Trivial - P5
Fix Version/s: None
Affects Version/s: 2.6.8
Component/s: Replication
Labels:
None

CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

I am using the check_mongodb.py Nagios plugin (https://github.com/mzupan/nagios-plugin-mongodb/blob/master/check_mongodb.py) to monitor replication lag between the nodes of a replica set.

From time to time I have observed large lags between nodes (> 1000 s).
I suspected a bug in the check, so I ran a second script that collects the replica set status whenever the check fails. It shows this:

MongoDB shell version: 2.6.8
connecting to: test
{
        "set" : "cyclops",
        "date" : ISODate("2016-06-02T09:12:38Z"),
        "myState" : 2,
        "syncingTo" : "NODE11:27017",
        "members" : [
                {
                        "_id" : 0,
                        "name" : "NODE10:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 26251817,
                        "optime" : Timestamp(1464858757, 28),
                        "optimeDate" : ISODate("2016-06-02T09:12:37Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "NODE11:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 26096062,
                        "optime" : Timestamp(1464858306, 3),
                        "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
                        "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
                        "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
                        "pingMs" : 0,
                        "electionTime" : Timestamp(1438762714, 1),
                        "electionDate" : ISODate("2015-08-05T08:18:34Z")
                },
                {
                        "_id" : 2,
                        "name" : "NODE12:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 26251812,
                        "optime" : Timestamp(1464858306, 3),
                        "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
                        "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
                        "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
                        "pingMs" : 1,
                        "syncingTo" : "NODE011:27017"
                }
        ],
        "ok" : 1
}

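The node executing the check (NODE10, a secondary):
"optimeDate" : ISODate("2016-06-02T09:12:37Z"),

The primary node (NODE11):
"optimeDate" : ISODate("2016-06-02T09:05:06Z"),

So the lag is -451 s: the secondary's optime is reported as ahead of the primary's!

Both nodes are NTP-synced.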
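
To make the arithmetic explicit, here is a minimal sketch that recomputes the lag from the "optime" Timestamp seconds in the output above. It assumes lag is defined as the primary's optime minus the secondary's optime; the function name replication_lag_seconds is hypothetical and is not taken from check_mongodb.py, which may compute lag differently.

```python
# Seconds component of the "optime" Timestamp fields in the rs.status() output.
PRIMARY_OPTIME_S = 1464858306    # NODE11 (PRIMARY), 2016-06-02T09:05:06Z
SECONDARY_OPTIME_S = 1464858757  # NODE10 (self, SECONDARY), 2016-06-02T09:12:37Z


def replication_lag_seconds(primary_optime_s, secondary_optime_s):
    """Return how far the secondary is behind the primary, in seconds.

    Assumed definition: primary optime minus secondary optime.
    A negative result means the secondary's optime is ahead of the
    primary's optime as reported to the node that ran rs.status().
    """
    return primary_optime_s - secondary_optime_s


lag = replication_lag_seconds(PRIMARY_OPTIME_S, SECONDARY_OPTIME_S)
print(lag)  # -451
```

A monitoring check that takes the absolute value of this difference, or that compares it against a threshold without checking the sign, could report such a sample as a large lag even though the secondary is not behind.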

Assignee: Unassigned
Reporter: adrian
Participants: adrian, Eric Milkie, Ramon Fernandez
Votes: 0
Watchers: 5

Created: Jun 02 2016 02:49:06 PM UTC
Updated: Jul 14 2016 04:05:07 PM UTC
Resolved: Jun 20 2016 03:43:42 PM UTC
