Core Server / SERVER-24360

optimeDate higher on secondary than on primary

    • Type: Question
    • Resolution: Done
    • Priority: Trivial - P5
    • Affects Version/s: 2.6.8
    • Component/s: Replication

      I am using this check https://github.com/mzupan/nagios-plugin-mongodb/blob/master/check_mongodb.py to monitor replication lag between nodes in a replica set.

      From time to time I have observed big lags between nodes (> 1000s).
      I suspected a bug in the check, so I ran a second script that captures the replica set status whenever the check fails. It shows this:

      MongoDB shell version: 2.6.8
      connecting to: test
      {
              "set" : "cyclops",
              "date" : ISODate("2016-06-02T09:12:38Z"),
              "myState" : 2,
              "syncingTo" : "NODE11:27017",
              "members" : [
                      {
                              "_id" : 0,
                              "name" : "NODE10:27017",
                              "health" : 1,
                              "state" : 2,
                              "stateStr" : "SECONDARY",
                              "uptime" : 26251817,
                              "optime" : Timestamp(1464858757, 28),
                              "optimeDate" : ISODate("2016-06-02T09:12:37Z"),
                              "self" : true
                      },
                      {
                              "_id" : 1,
                              "name" : "NODE11:27017",
                              "health" : 1,
                              "state" : 1,
                              "stateStr" : "PRIMARY",
                              "uptime" : 26096062,
                              "optime" : Timestamp(1464858306, 3),
                              "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
                              "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
                              "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
                              "pingMs" : 0,
                              "electionTime" : Timestamp(1438762714, 1),
                              "electionDate" : ISODate("2015-08-05T08:18:34Z")
                      },
                      {
                              "_id" : 2,
                              "name" : "NODE12:27017",
                              "health" : 1,
                              "state" : 2,
                              "stateStr" : "SECONDARY",
                              "uptime" : 26251812,
                              "optime" : Timestamp(1464858306, 3),
                              "optimeDate" : ISODate("2016-06-02T09:05:06Z"),
                              "lastHeartbeat" : ISODate("2016-06-02T09:12:36Z"),
                              "lastHeartbeatRecv" : ISODate("2016-06-02T09:12:37Z"),
                              "pingMs" : 1,
                              "syncingTo" : "NODE011:27017"
                      }
              ],
              "ok" : 1
      }
      

      The node executing the check (NODE10, a secondary):
      "optimeDate" : ISODate("2016-06-02T09:12:37Z"),

      Primary node (NODE11):
      "optimeDate" : ISODate("2016-06-02T09:05:06Z"),

      So the lag is -451s: the secondary reports a newer optime than the primary!

      Both nodes are synced via NTP.
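
The lag figure can be reproduced from the optime values in the status output. The snippet below is only a minimal sketch of that arithmetic, not the actual code of check_mongodb.py: the helper name `replication_lag` and the reduced `optime_secs` field (the seconds part of each `Timestamp`) are made up for illustration, and it assumes lag is defined as the primary's optime minus the optime of the node running the check.

```python
# Values copied from the rs.status() output in this report, reduced to the
# fields needed here. "optime_secs" is a hypothetical field holding the
# seconds part of each member's optime Timestamp.
members = [
    {"name": "NODE10:27017", "stateStr": "SECONDARY", "optime_secs": 1464858757, "self": True},
    {"name": "NODE11:27017", "stateStr": "PRIMARY", "optime_secs": 1464858306},
    {"name": "NODE12:27017", "stateStr": "SECONDARY", "optime_secs": 1464858306},
]


def replication_lag(members):
    """Return the primary's optime minus this node's optime, in seconds.

    Assumed definition of lag; a negative result means the node running
    the check reports a newer optime than the one it sees for the primary.
    """
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    me = next(m for m in members if m.get("self"))
    return primary["optime_secs"] - me["optime_secs"]


lag = replication_lag(members)
print(lag)  # -451
```

With these inputs the result is negative because the optime NODE10 reports for itself is 451 seconds newer than the optime it reports for NODE11.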

            Assignee: Unassigned
            Reporter: adrian (adrianlzt)
            Votes: 0
            Watchers: 5
