[SERVER-9283] lastHeartbeatRecv flaps between invalid and valid dates Created: 08/Apr/13  Updated: 11/Jul/16  Resolved: 23/May/13

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.1, 2.5.0
Fix Version/s: 2.4.5, 2.5.1

Type: Bug Priority: Major - P3
Reporter: Hannes Magnusson Assignee: Randolph Tan
Resolution: Done Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

build info: Darwin anden.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 BOOST_LIB_VERSION=1_49
git version: 29bc5063bd232aa1664511d4486936d1f13e60b6


Issue Links:
Depends
Related
related to SERVER-11280 ReplSetHealthPollTask::down uses m.la... Closed
is related to SERVER-9730 Asymmetrical network partition can ca... Closed
is related to SERVER-9756 network partition where minority was ... Closed
is related to SERVER-9765 Two primaries should cause the earlie... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

Every once in a while the lastHeartbeatRecv date stamp gets screwed up while initialising a replicaset.

Note: This is completely fresh replicaset with no data inserted yet

anden(mongod-2.5.0-pre-)[PRIMARY/failover] test> rs.status()
{
  "set": "failover",
  "date": ISODate("2013-04-08T12:03:12-0700"),
  "myState": 1,
  "members": [
    {
      "_id": 0,
      "name": "anden.local:1337",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 86,
      "optime": {
        "t": 1365447776,
        "i": 1
      },
      "optimeDate": ISODate("2013-04-08T12:02:56-0700"),
      "self": true
    },
    {
      "_id": 1,
      "name": "anden.local:1338",
      "health": 1,
      "state": 5,
      "stateStr": "STARTUP2",
      "uptime": 16,
      "optime": {
        "t": 0,
        "i": 0
      },
      "optimeDate": ISODate("1969-12-31T16:00:00-0800"),
      "lastHeartbeat": ISODate("2013-04-08T12:03:12-0700"),
      "lastHeartbeatRecv": ISODate("1969-12-31T16:00:00-0800"),
      "pingMs": 0,
      "lastHeartbeatMessage": "initial sync need a member to be primary or secondary to do our initial sync"
    }
  ],
  "ok": 1
}
anden(mongod-2.5.0-pre-)[PRIMARY/failover] test> rs.status()
{
  "set": "failover",
  "date": ISODate("2013-04-08T12:03:13-0700"),
  "myState": 1,
  "members": [
    {
      "_id": 0,
      "name": "anden.local:1337",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 87,
      "optime": {
        "t": 1365447776,
        "i": 1
      },
      "optimeDate": ISODate("2013-04-08T12:02:56-0700"),
      "self": true
    },
    {
      "_id": 1,
      "name": "anden.local:1338",
      "health": 1,
      "state": 5,
      "stateStr": "STARTUP2",
      "uptime": 17,
      "optime": {
        "t": 0,
        "i": 0
      },
      "optimeDate": ISODate("1969-12-31T16:00:00-0800"),
      "lastHeartbeat": ISODate("2013-04-08T12:03:12-0700"),
      "lastHeartbeatRecv": ISODate("2013-04-08T12:03:13-0700"),
      "pingMs": 0,
      "lastHeartbeatMessage": "initial sync need a member to be primary or secondary to do our initial sync"
    }
  ],
  "ok": 1
}
anden(mongod-2.5.0-pre-)[PRIMARY/failover] test> rs.status()
{
  "set": "failover",
  "date": ISODate("2013-04-08T12:03:13-0700"),
  "myState": 1,
  "members": [
    {
      "_id": 0,
      "name": "anden.local:1337",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 87,
      "optime": {
        "t": 1365447776,
        "i": 1
      },
      "optimeDate": ISODate("2013-04-08T12:02:56-0700"),
      "self": true
    },
    {
      "_id": 1,
      "name": "anden.local:1338",
      "health": 1,
      "state": 5,
      "stateStr": "STARTUP2",
      "uptime": 17,
      "optime": {
        "t": 0,
        "i": 0
      },
      "optimeDate": ISODate("1969-12-31T16:00:00-0800"),
      "lastHeartbeat": ISODate("2013-04-08T12:03:12-0700"),
      "lastHeartbeatRecv": ISODate("2013-04-08T12:03:13-0700"),
      "pingMs": 0,
      "lastHeartbeatMessage": "initial sync need a member to be primary or secondary to do our initial sync"
    }
  ],
  "ok": 1
}
anden(mongod-2.5.0-pre-)[PRIMARY/failover] test> rs.status()
{
  "set": "failover",
  "date": ISODate("2013-04-08T12:03:14-0700"),
  "myState": 1,
  "members": [
    {
      "_id": 0,
      "name": "anden.local:1337",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 88,
      "optime": {
        "t": 1365447776,
        "i": 1
      },
      "optimeDate": ISODate("2013-04-08T12:02:56-0700"),
      "self": true
    },
    {
      "_id": 1,
      "name": "anden.local:1338",
      "health": 1,
      "state": 5,
      "stateStr": "STARTUP2",
      "uptime": 18,
      "optime": {
        "t": 0,
        "i": 0
      },
      "optimeDate": ISODate("1969-12-31T16:00:00-0800"),
      "lastHeartbeat": ISODate("2013-04-08T12:03:12-0700"),
      "lastHeartbeatRecv": ISODate("2013-04-08T12:03:13-0700"),
      "pingMs": 0,
      "lastHeartbeatMessage": "initial sync need a member to be primary or secondary to do our initial sync"
    }
  ],
  "ok": 1
}
anden(mongod-2.5.0-pre-)[PRIMARY/failover] test> rs.status()
{
  "set": "failover",
  "date": ISODate("2013-04-08T12:03:14-0700"),
  "myState": 1,
  "members": [
    {
      "_id": 0,
      "name": "anden.local:1337",
      "health": 1,
      "state": 1,
      "stateStr": "PRIMARY",
      "uptime": 88,
      "optime": {
        "t": 1365447776,
        "i": 1
      },
      "optimeDate": ISODate("2013-04-08T12:02:56-0700"),
      "self": true
    },
    {
      "_id": 1,
      "name": "anden.local:1338",
      "health": 1,
      "state": 5,
      "stateStr": "STARTUP2",
      "uptime": 18,
      "optime": {
        "t": 0,
        "i": 0
      },
      "optimeDate": ISODate("1969-12-31T16:00:00-0800"),
      "lastHeartbeat": ISODate("2013-04-08T12:03:14-0700"),
      "lastHeartbeatRecv": ISODate("1969-12-31T16:00:00-0800"),
      "pingMs": 0,
      "lastHeartbeatMessage": "initial sync need a member to be primary or secondary to do our initial sync"
    }
  ],
  "ok": 1
}



 Comments   
Comment by auto [ 19/Jun/13 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-9283 lastHeartbeatRecv flaps between invalid and valid dates during replicaset initilization

Do not overwrite lastHeartbeatRecv with 0 value everytime HeartbeatInfo gets updated.
Branch: v2.4
https://github.com/mongodb/mongo/commit/3b0ac30a7b16bddde3788f35d4e16d1777fdd8a3

Comment by auto [ 23/May/13 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-9283 lastHeartbeatRecv flaps between invalid and valid dates during replicaset initilization

Do not overwrite lastHeartbeatRecv with 0 value everytime HeartbeatInfo gets updated.
Branch: master
https://github.com/mongodb/mongo/commit/9b894e349663e4bf10625b0f35844cb63311a10c

Comment by Jenny Duckett [ 07/May/13 ]

I'm also seeing this very frequently on our replica set, which contains data and is in use. Nearly every time I run rs.status() on any member of the set, for at least one of the other two members it will show "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"). I haven't been able to find any pattern to the flapping between valid and invalid dates.

Our replica set is now running version 2.4.3. We started seeing this issue after upgrading from 2.2.3 to 2.4.1. At the same time we also started to be affected by SERVER-9085, which is now resolved. I'd hoped that the fix for that introduced in 2.4.2 might also fix the lastHeartbeatRecv issue, but it hasn't.

Comment by Matt Campbell [ 11/Apr/13 ]

I'm also seeing this on a replica set when just running rs.status() repeatedly.

The lastHeartbeatRecv bounces from ISODate("1970-01-01T00:00:00Z") to ISODate("2013-04-11T01:26:11Z").

Note: this is happening WITHOUT data in the rs - as it was mentioned below it occurred after adding data.

Comment by Hannes Magnusson [ 08/Apr/13 ]

Hmh. I am seeing this on a running replicaset too, after inserting some data

Generated at Thu Feb 08 03:19:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.