Core Server / SERVER-3308

Replica set secondary doesn't resume replication after a network glitch

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Critical - P2
    • None
    • Affects Version/s: 1.8.1
    • Component/s: Replication
    • Labels:
      None
    • Environment:
      Linux
    • ALL

      The following errors in the secondary's log indicate that it had trouble reaching both of the other replica set members:

      Tue Jun 21 08:58:28 [ReplSetHealthPollTask] DBClientCursor::init call() failed
      Tue Jun 21 08:58:28 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb2 is down (or slow to respond): DBClientBase::findOne: transport error: prod-c0-pacmandb2 query: { replSetHeartbeat: "pacman", v: 2, pv: 1, checkEmpty: false, from: "lab-c0-pacmandb1.lab" }
      Tue Jun 21 08:58:30 [ReplSetHealthPollTask] DBClientCursor::init call() failed
      Tue Jun 21 08:58:30 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb1 is down (or slow to respond): DBClientBase::findOne: transport error: prod-c0-pacmandb1 query: { replSetHeartbeat: "pacman", v: 2, pv: 1, checkEmpty: false, from: "lab-c0-pacmandb1.lab" }
      Tue Jun 21 08:59:33 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb2 is up
      Tue Jun 21 08:59:34 [initandlisten] connection accepted from 10.10.***.***:54941 #1492
      Tue Jun 21 08:59:34 [initandlisten] connection accepted from 10.10.***.***:33786 #1493
      Tue Jun 21 08:59:36 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb1 is up
      

      After the other members were reported up again, this secondary still failed to replicate for over an hour, and only a restart of the secondary seems to have fixed the problem. This does not look like corruption of the master's oplog, because the other secondary was syncing just fine.
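
To put a number on the glitch itself, the "is down" / "is up" heartbeat lines quoted above can be paired up per host. The following is a minimal sketch, not part of the original report: the function name `outages` and the `year` default are assumptions (the log timestamps carry no year), and the sample lines are copied from the log excerpt above.

```python
import re
from datetime import datetime

# Matches the health-poll status lines quoted in the report, e.g.
# "Tue Jun 21 08:59:33 [ReplSetHealthPollTask] replSet info HOST is up"
LINE_RE = re.compile(
    r"^\s*(?P<stamp>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d) "
    r"\[ReplSetHealthPollTask\] replSet info (?P<host>\S+) is (?P<state>down|up)\b"
)

def outages(log_lines, year=2011):
    """Pair each host's first 'is down' line with its next 'is up' line.

    `year` is an assumption: the log timestamps do not include one.
    Returns a list of (host, went_down, came_up, seconds) tuples.
    """
    down_since = {}
    result = []
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a heartbeat status line
        when = datetime.strptime(f"{year} {m['stamp']}", "%Y %a %b %d %H:%M:%S")
        host = m["host"]
        if m["state"] == "down":
            # Keep the earliest 'down' time if the line repeats.
            down_since.setdefault(host, when)
        elif host in down_since:
            start = down_since.pop(host)
            result.append((host, start, when, int((when - start).total_seconds())))
    return result

# Status lines copied verbatim from the log excerpt in this ticket.
SAMPLE = [
    'Tue Jun 21 08:58:28 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb2 is down (or slow to respond): DBClientBase::findOne: transport error: prod-c0-pacmandb2 query: { replSetHeartbeat: "pacman", v: 2, pv: 1, checkEmpty: false, from: "lab-c0-pacmandb1.lab" }',
    'Tue Jun 21 08:58:30 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb1 is down (or slow to respond): DBClientBase::findOne: transport error: prod-c0-pacmandb1 query: { replSetHeartbeat: "pacman", v: 2, pv: 1, checkEmpty: false, from: "lab-c0-pacmandb1.lab" }',
    'Tue Jun 21 08:59:33 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb2 is up',
    'Tue Jun 21 08:59:36 [ReplSetHealthPollTask] replSet info prod-c0-pacmandb1 is up',
]

for host, went_down, came_up, seconds in outages(SAMPLE):
    print(f"{host}: down {went_down:%H:%M:%S}, up {came_up:%H:%M:%S} ({seconds}s)")
```

On the excerpt above this reports outages of 65 and 66 seconds, so the heartbeats recovered in about a minute while replication stayed stalled for over an hour.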

            Assignee:
            Kristina Chodorow (Inactive)
            Reporter:
            Michael D. Norman
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: