  Core Server / SERVER-18847

MMS agent is dropping connection to a member of a replica-set

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • ALL

      We are seeing odd behavior in MMS. The replica set member that is currently primary is up and running fine, but MMS drops it after monitoring it for a few seconds and logs the following error:

      Failure dialing host after 14ms. Skipping all `MongoDB Non-Blocking` tasks. Failure on MongoDB dial to `aus-mongo1.indeed.net:27055`. Err: `no reachable servers` at monitoring-agent/components/conf.go:164 at monitoring-agent/components/bus.go:362 at monitoring-agent/components/bus.go:391 at monitoring-agent/components/bus.go:337 at pkg/runtime/proc.c:1445
      

      I tried re-adding the host to monitoring twice, using the MongoDB Username/Password auth mechanism with the admin user and password and the SSL option turned on. Each time, MMS showed the host in green for a few seconds before marking it as unreachable again.

      The MMS group is Indeed.com and the replica set is named acme-aus-prod. This host is the production primary of the replica set, so we need to monitor it. Any insight would be helpful.
      Thanks.
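
One way to narrow this down is to check, from the machine running the monitoring agent, whether the host accepts a TCP connection and whether it completes a TLS handshake on that port. The following is a minimal sketch using only Python's standard library. The `probe` function and its return labels are hypothetical names chosen for illustration; they are not part of the MMS agent, and the check says nothing about authentication or about how the agent itself dials the host.

```python
import socket
import ssl


def probe(host, port, use_tls=True, timeout=5.0):
    """Return 'tcp_failed', 'tls_failed', or 'ok' for host:port.

    Hypothetical helper: it only tests reachability and the TLS handshake.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return "tcp_failed"
    with sock:
        if not use_tls:
            return "ok"
        # Certificate verification is deliberately disabled: the goal is only
        # to see whether the server speaks TLS at all, not to validate it.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        try:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
        except (ssl.SSLError, OSError):
            # Handshake rejected, reset, or timed out.
            return "tls_failed"
```

For the host in the error message this would be called as `probe("aus-mongo1.indeed.net", 27055)`. A result of `tcp_failed` would point toward network or firewall issues between the agent and the host, while `tls_failed` would suggest a mismatch between the SSL setting in MMS and what the server is actually serving on that port.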

            Assignee: Unassigned
            Reporter: Manan Shah (manan@indeed.com)
            Votes: 0
            Watchers: 2
