Core Server / SERVER-10375

DNS failures can cause a primary-less state that wouldn't exist if a node had gone down entirely


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Gone away
    • Affects Version/s: 2.4.5
    • None
    • Component/s: Replication
    • Operating System: ALL
    • Steps to reproduce:

      Build a three-member replica set across two servers, with the SECONDARY and ARBITER on one server and the PRIMARY on the other.

      Remove the entries for the SECONDARY and ARBITER hosts from the PRIMARY's /etc/hosts, simulating the PRIMARY losing DNS resolution for the other members.

      Leave the /etc/hosts records for the PRIMARY in place on the SECONDARY and ARBITER, simulating that DNS still works for them.

      The PRIMARY will step down, but the SECONDARY will not run for election because the (now former) PRIMARY would veto.
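
      The hosts-file step above could be scripted roughly as follows. This is only a sketch: the hostnames `mongo-primary`, `mongo-secondary` and `mongo-arbiter`, the addresses, and the helper name `drop_host_entries` are all hypothetical and do not come from the ticket. The function works on the text of a hosts file and leaves writing the result back to /etc/hosts to the operator.

      ```python
      def drop_host_entries(hosts_text, hostnames):
          """Return hosts_text without the lines that map any of the given hostnames."""
          kept = []
          for line in hosts_text.splitlines():
              # Strip trailing comments; the fields are: address, name, aliases...
              fields = line.split("#", 1)[0].split()
              if any(name in fields[1:] for name in hostnames):
                  continue  # this line resolves a host we want to "lose"
              kept.append(line)
          return "\n".join(kept) + "\n"


      # Hypothetical hosts file on the PRIMARY's server (names and addresses are made up).
      PRIMARY_HOSTS = (
          "127.0.0.1 localhost\n"
          "192.0.2.1 mongo-primary\n"
          "192.0.2.2 mongo-secondary\n"
          "192.0.2.2 mongo-arbiter\n"
      )

      # Text to install on the PRIMARY's server only; the other server keeps its file.
      PARTITIONED_HOSTS = drop_host_entries(
          PRIMARY_HOSTS, {"mongo-secondary", "mongo-arbiter"}
      )
      ```

      Applying the result only on the PRIMARY's server gives the one-way failure: the PRIMARY can no longer resolve the other two members, while they can still resolve it.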


    Description

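      In a one-way DNS partition, where the PRIMARY cannot resolve the other members but they can still resolve it, the PRIMARY will step down, but the SECONDARIES will not run for election because they believe the (former) PRIMARY will veto. The set is left without a PRIMARY, which would not happen if the node had gone down entirely.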

      Perhaps a node that cannot see a candidate should not veto, and should instead vote when the actual election starts.
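
      The deadlock and the suggested change can be illustrated with a toy model. It is a deliberately simplified sketch, not MongoDB's election code: the reachability table, the function names, and the rule "a peer that cannot see the candidate vetoes" are assumptions made to mirror the behavior described in this ticket.

      ```python
      # Toy model only: names and rules are illustrative, not MongoDB's implementation.

      # CAN_REACH[a] is the set of members that member a can resolve and contact.
      # One-way DNS partition: the former primary can resolve only itself.
      CAN_REACH = {
          "primary": {"primary"},
          "secondary": {"primary", "secondary", "arbiter"},
          "arbiter": {"primary", "secondary", "arbiter"},
      }


      def sees_majority(node, can_reach=CAN_REACH):
          """A primary keeps its role only while it can reach a strict majority."""
          return len(can_reach[node]) * 2 > len(can_reach)


      def can_win(candidate, veto_when_blind, can_reach=CAN_REACH):
          """Return True if the candidate could be elected under the given veto rule."""
          votes = 0
          for peer in can_reach[candidate]:  # peers the candidate is able to poll
              if candidate not in can_reach[peer]:
                  if veto_when_blind:
                      return False  # a single veto blocks the election
                  continue  # suggested rule: a blind peer neither vetoes nor votes
              votes += 1
          return votes * 2 > len(can_reach)


      primary_keeps_role = sees_majority("primary")
      elected_with_veto = can_win("secondary", veto_when_blind=True)
      elected_without_veto = can_win("secondary", veto_when_blind=False)
      ```

      In this model the primary sees one of three members and steps down, the secondary is blocked by the expected veto, and under the suggested rule it would collect its own vote plus the arbiter's, which is a majority of three.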


            People

              milkie@mongodb.com Eric Milkie
              matt.dannenberg Matt Dannenberg
              Votes: 8
              Watchers: 18
