  1. Core Server
  2. SERVER-36597

Primary still contacts removed members and does not step down when a majority of members are down


Details

    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • 3.6.6
    • None
    • None

    Description

      All details have been provided in https://jira.mongodb.org/browse/SERVER-36512, but this ticket focuses on 2 newly found issues.

      I am confused by Nick's feedback. If Nick's answer is right, then which members made up the down majority? The replset has 1 primary, 2 secondaries, and 1 arbiter, and only one secondary was unreachable when the issue was found.

      I investigated mongod.log further and found 2 more issues:

        issue: the primary still contacts the 3 removed members; this was fixed by restarting the mongod service

        issue: the primary did not step down even though, if it still thinks the replset has 6 data-bearing members and 1 arbiter, 4 of those data-bearing members are down.
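
      The majority arithmetic behind both questions can be sketched as follows. This is a minimal illustration, not server code: it assumes every member (including the arbiter) holds exactly one vote, and the helper names `majority` and `primary_can_stay` are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return voting_members // 2 + 1


def primary_can_stay(voting_members: int, reachable_voters: int) -> bool:
    """True if the primary (counting itself) still sees a majority of voters."""
    return reachable_voters >= majority(voting_members)


# Stale view from the second issue: 6 data-bearing members + 1 arbiter,
# with 4 data-bearing members down, leaves 3 reachable voters out of 7.
stale_view = primary_can_stay(voting_members=7, reachable_voters=7 - 4)

# Config described in this ticket: 1 primary + 2 secondaries + 1 arbiter,
# with 1 secondary unreachable, leaves 3 reachable voters out of 4.
actual_view = primary_can_stay(voting_members=4, reachable_voters=4 - 1)

print(stale_view, actual_view)
```

      Under these assumptions the 7-member view falls short of a majority (3 of 7), while the 4-member view does not (3 of 4).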

       
      secondary : 172.31.54.204 (primary when the issue happened on Aug 6)
      arbiter : ip-172-31-5-208 (was 3.4.7 when the issue happened on Aug 6)
      old member 3.4.7 : ip-172-31-12-59 (removed from replset July 23)
      old member 3.4.7 : ip-172-31-20-52 (removed from replset July 23)
      old member 3.4.7 : ip-172-31-46-24 (removed from replset July 23)
      secondary : ip-172-31-66-130 (unreachable when the issue happened on Aug 6)
      primary : ip-172-31-82-157 (secondary when the issue happened on Aug 6)
      secondary : ip-172-31-67-188 (added after the issue happened)

      All members are on 3.6.6 now.
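
      One way to check whether removed hosts still appear in a replset config document is sketched below. The helper `stale_members` is hypothetical, and the sample config is illustrative only: it is not output from this deployment, and port 27017 is an assumption. On a live node the document would be the `config` field returned by the `replSetGetConfig` admin command.

```python
from typing import Iterable, List


def stale_members(config: dict, expected_hosts: Iterable[str]) -> List[str]:
    """Return hosts listed in the config that are not in the expected set."""
    expected = set(expected_hosts)
    hosts = [m["host"] for m in config.get("members", [])]
    # Compare on the hostname only, ignoring a trailing ":port" if present.
    # (IPv6 literals are not handled in this sketch.)
    return [h for h in hosts if h.rsplit(":", 1)[0] not in expected]


# Illustrative config document (assumed shape and ports, not real output).
sample_config = {
    "members": [
        {"_id": 0, "host": "172.31.54.204:27017"},
        {"_id": 1, "host": "ip-172-31-5-208:27017"},
        {"_id": 2, "host": "ip-172-31-12-59:27017"},
        {"_id": 3, "host": "ip-172-31-66-130:27017"},
    ]
}

# Hosts expected to remain after the July 23 removal.
expected = ["172.31.54.204", "ip-172-31-5-208", "ip-172-31-66-130", "ip-172-31-82-157"]

leftover = stale_members(sample_config, expected)
print(leftover)
```

      An empty result would suggest the config itself is clean and that any remaining contact with removed members comes from somewhere other than the stored configuration.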

       

       

          People

            nick.brewer Nick Brewer
            brucezu Bruce Zu
            Votes: 0
            Watchers: 5
