Strange election on network failure

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 3.4.3
    • Component/s: Replication
    • ALL
    • 3

      The replica set in question has 3 nodes (see rs.conf() below).
      t1 IP address is 10.3.1.12
      t2 IP address is 10.3.1.13
      t3 IP address is 10.3.1.16

      After a transient network failure on the secondary t3 (its switch ports were disabled and then re-enabled), t3 became primary, causing rollbacks on the previous primary (t1) and the other secondary (t2). All writes are done with w:majority, so the rollbacks are unexpected. Logs from all three machines are attached.

      rs.conf()

      {
              "_id" : "driveFS-temp-1",
              "version" : 4,
              "protocolVersion" : NumberLong(1),
              "writeConcernMajorityJournalDefault" : false,
              "members" : [
                      {
                              "_id" : 0,
                              "host" : "t1.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 1,
                              "host" : "t2.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 2,
                              "host" : "t3.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      }
              ],
              "settings" : {
                      "chainingAllowed" : true,
                      "heartbeatIntervalMillis" : 2000,
                      "heartbeatTimeoutSecs" : 10,
                      "electionTimeoutMillis" : 5000,
                      "catchUpTimeoutMillis" : 2000,
                      "getLastErrorModes" : {
      
                      },
                      "getLastErrorDefaults" : {
                              "w" : 1,
                              "wtimeout" : 0
                      },
                      "replicaSetId" : ObjectId("58c9657b40aba377920b23f2")
              }
      }
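
      For reference, a minimal sketch of the arithmetic behind the expectation above: with three voting members a majority is two, so a single isolated node cannot collect enough votes on its own, and a write acknowledged with w:majority has reached at least two members. The helper name electionMajority is hypothetical (it is not a MongoDB API), and the config object is a trimmed copy of the rs.conf() output, keeping only the fields the sketch reads.

```javascript
// Hypothetical helper, not part of MongoDB: counts the voting members in a
// replica set config document and returns how many votes form a majority.
function electionMajority(conf) {
  const voters = conf.members.filter((m) => m.votes > 0).length;
  return Math.floor(voters / 2) + 1;
}

// Trimmed copy of the rs.conf() output from this report.
const conf = {
  _id: "driveFS-temp-1",
  members: [
    { _id: 0, host: "t1.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 1, host: "t2.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 2, host: "t3.s1.fs.drive.bru:27231", votes: 1 },
  ],
};

const majority = electionMajority(conf);
console.log("votes needed for a majority:", majority);
```

      In the mongo shell the same function could be called as electionMajority(rs.conf()), assuming it has been defined in that session.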
      

        1. t1.log.gz
          16 kB
          Aristarkh Zagorodnikov
        2. t2.log.gz
          15 kB
          Aristarkh Zagorodnikov
        3. t3.log.gz
          16 kB
          Aristarkh Zagorodnikov

            Assignee:
            Kelsey Schubert
            Reporter:
            Aristarkh Zagorodnikov
            Votes:
            1
            Watchers:
            7
