Strange election on network failure

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 3.4.3
    • Component/s: Replication
    • ALL
    • 3

      The affected replica set has 3 nodes (see rs.conf() below).
      t1 IP address is 10.3.1.12
      t2 IP address is 10.3.1.13
      t3 IP address is 10.3.1.16

      After a transient network failure on the secondary t3 (its switch ports were disabled and then re-enabled), t3 became primary, causing rollbacks on the previous primary (t1) and on the other secondary (t2). All writes are done with w:majority, so the rollbacks are unexpected. Logs from all three machines are attached.

      rs.conf()

      {
              "_id" : "driveFS-temp-1",
              "version" : 4,
              "protocolVersion" : NumberLong(1),
              "writeConcernMajorityJournalDefault" : false,
              "members" : [
                      {
                              "_id" : 0,
                              "host" : "t1.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 1,
                              "host" : "t2.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 2,
                              "host" : "t3.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      }
              ],
              "settings" : {
                      "chainingAllowed" : true,
                      "heartbeatIntervalMillis" : 2000,
                      "heartbeatTimeoutSecs" : 10,
                      "electionTimeoutMillis" : 5000,
                      "catchUpTimeoutMillis" : 2000,
                      "getLastErrorModes" : {
      
                      },
                      "getLastErrorDefaults" : {
                              "w" : 1,
                              "wtimeout" : 0
                      },
                      "replicaSetId" : ObjectId("58c9657b40aba377920b23f2")
              }
      }
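
For reference, a minimal sketch of how many acknowledgements a w:majority write would need under the configuration above, assuming "majority" means more than half of the voting members. `majorityOfVoters` is a hypothetical helper written for this illustration, not a mongo shell function, and `conf` is a trimmed copy of the rs.conf() output that keeps only the fields the helper reads.

```javascript
// Hypothetical helper (not part of the mongo shell): counts the voting
// members in an rs.conf()-style document and returns the smallest number
// of members that forms a strict majority of them.
function majorityOfVoters(conf) {
  const voters = conf.members.filter((m) => m.votes > 0).length;
  return Math.floor(voters / 2) + 1;
}

// Trimmed copy of the configuration shown above.
const conf = {
  _id: "driveFS-temp-1",
  members: [
    { _id: 0, host: "t1.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 1, host: "t2.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 2, host: "t3.s1.fs.drive.bru:27231", votes: 1 },
  ],
};

console.log(majorityOfVoters(conf)); // prints 2
```

With three voting members the threshold is 2, so under this assumption a majority-acknowledged write would be present on at least two of t1, t2, and t3.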
      

        1. t1.log.gz (16 kB)
        2. t2.log.gz (15 kB)
        3. t3.log.gz (16 kB)

              Assignee:
              Kelsey Schubert
              Reporter:
              Aristarkh Zagorodnikov
              Votes:
              1
              Watchers:
              7