Core Server / SERVER-24887

Replica set node allows reconfigurations that are invalid on other versions


Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 3.2.7
    • Component/s: Replication
    • Labels: None
    • Operating System: ALL
    Steps To Reproduce

      1. Have a replica set with 3.0.2 + MMAPv1 and 3.2.7 + WiredTiger nodes, where a 3.0.2 node is primary.
      2. On one 3.0.2 node, set priority > 0 and votes = 0; give one 3.2.7 node a high priority so it would become primary upon reconfiguration.
      3. Reconfigure the replica set on the 3.0.2 primary node (see the shell sketch after these steps).
      4. All 3.2.7 nodes crash.

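      A minimal mongo shell sketch of steps 2-3 (host names and member indexes are hypothetical):

      // On the 3.0.2 primary:
      conf = rs.conf();
      conf.members[2].votes = 0;      // leaves priority > 0 on a now non-voting member
      conf.members[3].priority = 5;   // the 3.2.7 node that should win the election
      rs.reconfig(conf);              // 3.0.2 accepts this; 3.2.7 nodes reject it and crash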

    Description

      We had a replica set with this configuration (pseudocode):

      {
      mongo1: { priority: 3, votes: 1, role: PRIMARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo2: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo3: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo4: { priority: 0, votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
      mongo5: { priority: 0, votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
      mongo6: { priority: 0, votes: 0, role: SECONDARY, hidden: true, mongo: 3.2.7, storage: WiredTiger }
      }
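
      For reference, the live equivalent of this pseudocode can be read off any member (a minimal sketch; it prints only the fields shown above):

      // Print priority and votes per member; run db.version() on each
      // node to see its mongod version (3.0.2 vs 3.2.7).
      rs.conf().members.forEach(function (m) {
          print(m.host + " priority=" + m.priority + " votes=" + m.votes);
      });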
      

      The plan was to switch the primary to mongo4 (3.2.7 + WT), so we could upgrade mongo{1,2,3}, but I managed to crash mongo{4,5,6} by running this on mongo1:

      conf = {
         members: [
            { host: mongo1..., priority: 3, votes: 1 },
            { host: mongo2..., priority: 0.5, votes: 1 },
            { host: mongo3..., priority: 0.5, votes: 0 },  // <-- culprit?
            { host: mongo4..., priority: 5, votes: 1 },
            { host: mongo5..., priority: 0.8, votes: 1 },
            { host: mongo6..., priority: 0.8, votes: 1 }   // was hidden before
         ]
      };
      rs.reconfig(conf);
      



      As a result, mongod crashed on mongo{4,5,6}, leaving mongo{1,2,3} in SECONDARY state, unable to elect a PRIMARY.

      In the logs we found this:

      2016-07-04T07:05:31.842+0000 W REPL     [replExecDBWorker-1] Not persisting new configuration in heartbeat response to disk because it is invalid: BadValue: priority must be 0 when non-voting (votes:0)
      2016-07-04T07:05:31.842+0000 E REPL     [ReplicationExecutor] Could not validate configuration received from remote node; Removing self until an acceptable configuration arrives; BadValue: priority must be 0 when non-voting (votes:0)
      2016-07-04T07:05:31.842+0000 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "repl1", version: 106772, members: [ { _id: 23, host: "mongo1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.8, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 24, host: "mongo2:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.4, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 25, host: "mongo3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.3, tags: {}, slaveDelay: 0, votes: 0 }, { _id: 26, host: "mongo4:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 27, host: "mongo5:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.6, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 28, host: "mongo6:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.5, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
      

      And then:

      2016-07-04T07:05:31.865+0000 I -        [ReplicationExecutor] Invariant failure i < _members.size() src/mongo/db/repl/replica_set_config.cpp 560
      

      I'm sure it was a mistake on my end to set priority > 0 and votes = 0 on mongo3, but the way the 3.2 + WT nodes reacted was certainly not nice.
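      For the record, a member entry that passes 3.2 validation keeps priority at 0 whenever votes is 0. A sketch of what the mongo3 entry should have looked like (indexes assumed to match the config above):

      conf = rs.conf();
      conf.members[2].priority = 0;   // 3.2 rejects priority > 0 on a non-voting member
      conf.members[2].votes = 0;
      rs.reconfig(conf);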

      Also, please suggest how to perform this switch and upgrade in the least dangerous fashion.
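      What I was presumably aiming for, as a sketch: leave every votes field untouched and change only the target's priority, then let the set hold an election before touching anything else.

      conf = rs.conf();
      conf.members[3].priority = 5;   // mongo4 (3.2.7 + WT)
      rs.reconfig(conf);
      // After mongo4 becomes PRIMARY, upgrade mongo{1,2,3} one node at a time.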

          People

            Assignee: Kelsey Schubert (kelsey.schubert@mongodb.com)
            Reporter: Tomas Varaneckas (spajus)
            Votes: 0
            Watchers: 9
