Core Server / SERVER-24887

Replica set node allows reconfigurations that are invalid on other versions

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 3.2.7
    • Component/s: Replication
    • Operating System: ALL

      1. Have a replica set of 3.0.2 + MMAPv1 and 3.2.7 + WT nodes, where a 3.0.2 node is primary.
      2. Set one 3.0.2 node's priority > 0 and votes = 0, and give one 3.2.7 node a high priority so it becomes primary upon reconfiguration.
      3. Reconfigure the replica set on the 3.0.2 primary node.
      4. All 3.2.7 nodes crash.


      We had a replica set with this configuration (pseudocode):

      {
      mongo1: { priority: 3, votes: 1, role: PRIMARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo2: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo3: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
      mongo4: { priority: 0, votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
      mongo5: { priority: 0, votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
      mongo6: { priority: 0, votes: 0, role: SECONDARY, hidden: true, mongo: 3.2.7, storage: WiredTiger }
      }
      

      The plan was to switch primary to mongo4 (3.2.7 + WT), so we could upgrade mongo{1,2,3}, but I managed to crash mongo{4,5,6} by running this on mongo1:
      conf = {
         // host names elided as in the original report; a full replica set
         // config also carries _id and version fields
         members: [
            { host: mongo1..., priority: 3, votes: 1 },
            { host: mongo2..., priority: 0.5, votes: 1 },
            { host: mongo3..., priority: 0.5, votes: 0 },  // <-- culprit?
            { host: mongo4..., priority: 5, votes: 1 },
            { host: mongo5..., priority: 0.8, votes: 1 },
            { host: mongo6..., priority: 0.8, votes: 1 }   // was hidden before
         ]
      };
      rs.reconfig(conf);
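The rule this config violates on mongo3 can be checked before ever calling rs.reconfig. The sketch below is a hypothetical standalone helper mirroring the "priority must be 0 when non-voting (votes:0)" constraint from the error message; it is not MongoDB's own validation code:

```javascript
// Hypothetical check mirroring the 3.2 rule "priority must be 0 when
// non-voting (votes:0)"; NOT the actual server-side validation code.
function findVoteViolations(members) {
  return members.filter(function (m) {
    return m.votes === 0 && m.priority > 0;
  });
}

// Same priority/votes values as the reconfig above, hosts abbreviated.
var members = [
  { host: "mongo1", priority: 3,   votes: 1 },
  { host: "mongo2", priority: 0.5, votes: 1 },
  { host: "mongo3", priority: 0.5, votes: 0 }, // invalid combination
  { host: "mongo4", priority: 5,   votes: 1 },
  { host: "mongo5", priority: 0.8, votes: 1 },
  { host: "mongo6", priority: 0.8, votes: 1 }
];

var bad = findVoteViolations(members);
// bad contains only the mongo3 entry, the member that makes the
// reconfig invalid on 3.2 nodes
```

Running such a check before reconfiguring would have flagged mongo3 without involving the servers at all.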
      


      In effect, mongod crashed on mongo{4,5,6}, leaving mongo{1,2,3} in SECONDARY state, unable to elect a PRIMARY.

      In the logs we found this:

      2016-07-04T07:05:31.842+0000 W REPL     [replExecDBWorker-1] Not persisting new configuration in heartbeat response to disk because it is invalid: BadValue: priority must be 0 when non-voting (votes:0)
      2016-07-04T07:05:31.842+0000 E REPL     [ReplicationExecutor] Could not validate configuration received from remote node; Removing self until an acceptable configuration arrives; BadValue: priority must be 0 when non-voting (votes:0)
      2016-07-04T07:05:31.842+0000 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "repl1", version: 106772, members: [ { _id: 23, host: "mongo1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.8, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 24, host: "mongo2:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.4, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 25, host: "mongo3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.3, tags: {}, slaveDelay: 0, votes: 0 }, { _id: 26, host: "mongo4:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 27, host: "mongo5:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.6, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 28, host: "mongo6:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.5, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
      

      And then:

      2016-07-04T07:05:31.865+0000 I -        [ReplicationExecutor] Invariant failure i < _members.size() src/mongo/db/repl/replica_set_config.cpp 560
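The invariant message suggests a stale index into the member list being used after the node removed itself from the configuration. A minimal, purely illustrative sketch of that failure pattern (not the server's code; names are invented):

```javascript
// Illustrative sketch of the pattern behind "Invariant failure
// i < _members.size()": a member index kept from before the node
// removed itself from the config is used afterwards. Not server code.
function memberAt(members, i) {
  if (!(i < members.length)) {
    throw new Error("Invariant failure i < _members.size()");
  }
  return members[i];
}

var members = ["mongo1", "mongo2", "mongo3"];
var selfIndex = 2;

// "Removing self until an acceptable configuration arrives":
members.splice(selfIndex, 1);

// A later code path still holding the stale index trips the invariant:
var tripped = false;
try {
  memberAt(members, selfIndex);
} catch (e) {
  tripped = true;
}
// tripped === true
```

In the server an invariant failure aborts the process, which would match mongod crashing rather than reporting an error.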
      

      I'm sure it was a mistake on my end to set priority > 0 and votes = 0 on mongo3, but the way 3.2 + WT nodes reacted was certainly not nice.

      Also, please suggest how to perform this switch and upgrade in the least dangerous fashion.

            Assignee:
            kelsey.schubert@mongodb.com Kelsey Schubert
            Reporter:
            spajus Tomas Varaneckas
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: