Details
Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 3.2.7
Component/s: None
Operating System: ALL
Description
We had a replica set with this configuration (pseudocode):

{
  mongo1: { priority: 3,   votes: 1, role: PRIMARY,   mongo: 3.0.2, storage: MMAPv1 },
  mongo2: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
  mongo3: { priority: 0.5, votes: 1, role: SECONDARY, mongo: 3.0.2, storage: MMAPv1 },
  mongo4: { priority: 0,   votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
  mongo5: { priority: 0,   votes: 1, role: SECONDARY, mongo: 3.2.7, storage: WiredTiger },
  mongo6: { priority: 0,   votes: 0, role: SECONDARY, hidden: true, mongo: 3.2.7, storage: WiredTiger }
}
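
The same topology can be confirmed from the mongo shell before touching anything; a minimal sketch using only the stock rs.conf() and rs.status() helpers (nothing here is specific to our setup):

// Print priority/votes/hidden for every member from the live config.
rs.conf().members.forEach(function (m) {
    print(m.host + "  priority=" + m.priority + "  votes=" + m.votes + "  hidden=" + !!m.hidden);
});

// And the roles as the set currently sees them.
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr);
});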
The plan was to switch the primary to mongo4 (3.2.7 + WT) so we could upgrade mongo{1,2,3}, but I managed to crash mongo{4,5,6} by running this on mongo1:
conf = {
  members: [
    { host: mongo1..., priority: 3,   votes: 1 },
    { host: mongo2..., priority: 0.5, votes: 1 },
    { host: mongo3..., priority: 0.5, votes: 0 },  // <-- culprit?
    { host: mongo4..., priority: 5,   votes: 1 },
    { host: mongo5..., priority: 0.8, votes: 1 },
    { host: mongo6..., priority: 0.8, votes: 1 }   // was hidden before
  ]
};

rs.reconfig(conf);
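
In hindsight, a quick client-side sanity check before the reconfig would have flagged the bad member; a minimal sketch in plain mongo shell JavaScript, using only the conf object above and the same priority/votes rule the server later reported as BadValue:

// Reject any member that is non-voting but still has a non-zero priority.
var bad = conf.members.filter(function (m) {
    return m.votes === 0 && m.priority > 0;
});
if (bad.length > 0) {
    print("Invalid members (priority must be 0 when votes is 0):");
    bad.forEach(function (m) { printjson(m); });
} else {
    rs.reconfig(conf);
}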
As a result, mongod crashed on mongo{4,5,6}, leaving mongo{1,2,3} in SECONDARY state and unable to elect a PRIMARY.
In the logs we found this:
2016-07-04T07:05:31.842+0000 W REPL [replExecDBWorker-1] Not persisting new configuration in heartbeat response to disk because it is invalid: BadValue: priority must be 0 when non-voting (votes:0)
2016-07-04T07:05:31.842+0000 E REPL [ReplicationExecutor] Could not validate configuration received from remote node; Removing self until an acceptable configuration arrives; BadValue: priority must be 0 when non-voting (votes:0)
2016-07-04T07:05:31.842+0000 I REPL [ReplicationExecutor] New replica set config in use: { _id: "repl1", version: 106772, members: [ { _id: 23, host: "mongo1:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.8, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 24, host: "mongo2:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.4, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 25, host: "mongo3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.3, tags: {}, slaveDelay: 0, votes: 0 }, { _id: 26, host: "mongo4:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 27, host: "mongo5:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.6, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 28, host: "mongo6:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.5, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
And then:
2016-07-04T07:05:31.865+0000 I - [ReplicationExecutor] Invariant failure i < _members.size() src/mongo/db/repl/replica_set_config.cpp 560
I'm sure it was a mistake on my end to set priority > 0 with votes: 0 on mongo3, but the way the 3.2 + WT nodes reacted to it was certainly not nice.
Also, please suggest how to perform this switchover and upgrade in the least dangerous fashion.
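
For reference, the sequence I would try next (a rough sketch, not an official procedure: it assumes the mongoN:27017 hostnames from the log above, starts each step from the live rs.conf(), and changes one thing per rs.reconfig()):

// All commands run on the current primary (mongo1).

// Step 1: always start from the live config so _id, version and defaults are kept.
var cfg = rs.conf();

// Step 2: un-hide mongo6 and make it a voter; its priority stays 0 for now,
// so nothing about the election outcome changes yet.
cfg.members.forEach(function (m) {
    if (m.host === "mongo6:27017") { m.hidden = false; m.votes = 1; }
});
rs.reconfig(cfg);

// Step 3: in a separate reconfig, raise mongo4 above the current primary and
// give mongo5/mongo6 their final priorities; once mongo4 has caught up,
// the set should hand the PRIMARY role over to it on its own.
cfg = rs.conf();
cfg.members.forEach(function (m) {
    if (m.host === "mongo4:27017") { m.priority = 5; }
    if (m.host === "mongo5:27017" || m.host === "mongo6:27017") { m.priority = 0.8; }
});
rs.reconfig(cfg);

// Step 4: verify the new roles before touching mongo{1,2,3}.
rs.status().members.forEach(function (m) { print(m.name + "  " + m.stateStr); });

The point is only that each rs.reconfig() changes one thing, so a bad member definition like the one above stays isolated and easy to roll back.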
Attachments
Issue Links
- related to: DOCS-8579 Replica Set Upgrade and configuration validation (Closed)