[SERVER-13628] Votes:0 node can call election and become primary Created: 17/Apr/14  Updated: 10/Dec/14  Resolved: 20/May/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Asya Kamsky Assignee: Asya Kamsky
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

 Description   

I'm not 100% clear on whether this is an arbiter bug or a votes:0 node bug.

I created a replica set whose rs.conf() members are:

[
           { _id: 0.0, host: "localhost:27017", priority: 2.0, votes: 0 },
           { _id: 1.0, host: "localhost:29017", votes: 0.0, priority: 0.0, hidden: true },
           { _id: 2.0, host: "localhost:33333", arbiterOnly: true }
]

So there is only one vote in the cluster, and it belongs to the arbiter. I was under the impression that a non-voting node cannot call for an election (it can only veto one). And clearly the arbiter, which is not eligible to be primary, cannot call for an election (right?).

And yet when I started up node _id:0 (the only node eligible to become primary), it got elected.
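
To make the vote arithmetic concrete, here is a minimal sketch that tallies votes for the member list above. It is only a model of the configuration, not MongoDB's election code: the helper names are hypothetical, a missing votes or priority field is assumed to default to 1, and "majority" is taken to mean a strict majority of votes rather than of nodes.

```python
# Hypothetical model of the member list above; field names follow rs.conf().
members = [
    {"_id": 0, "host": "localhost:27017", "priority": 2, "votes": 0},
    {"_id": 1, "host": "localhost:29017", "priority": 0, "votes": 0, "hidden": True},
    {"_id": 2, "host": "localhost:33333", "arbiterOnly": True},
]


def votes_of(member):
    # Assumption: a member without an explicit "votes" field has one vote.
    return member.get("votes", 1)


def can_be_primary(member):
    # Assumption: arbiters and priority-0 members are never eligible,
    # and a missing "priority" field defaults to 1.
    return not member.get("arbiterOnly", False) and member.get("priority", 1) > 0


total_votes = sum(votes_of(m) for m in members)
majority = total_votes // 2 + 1  # strict majority of votes, not of nodes
voters = [m["host"] for m in members if votes_of(m) > 0]
candidates = [m["host"] for m in members if can_be_primary(m)]

print("total votes:", total_votes)
print("votes needed:", majority)
print("voters:", voters)
print("eligible for primary:", candidates)
```

Under these assumptions the only voter is the arbiter and the only candidate is the _id:0 node, so the sets of voters and candidates do not overlap at all.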

_id:0 node log:

2014-04-16T12:25:58.002-0400 [rsStart] replSet STARTUP2
2014-04-16T12:25:58.003-0400 [rsSync] replSet SECONDARY
2014-04-16T12:25:58.284-0400 [rsHealthPoll] replSet member localhost:33333 is up
2014-04-16T12:25:58.284-0400 [rsHealthPoll] replSet member localhost:33333 is now in state ARBITER
2014-04-16T12:25:58.285-0400 [rsHealthPoll] replSet member localhost:29017 is up
2014-04-16T12:25:58.285-0400 [rsHealthPoll] replSet member localhost:29017 is now in state SECONDARY
2014-04-16T12:25:58.302-0400 [rsMgr] not electing self, localhost:33333 would veto with 'I don't think localhost:27017 is electable'
2014-04-16T12:25:58.304-0400 [rsMgr] not electing self, localhost:33333 would veto with 'I don't think localhost:27017 is electable'
2014-04-16T12:25:58.452-0400 [conn1] end connection 127.0.0.1:63684 (1 connection now open)
2014-04-16T12:25:58.452-0400 [initandlisten] connection accepted from 127.0.0.1:63689 #4 (3 connections now open)
2014-04-16T12:26:04.288-0400 [rsMgr] replSet info electSelf 0
2014-04-16T12:26:04.458-0400 [conn4] end connection 127.0.0.1:63689 (1 connection now open)
2014-04-16T12:26:04.458-0400 [initandlisten] connection accepted from 127.0.0.1:63690 #5 (3 connections now open)
2014-04-16T12:26:05.011-0400 [rsMgr] replSet PRIMARY
2014-04-16T12:26:05.146-0400 [conn3] end connection 127.0.0.1:63686 (1 connection now open)
2014-04-16T12:26:05.147-0400 [initandlisten] connection accepted from 127.0.0.1:63693 #6 (2 connections now open)

arbiter log:

2014-04-16T12:25:57.065-0400 [conn1589] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:25:57.065-0400 [conn1589] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:25:57.065-0400 [conn1589] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:25:57.140-0400 [rsHealthPoll] replset info localhost:27017 thinks that we are down
2014-04-16T12:25:57.141-0400 [rsHealthPoll] replSet member localhost:27017 is up
2014-04-16T12:25:57.141-0400 [rsHealthPoll] replSet member localhost:27017 is now in state STARTUP
2014-04-16T12:25:58.284-0400 [conn1590] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:25:58.284-0400 [conn1590] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:25:58.284-0400 [conn1590] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:25:58.285-0400 [conn1590] run command admin.$cmd { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:25:58.285-0400 [conn1590] command: { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:25:58.286-0400 [conn1590] command admin.$cmd command: replSetFresh { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:125 0ms
2014-04-16T12:25:58.303-0400 [conn1590] run command admin.$cmd { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:25:58.303-0400 [conn1590] command: { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:25:58.303-0400 [conn1590] command admin.$cmd command: replSetFresh { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:125 0ms
2014-04-16T12:25:59.067-0400 [conn1589] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:25:59.067-0400 [conn1589] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:25:59.067-0400 [conn1589] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:25:59.142-0400 [rsHealthPoll] replSet member localhost:27017 is now in state SECONDARY
2014-04-16T12:26:00.286-0400 [conn1590] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:00.286-0400 [conn1590] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:00.286-0400 [conn1590] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:01.068-0400 [conn1589] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:01.068-0400 [conn1589] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:01.068-0400 [conn1589] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:02.288-0400 [conn1590] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:02.288-0400 [conn1590] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:02.288-0400 [conn1590] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:03.070-0400 [conn1589] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:03.070-0400 [conn1589] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:03.070-0400 [conn1589] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:04.287-0400 [conn1590] run command admin.$cmd { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:26:04.287-0400 [conn1590] command: { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 }
2014-04-16T12:26:04.287-0400 [conn1590] command admin.$cmd command: replSetFresh { replSetFresh: 1, set: "asyaRS", opTime: new Date(6002926844448342017), who: "localhost:27017", cfgver: 64513, id: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:70 0ms
2014-04-16T12:26:04.288-0400 [conn1590] run command admin.$cmd { replSetElect: 1, set: "asyaRS", who: "localhost:27017", whoid: 0, cfgver: 64513, round: ObjectId('534eaf1c89732754cb5d8646') }
2014-04-16T12:26:04.289-0400 [conn1590] command: { replSetElect: 1, set: "asyaRS", who: "localhost:27017", whoid: 0, cfgver: 64513, round: ObjectId('534eaf1c89732754cb5d8646') }
2014-04-16T12:26:04.289-0400 [conn1590] replSet received elect msg { replSetElect: 1, set: "asyaRS", who: "localhost:27017", whoid: 0, cfgver: 64513, round: ObjectId('534eaf1c89732754cb5d8646') }
2014-04-16T12:26:04.289-0400 [conn1590] replSet info voting yea for localhost:27017 (0)
2014-04-16T12:26:04.289-0400 [conn1590] command admin.$cmd command: replSetElect { replSetElect: 1, set: "asyaRS", who: "localhost:27017", whoid: 0, cfgver: 64513, round: ObjectId('534eaf1c89732754cb5d8646') } ntoreturn:1 keyUpdates:0 numYields:0  reslen:66 0ms
2014-04-16T12:26:04.289-0400 [conn1590] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:04.290-0400 [conn1590] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 }
2014-04-16T12:26:04.290-0400 [conn1590] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:27017", fromId: 0 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:05.071-0400 [conn1589] Socket recv() conn closed? 127.0.0.1:63559
2014-04-16T12:26:05.072-0400 [conn1589] SocketException: remote: 127.0.0.1:63559 error: 9001 socket exception [CLOSED] server [127.0.0.1:63559]
2014-04-16T12:26:05.073-0400 [conn1591] run command admin.$cmd { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:05.073-0400 [conn1591] command: { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 }
2014-04-16T12:26:05.074-0400 [conn1591] command admin.$cmd command: replSetHeartbeat { replSetHeartbeat: "asyaRS", v: 64513, pv: 1, checkEmpty: false, from: "localhost:33333", fromId: 2 } ntoreturn:1 keyUpdates:0 numYields:0  reslen:149 0ms
2014-04-16T12:26:05.148-0400 [rsHealthPoll] replSet member localhost:27017 is now in state PRIMARY
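
Most of the arbiter log above is heartbeat traffic. As a reading aid, here is a minimal sketch that keeps only the election-related lines. The marker list is an assumption based on the messages visible in these two logs, not a complete list of what mongod can emit, and the function name is hypothetical.

```python
import re

# Assumed markers, taken from messages that appear in the logs above.
ELECTION_MARKERS = re.compile(
    r"replSetElect|electSelf|voting yea|would veto|"
    r"replSet PRIMARY|is now in state PRIMARY"
)


def election_lines(log_text):
    """Return the log lines that mention an election-related message."""
    return [line for line in log_text.splitlines() if ELECTION_MARKERS.search(line)]


# Four lines copied from the logs above; the third is not election-related.
sample = "\n".join([
    "2014-04-16T12:26:04.288-0400 [rsMgr] replSet info electSelf 0",
    "2014-04-16T12:26:04.289-0400 [conn1590] replSet info voting yea for localhost:27017 (0)",
    "2014-04-16T12:25:59.142-0400 [rsHealthPoll] replSet member localhost:27017 is now in state SECONDARY",
    "2014-04-16T12:26:05.148-0400 [rsHealthPoll] replSet member localhost:27017 is now in state PRIMARY",
])

for line in election_lines(sample):
    print(line)
```

Run against the full logs, this leaves the sequence that matters here: the veto messages, electSelf on the _id:0 node, the arbiter voting yea, and the transition to PRIMARY.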

I think that if votes:0 nodes can call for an election and can become primary, it becomes really difficult to reason about majorities and various failure scenarios.



 Comments   
Comment by Asya Kamsky [ 18/Apr/14 ]

I think the issue is more with figuring out what majority means when not every node has votes:1.

Comment by Scott Hernandez (Inactive) [ 17/Apr/14 ]

Why do you think that votes:0 nodes shouldn't be able to start elections? Why were you under the impression that wasn't how it worked?

Is there some failure/problem which would be caused by the fact that the election events are unrelated to the nodes' votes?
