- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- ALL
- Repl B (10/30/15)
- 0
In jstests/replsets/tags.js, among the 5 nodes, Node 1 has priority 2 and Node 2 has priority 3. In the following scenario, the lower-priority Node 1 steals the primary role from Node 2 and holds it until Node 2 takes over again.
1. Node 0 becomes the primary in term 1 at the beginning.
2. Node 2 starts a new election in term 2 because it has a higher priority than Node 0.
3. Node 1 gets the vote request and votes yes. It updates its term to 2.
4. Node 1 considers starting a new election because it still sees Node 0, which has a lower priority, as the legitimate primary. Node 1 schedules a take-over to run several seconds later.
5. Node 2 gathers enough votes and announces its win.
6. On Node 1, the scheduled take-over fires and steals the primary role from Node 2.
If step 4 happens before step 3, everything is fine, since the term update cancels the priority take-over. If step 4 happens after step 5, it is also fine, because Node 1 won't stand for election once it knows Node 2 is the new primary. Usually the window between step 2 and step 4 is only several milliseconds, but the race is still possible.
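The three orderings discussed above can be replayed in a toy model of Node 1's local state. This is only a sketch for reasoning about the race, not the server's code: the class, method names, and the priority of 1 for Node 0 are assumptions made for illustration (the ticket only says Node 0's priority is lower than Node 1's).

```python
from dataclasses import dataclass


@dataclass
class Node1View:
    """Hypothetical model of what Node 1 knows; not the server's data structures."""
    term: int = 1
    priority: int = 2
    primary_priority: int = 1  # assumed priority of Node 0
    takeover_scheduled: bool = False

    def on_vote_request(self, new_term: int) -> None:
        # Step 3: vote yes and adopt the newer term.
        # A term update cancels any pending priority take-over.
        if new_term > self.term:
            self.term = new_term
            self.takeover_scheduled = False

    def consider_takeover(self) -> None:
        # Step 4: schedule a take-over if the known primary has lower priority.
        if self.primary_priority < self.priority:
            self.takeover_scheduled = True

    def on_new_primary(self, priority: int) -> None:
        # Step 5: learn that Node 2 won. The term is unchanged,
        # so nothing cancels an already scheduled take-over.
        self.primary_priority = priority

    def fire_takeover(self) -> bool:
        # Step 6: the timer fires; True means Node 1 stands for election.
        fired = self.takeover_scheduled
        self.takeover_scheduled = False
        return fired


def run(order):
    """Apply the named events in the given order, then fire the timer."""
    node = Node1View()
    events = {
        "vote": lambda: node.on_vote_request(2),
        "schedule": node.consider_takeover,
        "node2_wins": lambda: node.on_new_primary(3),
    }
    for name in order:
        events[name]()
    return node.fire_takeover()


# Problematic ordering from the ticket: steps 3, 4, 5, then 6.
steals = run(["vote", "schedule", "node2_wins"])
# Step 4 before step 3: the term update cancels the take-over.
cancelled = run(["schedule", "vote", "node2_wins"])
# Step 4 after step 5: Node 1 already knows about the higher-priority primary.
never_scheduled = run(["vote", "node2_wins", "schedule"])
```

In this model only the first ordering leaves a take-over pending when the timer fires, which matches the description above.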
To solve this problem, a node could schedule the take-over only if the current primary is in the latest term that node knows of, which prevents the take-over from being scheduled in step 4. In other words, if the replica set is not stable, a node won't try to take over the primary role.
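A minimal sketch of the proposed guard, written as a standalone predicate. The function name and parameters are hypothetical and do not correspond to identifiers in the server; the priority of 1 for Node 0 in the examples is likewise an assumption.

```python
def should_schedule_takeover(my_term: int, my_priority: int,
                             primary_term: int, primary_priority: int) -> bool:
    """Decide whether a priority take-over may be scheduled.

    Hypothetical helper: the take-over is allowed only when the known
    primary was elected in the latest term this node has seen and has
    a lower priority than this node.
    """
    if primary_term != my_term:
        # The set is not stable: an election in a newer term is in flight.
        return False
    return primary_priority < my_priority


# Node 1 after voting in term 2, still seeing Node 0 as primary from term 1.
after_vote = should_schedule_takeover(my_term=2, my_priority=2,
                                      primary_term=1, primary_priority=1)
# Stable set in term 1 with lower-priority Node 0 as primary.
stable = should_schedule_takeover(my_term=1, my_priority=2,
                                  primary_term=1, primary_priority=1)
# Node 2 (priority 3) is primary in term 2.
higher_primary = should_schedule_takeover(my_term=2, my_priority=2,
                                          primary_term=2, primary_priority=3)
```

With this check, Node 1 in the scenario would skip scheduling once it has voted in term 2, while a take-over in a stable set is still allowed.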