[SERVER-29030] Announce new primary via heartbeat requests Created: 01/May/17 Updated: 30/Oct/23 Resolved: 18/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0, 4.4.4 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Siyuan Zhou | Assignee: | William Schultz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | former-quick-wins | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||||||
| Backport Requested: | v4.4 |
| Sprint: | Repl 2020-02-10, Repl 2020-05-04, Repl 2020-05-18, Repl 2020-06-01 | ||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 25 | ||||||||||||||||||||||||||||||||||||||||
| Description |
|
We should probably include replica set metadata in heartbeat requests. A new primary sends a round of heartbeats as soon as it wins an election, so this would let the other nodes learn that it is the primary immediately, instead of through heartbeat responses or the data replication spanning tree. |
| Comments |
| Comment by Githook User [ 14/Jan/21 ] | |||||||||||||||||||||
|
Author: {'name': 'William Schultz', 'email': 'william.schultz@mongodb.com', 'username': 'will62794'}Message: (cherry picked from commit 912735588386424712a1525da1574a4554bf1787) | |||||||||||||||||||||
| Comment by Githook User [ 18/May/20 ] | |||||||||||||||||||||
|
Author: {'name': 'William Schultz', 'email': 'william.schultz@mongodb.com', 'username': 'will62794'}Message: | |||||||||||||||||||||
| Comment by Lingzhi Deng [ 29/Apr/20 ] | |||||||||||||||||||||
|
Since the exhaust oplog fetching project in versions >= 4.4, syncing nodes no longer send getMore commands to their sync source, so the replication term is no longer propagated up the spanning tree as it used to be. It might therefore also be interesting to test how quickly an old primary steps down after a new primary steps up, in the case where there is a network partition between the old and the new primary and the third node is initially syncing from the old primary. | |||||||||||||||||||||
| Comment by William Schultz (Inactive) [ 29/Apr/20 ] | |||||||||||||||||||||
|
Interestingly, it looks like secondaries on versions >= 4.4 already end up learning of a new primary very quickly as a byproduct of the new behavior where we do a reconfig on step up introduced in
The secondary is effectively learning of the new primary instantaneously. If we look at filtered logs from one of these runs, we can understand why:
After 35021 gets elected primary, it increments its config term via reconfig, and will also re-schedule heartbeats, which we do on every reconfig. When the secondary 35022 receives one of these new heartbeats with a newer config than its own, it will also immediately schedule a heartbeat to the new primary. In response to this immediate heartbeat it will learn of the new primary's state, which we see in the logs with the "Member is in new state" messages. We can further verify this by removing the reconfig on step up and measuring the time to learn of the new primary. If we comment out this code block and run the same test for 5 runs, we see the following stats:
which seems to make it clear that learning of a new primary can be bottlenecked on the heartbeat interval (2000ms). Even though secondaries do not appear to be slow to learn of a new primary in versions >= 4.4 in practice (based on these simple tests, at least), I think it's still valuable to implement this change, since we don't want to rely on the unrelated reconfig-on-step-up behavior to ensure that secondaries learn of new primaries quickly. It would also be a valuable improvement for older branches where we don't do this reconfig, e.g. v4.2 and v4.0. | |||||||||||||||||||||
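To make the heartbeat-interval bottleneck concrete, here is a rough sketch of how long a secondary might wait to learn of a new primary. It is a simplified model, not server code: the function name `learn_delay_ms` and its parameters are hypothetical, network latency is assumed to be zero, and a heartbeat carrying a newer config is assumed to trigger an immediate heartbeat back to the new primary, as the comment above describes.

```python
HEARTBEAT_INTERVAL_MS = 2000  # heartbeat interval cited in the comment above


def learn_delay_ms(last_heartbeat_ms, election_ms, reconfig_on_step_up,
                   interval_ms=HEARTBEAT_INTERVAL_MS):
    """Estimate the delay before a secondary learns of a new primary.

    Hypothetical model: the secondary sends heartbeats to the (future)
    primary every `interval_ms`, starting at `last_heartbeat_ms`, and
    learns the new state from the first response after the election.
    """
    if reconfig_on_step_up:
        # The new primary's heartbeat carries a newer config, so the
        # secondary schedules a heartbeat back immediately (assumed
        # zero latency in this sketch).
        return 0
    # Otherwise the secondary waits for its next regularly scheduled
    # heartbeat. A heartbeat sent exactly at election time is treated
    # as having just missed the new state (worst case: a full interval).
    elapsed = (election_ms - last_heartbeat_ms) % interval_ms
    return interval_ms - elapsed


# Without reconfig on step up, the wait depends on heartbeat phase and
# is bounded by the interval; with it, the secondary learns right away.
delay_without_reconfig = learn_delay_ms(0, 500, reconfig_on_step_up=False)
delay_with_reconfig = learn_delay_ms(0, 500, reconfig_on_step_up=True)
```

Under these assumptions the delay without the reconfig ranges from just above 0 up to the full interval, which is consistent with the interval being the bottleneck.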
| Comment by Siyuan Zhou [ 28/Apr/20 ] | |||||||||||||||||||||
|
william.schultz the alternative sounds good to me. | |||||||||||||||||||||
| Comment by William Schultz (Inactive) [ 27/Apr/20 ] | |||||||||||||||||||||
|
siyuan.zhou The original (reverted) implementation of this ticket added a 'primaryId' field to heartbeat requests and changed the behavior so that we restart heartbeats upon receiving a heartbeat request from a node N whose 'primaryId' is N and differs from the receiving node's current view of the primary, i.e. its 'currentPrimaryIndex'. A potentially more focused alternative would be to directly update our view of the current primary when receiving such a heartbeat request, instead of re-scheduling all heartbeats. I think we should only update our view of the primary if the sender node is primary and the term of the heartbeat request is >= our own, to avoid updating our 'currentPrimaryIndex' to a stale primary. Did you have any other thoughts on an alternative solution here? | |||||||||||||||||||||
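The rule proposed in this comment can be sketched roughly as follows. This is an illustrative model in Python, not the server's C++ implementation: the `HeartbeatRequest` and `NodeView` types and the `process_heartbeat_request` function are hypothetical, and representing 'primaryId' and 'currentPrimaryIndex' as comparable integers is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatRequest:
    sender_index: int           # hypothetical: index of the sending node
    term: int                   # term carried by the heartbeat request
    primary_id: Optional[int]   # modeled on the 'primaryId' field


@dataclass
class NodeView:
    term: int                                    # receiver's current term
    current_primary_index: Optional[int] = None  # receiver's view of the primary


def process_heartbeat_request(view: NodeView, req: HeartbeatRequest) -> bool:
    """Update the receiver's view of the primary; return True if it changed."""
    # Only act when the sender itself claims to be the primary.
    if req.primary_id != req.sender_index:
        return False
    # Ignore requests from an older term to avoid adopting a stale primary.
    if req.term < view.term:
        return False
    # Nothing to do if we already consider the sender to be the primary.
    if view.current_primary_index == req.sender_index:
        return False
    # Directly update the view instead of re-scheduling all heartbeats.
    # Term updates are deliberately left out of this sketch.
    view.current_primary_index = req.sender_index
    return True
```

For example, a receiver at term 3 that still believes node 0 is primary would adopt node 1 on a request with `term=3` and `primary_id=1` from node 1, but would ignore the same request if it carried `term=2`.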
| Comment by Githook User [ 13/Feb/20 ] | |||||||||||||||||||||
|
Author: {'username': 'rtimmons', 'name': 'Ryan Timmons', 'email': 'ryan.timmons@mongodb.com'}Message: Revert " This reverts commit 994fdd99bb6adb2cf9c7dd4061c2035188c2c8da. | |||||||||||||||||||||
| Comment by Githook User [ 12/Feb/20 ] | |||||||||||||||||||||
|
Author: {'name': 'Ryan Timmons', 'username': 'rtimmons', 'email': 'ryan.timmons@mongodb.com'}Message: | |||||||||||||||||||||
| Comment by Siyuan Zhou [ 15/Oct/19 ] | |||||||||||||||||||||
|
We could also have secondaries restart heartbeats immediately upon receiving a heartbeat from the new primary, as we do in other cases. |