[SERVER-36817] replSetFreeze command run by stepdown thread may fail when server is already primary
| Created: | 23/Aug/18 | Updated: | 29/Oct/23 | Resolved: | 27/Aug/18 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.10, 4.0.6, 4.1.3 |
| Type: | Bug |
| Priority: | Major - P3 |
| Reporter: | Max Hirschhorn |
| Assignee: | Jonathan Abrahams |
| Resolution: | Fixed |
| Votes: | 0 |
| Labels: | tig-resmoke |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.0, v3.6 |
| Sprint: | TIG 2018-09-10 |
| Participants: | |
| Linked BF Score: | 0 |
| Story Points: | 1 |
| Description |
|
As part of the changes to address We either need to handle the "OperationFailure: cannot freeze node when primary or running for election. state: Primary" exception or prevent it from occurring. |
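The "handle the exception" option could look roughly like the sketch below. It is not the committed fix. The OperationFailure class here is a local stand-in for pymongo.errors.OperationFailure so that the example runs without a server, and freeze_unless_primary and run_command are hypothetical names (run_command is assumed to behave like pymongo's client.admin.command).

```python
class OperationFailure(Exception):
    """Local stand-in for pymongo.errors.OperationFailure (assumption)."""


# Substring of the server error message quoted in this ticket.
FREEZE_WHEN_PRIMARY = "cannot freeze node when primary or running for election"


def freeze_unless_primary(run_command, freeze_secs):
    """Run replSetFreeze through run_command.

    Returns True if the node was frozen and False if the server refused
    because the node is already primary (or running for election).
    Any other OperationFailure is re-raised unchanged.
    """
    try:
        run_command({"replSetFreeze": freeze_secs})
        return True
    except OperationFailure as err:
        if FREEZE_WHEN_PRIMARY in str(err):
            # The node stepped back up before the freeze was sent;
            # the caller decides how to recover.
            return False
        raise
```

Matching on the message text is fragile. It is used here only because the ticket quotes the message and not an error code.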
| Comments |
| Comment by Githook User [ 22/Dec/18 ] |
|
Author: {'username': 'hptabster', 'email': 'jonathan@mongodb.com', 'name': 'Jonathan Abrahams'}Message: (cherry picked from commit 0c0a4acea4a1c7bb579f5aaaa89a6f1545cf22ef) |
| Comment by Githook User [ 22/Dec/18 ] |
|
Author: {'username': 'hptabster', 'email': 'jonathan@mongodb.com', 'name': 'Jonathan Abrahams'}Message: (cherry picked from commit 0c0a4acea4a1c7bb579f5aaaa89a6f1545cf22ef) |
| Comment by Githook User [ 27/Aug/18 ] |
|
Author: {'name': 'Jonathan Abrahams', 'email': 'jonathan@mongodb.com', 'username': 'hptabster'}Message: |
| Comment by Max Hirschhorn [ 23/Aug/18 ] |
Allowing a node to step back up on its own violates the principle of ensuring the stepdown thread is in complete control over which node is primary at any moment. I'd be in favor of removing the stepdown_duration_secs configuration option and instead having it always be a very long time (e.g. 24 hours) so that the stepdown thread must run the replSetStepUp command for a node to ever become primary.
CC judah.schvimer
If we go down this route, then I think we should leave cleaning up the exception handling for the replSetStepUp command to |
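A minimal sketch of the proposal in this comment, assuming the stepdown thread sends commands through two callables (run_on_primary and run_on_secondary, both hypothetical names standing in for something like pymongo's client.admin.command on each node). The 24-hour value comes from the comment. The function name and the absence of any error handling are illustration only.

```python
# "A very long time (e.g. 24 hours)", per the comment.
LONG_STEPDOWN_SECS = 24 * 60 * 60


def step_down_then_step_up(run_on_primary, run_on_secondary):
    """Hand primaryship over under the stepdown thread's control.

    The old primary is told to stay stepped down for a long fixed period,
    so it cannot step back up on its own. A chosen secondary is then
    asked explicitly to run for election.
    """
    run_on_primary({"replSetStepDown": LONG_STEPDOWN_SECS})
    run_on_secondary({"replSetStepUp": 1})
```

With a fixed long stepdown period, the freeze-when-primary race described in this ticket should not arise in this sketch, because a node only becomes primary when the thread asks it to.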