[SERVER-39777] step down nodes with a high freeze timeout before validating them on shutdown Created: 22/Feb/19 Updated: 29/Oct/23 Resolved: 03/Apr/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.10, 4.0.13 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Judah Schvimer |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.0 |
| Sprint: | Repl 2019-03-11, Repl 2019-03-25, Repl 2019-04-08 |
| Participants: | |
| Linked BF Score: | 23 |
| Description |
|
If a node steps down in the middle of validation, the validation fails. Freezing secondary nodes, and stepping down primary nodes with a high freeze timeout, before validating them should prevent this. |
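The proposed fix can be sketched roughly as follows. This is a minimal illustration, not the committed patch: `quiesceBeforeValidate`, `runAdminCommand`, and `HIGH_FREEZE_SECS` are hypothetical names, and the 24-hour timeout is an assumed value, since the ticket only says "high". The command documents use the standard `replSetStepDown` and `replSetFreeze` replication commands.

```javascript
// Assumed value: the ticket only asks for a "high" freeze timeout.
const HIGH_FREEZE_SECS = 24 * 60 * 60;

// Hypothetical helper. `runAdminCommand` is any function that sends a command
// document to the node's admin database and returns the response document.
function quiesceBeforeValidate(runAdminCommand, isPrimary, freezeSecs = HIGH_FREEZE_SECS) {
    let cmd;
    if (isPrimary) {
        // Step the primary down now and keep it ineligible for re-election
        // for `freezeSecs` seconds.
        cmd = {replSetStepDown: freezeSecs, force: true};
    } else {
        // Keep the secondary from seeking election for `freezeSecs` seconds.
        cmd = {replSetFreeze: freezeSecs};
    }
    const res = runAdminCommand(cmd);
    if (!res || res.ok !== 1) {
        throw new Error("failed to quiesce node: " + JSON.stringify(res));
    }
    return cmd;
}
```

Once every node has been handled this way, no node should change replication state while `validate` runs. Note that on server versions of this era a real `replSetStepDown` can close client connections, so an actual script would likely need to tolerate a network error at that step; the sketch ignores that.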
| Comments |
| Comment by Githook User [ 13/Aug/19 ] |
|
Author: Judah Schvimer (judahschvimer, judah@mongodb.com)
Message: (cherry picked from commit ab99966275dce28a052446be4c70a500956f507b) |
| Comment by Githook User [ 03/Apr/19 ] |
|
Author: Judah Schvimer (judahschvimer, judah@mongodb.com)
Message: |
| Comment by Max Hirschhorn [ 26/Feb/19 ] |
|
FWIW, validate_collections_on_shutdown.js was intended to use the network retry logic in command_sequence_with_retries.js to tolerate stepdowns by simply retrying the listDatabases or validate commands. Forcing the stepdown to happen sooner (and not letting the node step back up) sounds more reliable than inspecting the raw response to the validate command and retrying it if it failed with an InterruptedDueToStepDown error. |
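For comparison, the response-inspection alternative described in this comment might look something like the sketch below. It is only an illustration of the approach the comment considers less reliable: `runWithStepdownRetry` and `runCommand` are hypothetical names, and it assumes a failed response carries `ok: 0` together with a `codeName` string.

```javascript
// Assumption: a failed command response has ok: 0 and a `codeName` field.
const STEPDOWN_CODE_NAME = "InterruptedDueToStepDown";

// Hypothetical helper. Re-runs `cmd` while the response says it was
// interrupted by a stepdown, up to `maxAttempts` attempts in total.
function runWithStepdownRetry(runCommand, cmd, maxAttempts = 3) {
    let res;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        res = runCommand(cmd);
        const interrupted = res.ok !== 1 && res.codeName === STEPDOWN_CODE_NAME;
        if (!interrupted) {
            // Success, or a failure unrelated to stepdown: hand it back as-is.
            return res;
        }
    }
    // Still interrupted after the last attempt; return the final response.
    return res;
}
```

The weakness is visible in the sketch: it depends on recognizing one specific error in the raw response and on the stepdown not recurring within the retry budget, whereas forcing the stepdown up front removes the race entirely.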