[SERVER-65498] Recipient config validation may leave ReplicationCoordinator in a bad state Created: 12/Apr/22 Updated: 06/Dec/22 Resolved: 05/May/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Matt Broadstone | Assignee: | [DO NOT USE] Backlog - Server Serverless (Inactive) |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Serverless |
| Operating System: | ALL |
| Participants: |
| Description |
|
When a node receives a heartbeat containing a recipient config, it first checks whether the config is valid. A few lines later, it logs a warning and returns early if the validation produces a non-OK status. The replication coordinator is in the kConfigHBReconfiguring state while this validation runs, and the early return does not transition it back to a state that can accept future reconfigs. As a result, once a node fails this validation, it can no longer accept any further reconfigs. |
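The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the server's code: `Coordinator`, `validateRecipientConfig`, `StateResetGuard`, and both `handleHeartbeat*` methods are hypothetical names invented for illustration, and the use of `kConfigSteady` as the "can accept reconfigs" state is an assumption. The sketch only shows how an early return can strand a state machine in kConfigHBReconfiguring, and one possible way (a scope guard) to make every exit path restore the state.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for the coordinator's config state.
// kConfigSteady is assumed here to be the state that accepts new reconfigs.
enum class ConfigState { kConfigSteady, kConfigHBReconfiguring };

// Hypothetical validator: treats an empty recipient config as invalid.
bool validateRecipientConfig(const std::string& config) {
    return !config.empty();
}

// Hypothetical scope guard: restores the steady state on every exit path.
class StateResetGuard {
public:
    explicit StateResetGuard(ConfigState& state) : _state(state) {}
    ~StateResetGuard() { _state = ConfigState::kConfigSteady; }
    StateResetGuard(const StateResetGuard&) = delete;
    StateResetGuard& operator=(const StateResetGuard&) = delete;

private:
    ConfigState& _state;
};

class Coordinator {
public:
    ConfigState state() const { return _state; }

    // Buggy shape: the early return on failed validation leaves the
    // coordinator in kConfigHBReconfiguring, so later reconfigs are refused.
    bool handleHeartbeatBuggy(const std::string& recipientConfig) {
        if (_state != ConfigState::kConfigSteady) {
            return false;  // a reconfig appears to be in progress
        }
        _state = ConfigState::kConfigHBReconfiguring;
        if (!validateRecipientConfig(recipientConfig)) {
            return false;  // warning would be logged here; state is never reset
        }
        _state = ConfigState::kConfigSteady;
        return true;
    }

    // Possible remedy: the guard resets the state even on the early return.
    bool handleHeartbeatFixed(const std::string& recipientConfig) {
        if (_state != ConfigState::kConfigSteady) {
            return false;
        }
        _state = ConfigState::kConfigHBReconfiguring;
        StateResetGuard guard(_state);
        if (!validateRecipientConfig(recipientConfig)) {
            return false;  // guard restores kConfigSteady on the way out
        }
        return true;
    }

private:
    ConfigState _state = ConfigState::kConfigSteady;
};
```

In this sketch, calling `handleHeartbeatBuggy("")` once causes every later call on the same object to be refused, even with a valid config, whereas `handleHeartbeatFixed("")` leaves the object able to accept a subsequent valid config.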
| Comments |
| Comment by Matt Broadstone [ 05/May/22 ] |
|
Fixed in |