[SERVER-65498] Recipient config validation may leave ReplicationCoordinator in a bad state Created: 12/Apr/22  Updated: 06/Dec/22  Resolved: 05/May/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Matt Broadstone Assignee: [DO NOT USE] Backlog - Server Serverless (Inactive)
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Serverless
Operating System: ALL
Participants:

 Description   

When a node receives a heartbeat containing a recipient config, it first checks whether the config is valid. A few lines later, it logs a warning and returns early if the validation produces a non-OK status. The replication coordinator is in state kConfigHBReconfiguring while this validation runs, and the early return never transitions it back to a state that can accept future reconfigs. As a result, once a node fails this validation, it can no longer accept any further reconfigs.
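The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the actual ReplicationCoordinator code: `Coordinator`, `Status`, `ScopeGuard`, and both `processRecipientConfig*` functions are hypothetical stand-ins, and the two-value `ConfigState` enum is a simplification. The guarded variant shows one possible way to ensure the state is reset on every exit path; it is not necessarily how the issue was ultimately fixed.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins for illustration only.
enum class ConfigState { kConfigSteady, kConfigHBReconfiguring };

struct Status {
    bool ok;
    std::string reason;
};

// Runs the supplied callback when the guard goes out of scope.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {}
    ~ScopeGuard() {
        if (_fn) _fn();
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _fn;
};

class Coordinator {
public:
    ConfigState state() const { return _state; }

    // Mirrors the reported bug: the early return on a failed validation
    // leaves the state stuck in kConfigHBReconfiguring.
    bool processRecipientConfigBuggy(const Status& validation) {
        if (_state != ConfigState::kConfigSteady) {
            return false;  // A reconfig appears to be in progress; reject.
        }
        _state = ConfigState::kConfigHBReconfiguring;
        if (!validation.ok) {
            return false;  // Bails early without resetting the state.
        }
        _state = ConfigState::kConfigSteady;
        return true;
    }

    // One possible remedy (an assumption, not the actual fix): reset the
    // state via a scope guard so every exit path restores it.
    bool processRecipientConfigGuarded(const Status& validation) {
        if (_state != ConfigState::kConfigSteady) {
            return false;
        }
        _state = ConfigState::kConfigHBReconfiguring;
        ScopeGuard resetState([this] { _state = ConfigState::kConfigSteady; });
        if (!validation.ok) {
            return false;  // The guard still restores kConfigSteady.
        }
        return true;  // A real implementation would install the config here.
    }

private:
    ConfigState _state = ConfigState::kConfigSteady;
};
```

In this sketch, a single failed validation in the buggy variant causes every later call to be rejected, even with a valid config, whereas the guarded variant returns to kConfigSteady and accepts the next valid config.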



 Comments   
Comment by Matt Broadstone [ 05/May/22 ]

Fixed in SERVER-66083
