[SERVER-84281] Internal task _reconfigToRemoveNewlyAddedField() should be resilient to stepdown errors. Created: 18/Dec/23  Updated: 19/Dec/23  Resolved: 19/Dec/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Backlog - Replication Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Assigned Teams:
Replication
Operating System: ALL
Participants:
Linked BF Score: 142

 Description   

Due to SERVER-70127, system operations are now killable on stepdown by default. As a result, internal operations such as _reconfigToRemoveNewlyAddedField() can be interrupted by a stepdown (see this code, which can throw interruption errors) and may throw an InterruptedDueToReplStateChange error. Unfortunately, _reconfigToRemoveNewlyAddedField() does not catch these errors. Additionally, since this internal task runs on an internal thread without error handling, an uncaught error can crash the server.
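
To illustrate the failure mode described above, here is a minimal standalone C++ sketch. It is not the server code: the exception type, OpState, checkForInterrupt(), reconfigTask() and runGuarded() are all hypothetical stand-ins. It only shows how an interruption thrown inside a task escapes unless the caller catches it.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the interruption error; not MongoDB's real type.
struct InterruptedDueToReplStateChange : std::runtime_error {
    InterruptedDueToReplStateChange()
        : std::runtime_error("interrupted due to replica set state change") {}
};

// Hypothetical operation state: 'killed' is set when a stepdown kills the op.
struct OpState {
    bool killed = false;
};

// Models an interruption check inside the task: throws if the op was killed.
void checkForInterrupt(const OpState& op) {
    if (op.killed) {
        throw InterruptedDueToReplStateChange();
    }
}

// Models the internal task body. It has no try/catch of its own, so the
// exception propagates to whoever invoked the task.
void reconfigTask(const OpState& op) {
    checkForInterrupt(op);
    // ... the reconfig work would happen here ...
}

// Runs a task and turns an interruption into a status string instead of
// letting it escape the calling thread.
std::string runGuarded(const std::function<void()>& task) {
    try {
        task();
        return "ok";
    } catch (const InterruptedDueToReplStateChange& ex) {
        return std::string("interrupted: ") + ex.what();
    }
}

// Small usage example for the two cases.
void demo() {
    OpState live;
    assert(runGuarded([&] { reconfigTask(live); }) == "ok");

    OpState stepdownKilled;
    stepdownKilled.killed = true;
    assert(runGuarded([&] { reconfigTask(stepdownKilled); }) != "ok");
}
```

In standard C++, an exception that escapes the top-level function of a std::thread calls std::terminate, which is the general mechanism by which an unhandled error on a background thread can bring a process down.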



 Comments   
Comment by Suganthi Mani [ 19/Dec/23 ]

Apologies for the confusion. I realized that the replExecutor that runs _reconfigToRemoveNewlyAddedField() has the flag _systemOperationKillable set to false, so the problem described here can't happen. The BF-31175 problem is due to a different issue.
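
For readers following along, a rough standalone C++ sketch of the idea: a per-executor flag decides whether the operations it creates are killable on stepdown. Op, Executor and killOpsOnStepdown() are hypothetical names for illustration only and do not reflect the actual server classes or how the real flag is wired up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical operation record.
struct Op {
    bool killableOnStepdown;  // copied from the executor that created the op
    bool killed;              // set when a stepdown kills the op
};

// Hypothetical executor: its flag decides whether ops it creates are killable.
class Executor {
public:
    explicit Executor(bool systemOperationKillable)
        : _systemOperationKillable(systemOperationKillable) {}

    Op makeOp() const {
        return Op{_systemOperationKillable, false};
    }

private:
    bool _systemOperationKillable;
};

// Models a stepdown: only ops marked killable are killed.
void killOpsOnStepdown(std::vector<Op>& ops) {
    for (auto& op : ops) {
        if (op.killableOnStepdown) {
            op.killed = true;
        }
    }
}

// Small usage example: the op from the non-killable executor survives.
void demoStepdown() {
    std::vector<Op> ops{Executor(false).makeOp(), Executor(true).makeOp()};
    killOpsOnStepdown(ops);
    assert(!ops[0].killed);
    assert(ops[1].killed);
}
```

Under this model, a task run by an executor whose flag is false never has its operation killed by a stepdown, so it never sees the interruption in the first place.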

Generated at Thu Feb 08 06:54:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.