[SERVER-56223] Exceptions in PrimaryOnlyService::_rebuildService() are silently ignored Created: 21/Apr/21  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: Backlog
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: sa-remove-fv-backlog-22
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Service Arch
Operating System: ALL
Participants:
Story Points: 2

 Description   

On step up we rebuild the primary-only services by invoking:

  1. PrimaryOnlyService::_rebuildService()
  2. PrimaryOnlyService::_rebuildInstances(term)

For exceptions thrown inside the dedicated try/catch block of (2), we correctly set the POS _state to kRebuildFailed and store the error in _rebuildStatus.

All other exceptions, whether thrown in (1) or in (2) outside that try/catch block, are simply ignored. When this happens the POS remains in the kRebuilding state until the next step up, and in the meantime every attempt to create a new instance of that service hangs.

My proposal is to catch all exceptions thrown anywhere in the rebuild future chain and to set _state and _rebuildStatus according to the error.
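
A minimal sketch of the proposed behaviour, assuming a simplified synchronous model rather than the server's real future chain. ToyService, its constructor arguments, and rebuildError() are hypothetical names invented for illustration; only the state names and the two rebuild steps come from the description above. Locking is omitted for brevity.

```cpp
#include <exception>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for the POS states; not the real server enum.
enum class State { kPaused, kRebuilding, kRunning, kRebuildFailed };

// Hypothetical, simplified model of a primary-only service.
class ToyService {
public:
    // The two callbacks stand in for steps (1) and (2) of the rebuild.
    ToyService(std::function<void()> rebuildService,
               std::function<void(long long)> rebuildInstances)
        : _rebuildService(std::move(rebuildService)),
          _rebuildInstances(std::move(rebuildInstances)) {}

    // Runs both rebuild steps. An exception from either step is recorded
    // instead of being dropped, so the service never stays in kRebuilding.
    void onStepUp(long long term) {
        _state = State::kRebuilding;
        _rebuildError.reset();
        try {
            _rebuildService();        // step (1)
            _rebuildInstances(term);  // step (2)
        } catch (const std::exception& ex) {
            _setRebuildFailed(ex.what());
            return;
        } catch (...) {
            _setRebuildFailed("unknown error during rebuild");
            return;
        }
        _state = State::kRunning;
    }

    State state() const { return _state; }

    // Holds the recorded error when the last rebuild failed.
    const std::optional<std::string>& rebuildError() const { return _rebuildError; }

private:
    void _setRebuildFailed(std::string reason) {
        _state = State::kRebuildFailed;
        _rebuildError = std::move(reason);
    }

    std::function<void()> _rebuildService;
    std::function<void(long long)> _rebuildInstances;
    State _state = State::kPaused;
    std::optional<std::string> _rebuildError;
};
```

With a single catch site around both steps, a caller trying to create an instance can inspect the failed state and the stored error and fail fast instead of waiting for the next step up. In the real code the equivalent would presumably be an error handler attached to the end of the rebuild future chain, which is part of what the design work needs to confirm.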

Acceptance criteria: 

Design a solution for this and provide a more informed LOE. 
