[SERVER-84361] Check that CreateCollectionCoordinatorLegacy terminates on error Created: 21/Dec/23 Updated: 22/Jan/24 Resolved: 22/Jan/24 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Romans Kasperovics | Assignee: | Pol Pinol |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | car-investigation | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: |
Catalog and Routing
|
| Sprint: | CAR Team 2024-01-22 |
| Participants: |
| Description |
|
Check that throwing in |
| Comments |
| Comment by Pol Pinol [ 22/Jan/24 ] |
|
I ran a custom test that triggers an exception in the first phase of the legacy coordinator, which runs without buildPhaseHandler, and compared the results with throwing an exception in the following phase, kCommit, which is registered through buildPhaseHandler. In both runs, the exception is followed by the .onCompletion step of the future chain. Because the coordinator must then return an error, a cleanup is performed. The log traces from both runs confirm that resources are released.

The only difference between the runs is where the coordinator instance is removed from the registry (config.system.sharding_ddl_coordinators). If the exception is thrown in the first phase, before buildPhaseHandler has executed, the coordinator document does not exist, and a separate code path is executed to remove the instance. If buildPhaseHandler has executed, the coordinator document is deleted, and the PrimaryOnlyServiceOpObserver is responsible for removing the instance from the registry. Finally, both runs release the remaining resources, i.e. the DDL locks.

To summarize, although the two cases use different implementations for releasing resources, I don't see a place where throwing without installing the coordinator document can lead to zombie coordinator instances. If there are no other concerns, I'm closing this ticket. |
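
The two cleanup paths described in the comment can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyService, run, on_doc_deleted and leftovers are hypothetical names invented for illustration, the sets stand in for the instance registry, the persisted coordinator documents and the DDL locks, and none of it is MongoDB server code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical stand-in for the coordinator service; not MongoDB code."""
    instances: set = field(default_factory=set)   # in-memory coordinator instances
    state_docs: set = field(default_factory=set)  # stand-in for persisted coordinator documents
    ddl_locks: set = field(default_factory=set)   # stand-in for held DDL locks
    events: list = field(default_factory=list)    # trace of what the cleanup did

    def on_doc_deleted(self, name):
        # Plays the role the comment assigns to PrimaryOnlyServiceOpObserver.
        self.instances.discard(name)
        self.events.append("instance removed by op observer")

    def run(self, name, throw_in):
        """Run one toy coordinator that fails in phase 'first' or 'commit'."""
        self.ddl_locks.add(name)
        self.instances.add(name)
        try:
            if throw_in == "first":
                # Assumption: the first phase runs before any document is persisted.
                raise RuntimeError("failure in first phase")
            # Assumption: the phase handler persists the document before its phase runs.
            self.state_docs.add(name)
            if throw_in == "commit":
                raise RuntimeError("failure in kCommit")
        except RuntimeError as exc:
            self.events.append(f"error: {exc}")
        finally:
            # Plays the role of the .onCompletion continuation.
            if name in self.state_docs:
                self.state_docs.discard(name)
                self.on_doc_deleted(name)
            else:
                self.instances.discard(name)
                self.events.append("instance removed directly")
            self.ddl_locks.discard(name)
        return self.events


def leftovers(service):
    """Count resources still held after a run; zero means nothing leaked."""
    return len(service.instances) + len(service.state_docs) + len(service.ddl_locks)
```

In this model, calling run("coord", "first") and run("coord", "commit") on fresh ToyService objects takes different removal paths, yet leftovers returns 0 in both cases, which mirrors the conclusion that neither path leaves a zombie instance.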