[SERVER-57292] Introduce onStepDown cancelation token to fix shutdown memory leak Created: 28/May/21  Updated: 06/Dec/22  Resolved: 28/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: Backlog - Service Architecture
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-67077 TransactionCoordinator uses both the ... Open
Assigned Teams:
Service Arch
Participants:
Linked BF Score: 0

 Description   

The motivation for this is to allow a clean refactoring of the executor usage for BF-18913 and SERVER-51317. We do not cancel pending tasks in the transaction coordinator, so they show up as a memory leak at shutdown.

Switching the transaction coordinator to use AsyncWorkScheduler for all tasks is not possible because it can block for a while. Right now we rely on the fact that the Grid pool executor is not shutting down, as explained in SERVER-51316.

We cannot use any global flag like globalInShutdownDeprecated() because it breaks the termination sequence.

Proposed solution:

Add one or several cancelation tokens specifically for onStepDown, to be used in src/mongo/db/commands/shutdown_d.cpp, and cancel them in the proper order from stepDownForShutdown() and beginShutdown().

Once the foundation is in, I will wire the proper token into the transaction coordinator as part of the work on SERVER-51317.
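
A rough, standalone sketch of the idea follows. It does not use the server's actual cancelation classes. CancelSource, CancelToken, ShutdownTokens, stepDownForShutdownSketch() and beginShutdownSketch() are all hypothetical names invented for illustration, and the cancel order shown (step-down token first, then shutdown token) is an assumption, since the ticket does not specify it.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Shared state between a source and the tokens it hands out (hypothetical).
struct CancelState {
    std::mutex mutex;
    bool canceled = false;
    std::vector<std::function<void()>> callbacks;
};

// Read-only view that pending tasks can poll or subscribe to.
class CancelToken {
public:
    explicit CancelToken(std::shared_ptr<CancelState> state) : _state(std::move(state)) {}

    bool isCanceled() const {
        std::lock_guard<std::mutex> lk(_state->mutex);
        return _state->canceled;
    }

    // Runs cb when the source is canceled, or immediately if it already was.
    void onCancel(std::function<void()> cb) const {
        {
            std::lock_guard<std::mutex> lk(_state->mutex);
            if (!_state->canceled) {
                _state->callbacks.push_back(std::move(cb));
                return;
            }
        }
        cb();
    }

private:
    std::shared_ptr<CancelState> _state;
};

// Owner side: the only place that can trigger cancelation.
class CancelSource {
public:
    CancelSource() : _state(std::make_shared<CancelState>()) {}

    CancelToken token() const { return CancelToken(_state); }

    // Idempotent: the callbacks run once, outside the lock.
    void cancel() {
        std::vector<std::function<void()>> toRun;
        {
            std::lock_guard<std::mutex> lk(_state->mutex);
            if (_state->canceled) return;
            _state->canceled = true;
            toRun.swap(_state->callbacks);
        }
        for (auto& cb : toRun) cb();
    }

private:
    std::shared_ptr<CancelState> _state;
};

// One source per shutdown phase (assumed split, not taken from the ticket).
struct ShutdownTokens {
    CancelSource stepDown;
    CancelSource shutdown;
};

// Stand-in for stepDownForShutdown(): cancels only step-down-scoped work.
void stepDownForShutdownSketch(ShutdownTokens& tokens) {
    tokens.stepDown.cancel();
}

// Stand-in for beginShutdown(): step-down work first, then everything else.
void beginShutdownSketch(ShutdownTokens& tokens) {
    tokens.stepDown.cancel();
    tokens.shutdown.cancel();
}
```

In this sketch the transaction coordinator would hold tokens.stepDown.token() and register an onCancel callback that drops its pending tasks, so nothing is left outstanding when the executor is torn down.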



 Comments   
Comment by Andrew Shuvalov (Inactive) [ 28/May/21 ]

We spoke with matthew.saltz and we'll try to avoid doing that. Details will be in the BF.

Generated at Thu Feb 08 05:41:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.