[SERVER-78958] Setting up TLConnection concurrently with shutdown can result in leaking a ConnectionPool Created: 13/Jul/23  Updated: 07/Dec/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: George Wangensteen Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-81456 Suppress leak sanitizer failures for ... Closed
Assigned Teams:
Service Arch
Operating System: ALL
Sprint: Service Arch Prioritized List, Service Arch 2023-09-18
Participants:
Linked BF Score: 122

 Description   

TLConnection::setup takes a callback as an argument, which is produced by ConnectionPool::SpecificPool::guardCallback. That producer function puts a shared_ptr to the SpecificPool in the lambda capture list of the returned callback, which means the SpecificPool's lifetime is extended to the lifetime of the callback. Similarly, since the SpecificPool owns a shared_ptr to the parent ConnectionPool, its lifetime is likewise extended.
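
The ownership chain described above can be illustrated with a minimal, self-contained sketch. The types ConnectionPoolSketch and SpecificPoolSketch and the function callbackKeepsParentAlive are hypothetical stand-ins invented for illustration; they are not the real server classes, and the real guardCallback may differ in its details.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration only; not the real server classes.
struct ConnectionPoolSketch {};

struct SpecificPoolSketch : std::enable_shared_from_this<SpecificPoolSketch> {
    explicit SpecificPoolSketch(std::shared_ptr<ConnectionPoolSketch> p)
        : parent(std::move(p)) {}

    // Wraps a callback so that the returned callable holds a shared_ptr
    // to this object in its capture list.
    std::function<void()> guardCallback(std::function<void()> cb) {
        return [anchor = shared_from_this(), cb = std::move(cb)] { cb(); };
    }

    std::shared_ptr<ConnectionPoolSketch> parent;
};

// Returns true if the parent pool stays alive while only the callback
// references the chain, and is freed once the callback is destroyed.
bool callbackKeepsParentAlive() {
    auto parent = std::make_shared<ConnectionPoolSketch>();
    std::weak_ptr<ConnectionPoolSketch> watch = parent;

    auto specific = std::make_shared<SpecificPoolSketch>(parent);
    std::function<void()> cb = specific->guardCallback([] {});

    // Drop every owner except the callback.
    parent.reset();
    specific.reset();
    bool aliveWhileCallbackExists = !watch.expired();

    // Destroying the callback releases SpecificPoolSketch, then the parent.
    cb = nullptr;
    bool freedAfterCallbackDestroyed = watch.expired();

    return aliveWhileCallbackExists && freedAfterCallbackDestroyed;
}
```

In this sketch the callback is the last owner of the whole chain, so a callback that is never destroyed would pin both objects.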

 

That callback is scheduled to run after the TLConnection is set up or a timer expires. However, if the server shuts down before this process completes, it will cancel the timer. Because of an early return, the cancelled timer never readies the future that would cause the callback to be scheduled. Nothing in the code ensures that the final continuation of AsyncDBClient::connect from TLConnection::setup runs before process shutdown, so the callback may never be scheduled and run. If the callback is leaked as a result, it leaks a shared_ptr to a SpecificPool, which in turn results in a ConnectionPool being leaked.
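
The shutdown race above can be modelled with a rough sketch, under the assumption that cancellation simply causes an early return before the continuation is handed off. TimerSketch, PoolSketch, Outcome, and runScenario are hypothetical names invented for illustration; the real timer and future machinery is more involved.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the pool kept alive by the callback.
struct PoolSketch {};

// Hypothetical timer: firing runs and releases the continuation,
// unless the timer was cancelled first.
struct TimerSketch {
    std::function<void()> continuation;
    bool cancelled = false;

    void cancel() { cancelled = true; }

    void fire() {
        if (cancelled) {
            return;  // early return: continuation is neither run nor released
        }
        auto cb = std::exchange(continuation, nullptr);
        cb();  // cb is destroyed at the end of fire(), releasing its captures
    }
};

struct Outcome {
    bool callbackRan;
    bool poolStillReferenced;
};

// Runs one scenario and reports the state observed just before the
// timer itself is destroyed.
Outcome runScenario(bool shutdownFirst) {
    auto pool = std::make_shared<PoolSketch>();
    std::weak_ptr<PoolSketch> watch = pool;
    bool ran = false;

    TimerSketch timer;
    // The callback is now the only owner of the pool.
    timer.continuation = [anchor = std::move(pool), &ran] { ran = true; };

    if (shutdownFirst) {
        timer.cancel();
    }
    timer.fire();

    return {ran, !watch.expired()};
}
```

Here runScenario(false) models the normal path, where the callback runs and the pool is released, while runScenario(true) models shutdown, where the callback never runs and the pool is still referenced. The sketch frees everything when the local timer goes out of scope; the leak in the report corresponds to the case where nothing destroys the stored callback before the process exits.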



 Comments   
Comment by Jason Chan [ 14/Sep/23 ]

We think this leak has existed in older versions and the solution right now is a bit more involved. We should consider suppressing the sanitizer failure for now to avoid generating more BFs.

Generated at Thu Feb 08 06:39:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.