[SERVER-76621] Thread pool task executor can cause memory leak when handling exhaust command. Created: 27/Apr/23 Updated: 29/Oct/23 Resolved: 02/Jun/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.1.0-rc0, 6.0.7, 7.0.0-rc3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Jason Chan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: |
v7.0, v6.0
|
||||||||
| Sprint: | Service Arch 2023-05-15, Service Arch 2023-05-29, Service Arch 2023-06-12 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 0 | ||||||||
| Description |
|
Thread pool task executor when it handles exhaust commands can leak resources. We noticed the leakage with SingleServerDiscoveryMonitor which is the only consumer at this moment which uses thread pool task executor to schedule exhaust commands . Below is the racy sequence with exhaust command workflow. 2. Thread 2: Cancels the exhaust command which will cancel the cbState and resets the callback in cbstate to release any resources the callback is holding. 3. Thread 1: Thread resumes, then acquires the executor mutex lock and sets the callback in cbState back to the original callback function. And, this can cause memory leak. In the BF, we were ending up with indirect leak due to this circular reference: SingleServerDiscoveryMonitor holds a reference to TaskExecutor::CallbackHandle and which in-turn holds a reference to ThreadPoolTaskExecutor::CallbackState and that contains CallbackFn which holds a reference to SingleServerDiscoveryMonitor. |
| Comments |
| Comment by Githook User [ 07/Jun/23 ] |
|
Author: {'name': 'Jason Chan', 'email': 'jason.chan@mongodb.com', 'username': 'jasonjhchan'}Message: (cherry picked from commit 00ae304e83e4e2091262fcda04fa2bc8b8075bbd) |
| Comment by Githook User [ 05/Jun/23 ] |
|
Author: {'name': 'Jason Chan', 'email': 'jason.chan@mongodb.com', 'username': 'jasonjhchan'}Message: (cherry picked from commit 00ae304e83e4e2091262fcda04fa2bc8b8075bbd) |
| Comment by Githook User [ 01/Jun/23 ] |
|
Author: {'name': 'Jason Chan', 'email': 'jason.chan@mongodb.com', 'username': 'jasonjhchan'}Message: |
| Comment by Suganthi Mani [ 27/Apr/23 ] |
|
Two ways to fix the problem: 1) The fix is to run this block in a single critical section 2) Break the cycle, by reseting the handles after shutting down the singleServerDiscoveryMonitor. I prefer fix#1 as I feel fix#2 is sub-optimal as there are other ways to cancel the requests scheduled in thread pool task executor. To be noted, we can also get into similar leak problem when callback to last response in the exhaust cursor races with callback to previous response. I see lot of fragile places in thread pool task executor, example, if blocks like this where we may do some work thinking that if check is still valid. |