[SERVER-69905] Improve ServiceExecutorSynchronous worker threads Created: 22/Sep/22  Updated: 20/Jan/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Billy Donahue Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: None
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-70151 ServiceExecutorSynchronous thread_loc... Closed
is related to SERVER-64594 measure executor task stack high wate... Backlog
is related to SERVER-64584 Don't hard-code the connection thread... Backlog
is related to SERVER-30738 Implement stdx::thread with control o... Closed
Sub-Tasks:
Key
Summary
Type
Status
Assignee
SERVER-70145 reuse threads in ServiceExecutorSynch... Sub-task Closed Billy Donahue  
SERVER-71760 Don't detach service worker threads Sub-task Open Backlog - Service Architecture  
Assigned Teams:
Service Arch
Sprint: Service Arch 2022-12-26, Service Arch 2022-10-17, Service Arch 2022-11-14, Service Arch 2022-11-28, Service Arch 2022-12-12, Service Arch 2023-01-09
Participants:

 Description   

ServiceExecutorSynchronous creates worker threads when a client calls schedule on it to initiate a task chain. If the task calls schedule while it's executing, that task is enqueued to the worker thread. The worker stops when the queue is empty.

Issues with this:

  • Uses pthread instead of stdx::thread, so we have two kinds of threads in the system. This is done to control the stack size, but we should be okay to just use the default stack size on 64-bit systems. The stack size is tweaked as a canary for finding deep stack usage in tests, but I think this benefit is very small and we could measure high water mark in other ways.
  • Worker threads are detached, which makes synchronization and thread_local management very difficult. It also means they can run during process exit and can cause interference between unit test scenarios. Detached threads should be avoided if possible. We're already trying to synchronize on the end of these threads' callable function in order to update stats and join the executor, so detaching is kind of pointless.
  • (sub-task SERVER-70145) The biggest issue is that this executor makes a fresh thread for each toplevel task. The contract of this scheduler is that a toplevel task gets a dedicated thread, but it is not promising a FRESH brand new thread, only a dedicated one. So when the toplevel task completes, the worker thread should be reusable. But we don't reuse it. We let it exit and just make new threads for each connection. This seems extremely wasteful and we should just pool these workers and hand out leases to them. Essentially this is not so different from an ordinary ThreadPool except that the threads are different kinds of objects from the stdx::thread objects managed by ThreadPool.
  • We could be using boost::thread for the stack size control instead of raw pthreads.


 Comments   
Comment by Billy Donahue [ 30/Nov/22 ]

The min and max worker bucket idea (subtask SERVER-70145 ) was done as part of SERVER-70151, because without it, that ticket is a net speed penalty. This optimization is easy to do and buys back the 2x slowdown with a 10X speedup, so net impact should be something like a 5X speedup.

Comment by Billy Donahue [ 31/Oct/22 ]

Consider the discussion about establishing a min and max worker bucket sizes for the SESynchronous from this code review for SERVER-70151, which was left as follow-up work. https://github.com/10gen/mongo/pull/7886#discussion_r1005397198

Generated at Thu Feb 08 06:14:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.