[SERVER-53761] Determine strategy to make the ServiceEntryPointCommon more asynchronous Created: 13/Jan/21  Updated: 05/Jan/23

Status: Blocked
Project: Core Server
Component/s: Performance
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Andrew Shuvalov (Inactive) Assignee: Backlog - Service Architecture
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-52783 [cleanup] Make tenant_migration_donor... Closed
Related
related to SERVER-54208 Remove blocking call to Fetcher::join... Closed
Assigned Teams:
Service Arch
Sprint: Sharding 2021-02-22, Sharding 2021-03-08
Participants:

 Description   

Background: this is a follow-up to https://jira.mongodb.org/browse/SERVER-53505, which tried to prevent the service entry point from blocking while waiting on the tenant migration critical section blocker.

The motivation is the anticipated outage when a tenant whose database is being migrated to another replica set sends too many clustered reads that are blocked by the migration critical section. The executor serving client reads will either run out of free threads or spawn too many threads to compensate for the blocked ones.

Besides the blocker, we also have a prototype for asynchronous execution in runAsync; however, it also returns a future that is waited on synchronously, so the waiting thread is still not freed.

In the end, the service command invocation does something like `handleRequest().get()`. No matter how asynchronous the processing is, the `get()` still blocks the thread.
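As an illustration, here is a minimal standalone sketch of the two patterns. `handleRequestAsync` and the standard-library types are stand-ins for the server's `handleRequest()` and Future types, not actual server code:

```cpp
#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>

// Stand-in for an asynchronous handleRequest(); the real return type
// (Future<DbResponse>) is replaced with std::string for brevity.
std::future<std::string> handleRequestAsync() {
    return std::async(std::launch::async, [] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));  // simulated migration blocker
        return std::string("response");
    });
}

int main() {
    // Today's pattern: however asynchronous the body is, get() parks the
    // calling worker thread until the result is ready.
    std::string r = handleRequestAsync().get();

    // Continuation pattern: the wait moves off the caller, which is free
    // to pick up the next request immediately.
    std::thread continuation([f = handleRequestAsync()]() mutable {
        std::cout << "done: " << f.get() << '\n';
    });
    std::cout << "caller freed; first result was: " << r << '\n';
    continuation.join();
}
```

The first call ties up the caller for the full duration of the simulated blocker; the second frees it immediately, at the cost of some other thread absorbing the wait.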

This ticket is to discuss what we can do about that.



 Comments   
Comment by Andrew Shuvalov (Inactive) [ 07/Nov/22 ]

This presentation explored the topic. I think there is an easy path forward.

Comment by Andrew Shuvalov (Inactive) [ 14/Jan/21 ]

We had a discussion with amirsaman.memaripour and matthew.saltz about the problem of blocking code on the request path. At this point we have several instances of blocking (e.g. authentication), which is one of the roadblocks to enabling the new threading model in production. The new threading model is a capped processing thread pool; the old model is a thread per connection, which is definitely not scalable and hard to control. A toy sketch of both models follows.
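The `ThreadPool` class below is illustrative, not the server's ServiceExecutor, and is reused by the sketches in later comments:

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Old model: one OS thread per connection. Simple, but thread count
// grows with the number of clients and is hard to control.
void serveConnectionOldModel(std::function<void()> session) {
    std::thread(std::move(session)).detach();
}

// New model: a capped pool; connections enqueue work instead of owning
// a thread. A blocking task now stalls 1/n of total capacity.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            _workers.emplace_back([this] { run(); });
    }
    ~ThreadPool() {
        {
            std::lock_guard lk(_mu);
            _done = true;
        }
        _cv.notify_all();
        for (auto& w : _workers)
            w.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard lk(_mu);
            _queue.push_back(std::move(task));
        }
        _cv.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock lk(_mu);
                _cv.wait(lk, [this] { return _done || !_queue.empty(); });
                if (_done && _queue.empty())
                    return;
                task = std::move(_queue.front());
                _queue.pop_front();
            }
            task();
        }
    }
    std::mutex _mu;
    std::condition_variable _cv;
    std::deque<std::function<void()>> _queue;
    std::vector<std::thread> _workers;
    bool _done = false;
};
```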

The main problem with more asynchronous processing is the performance penalty. The refactoring already done is believed to cost 5% in latency. At the same time, every non-asynchronous block carries the danger of a live-lock, where we may run out of threads for new requests.

To this I should comment that, in general, breaking a continuation and rescheduling it back to the pool should cost about ~1 microsecond if the pool is empty and there is no additional lock contention. Any latency on top of this simply means the pool was not empty. The penalty is especially high if some of the requests already scheduled to the pool will eventually time out or be throttled.
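A rough way to sanity-check that number is a round-trip micro-benchmark against an idle pool. This reuses the toy `ThreadPool` sketched in the previous comment; the absolute figures are machine-dependent, and the promise/future round trip somewhat overstates the cost of a one-way reschedule:

```cpp
#include <chrono>
#include <future>
#include <iostream>

int main() {
    ThreadPool pool(1);  // idle single-thread pool: no queueing, no contention
    constexpr int kIters = 100000;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kIters; ++i) {
        std::promise<void> done;
        pool.post([&] { done.set_value(); });  // the "continuation" is a no-op
        done.get_future().wait();              // wait for the hop to complete
    }
    auto elapsed = std::chrono::steady_clock::now() - start;
    std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count() / kIters
              << " ns per reschedule round trip\n";
}
```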

This leads to the topic of insufficient user isolation. The final goal of user isolation is:

  1. Every user receives a fair schedule (the definition of fairness should live outside the scheduler implementation and be defined/coded/configured separately)
  2. Every request admitted to the pool is guaranteed to have sufficient resources for execution, meaning it will not be aborted, throttled, or timed out. A request that is likely to time out anyway should be throttled back to the user immediately

If those conditions are met, rescheduling a continuation back to the pool does not affect the average latency of the system; it simply moves a quantum of latency from one request to another. When a continuation is rescheduled, whatever request is currently at the front of the queue gets an immediate performance boost.

When we implement user isolation properly, we either account for (and limit) the threads used by each user, in which case we can block when we have to, or we account for how many pending continuations each user already has in the queue. I still see the second variant as the better one, because allocating threads per user is very inflexible; a sketch follows.
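A minimal sketch of that second variant; the names (UserQuota, tryAdmit) and the fixed per-user cap are hypothetical, not existing server code:

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Caps pending continuations per user instead of dedicating threads.
class UserQuota {
public:
    explicit UserQuota(std::size_t maxPending) : _maxPending(maxPending) {}

    // Called before posting a continuation to the pool; false means
    // "throttle back to the user immediately" rather than letting the
    // request sit in the queue until it times out.
    bool tryAdmit(const std::string& user) {
        std::lock_guard lk(_mu);
        auto& pending = _pending[user];
        if (pending >= _maxPending)
            return false;
        ++pending;
        return true;
    }

    // Called by the continuation once it has run.
    void release(const std::string& user) {
        std::lock_guard lk(_mu);
        --_pending[user];
    }

private:
    std::mutex _mu;
    std::unordered_map<std::string, std::size_t> _pending;
    const std::size_t _maxPending;
};
```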

About converting the code in service_entry_point_common.cpp from Future to SemiFuture: my understanding is that eventually at least some methods will have to be converted, but we should be careful with that. We should reschedule only definitely blocked code, and always avoid non-linear continuations for code that is neither blocking nor waiting on external processing; a sketch of that rule is below. I don't fully understand how exactly to do that, and perhaps matthew.saltz can shed some light. We will also work on my pending changes for https://jira.mongodb.org/browse/SERVER-53505 offline, and perhaps I will understand those futures tricks better once that is finished.
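One way to picture the "reschedule only definitely blocked code" rule, again with standard-library stand-ins for SemiFuture/ExecutorFuture and the toy `ThreadPool` from above; `thenInlineOrReschedule` is a hypothetical helper, not server code:

```cpp
#include <chrono>
#include <future>
#include <memory>
#include <utility>

// If the value is already there, the continuation runs inline on the
// current thread at zero scheduling cost; only a genuinely pending
// result pays for a pool hop.
template <typename T, typename Continuation>
void thenInlineOrReschedule(std::future<T> fut, ThreadPool& pool, Continuation cont) {
    if (fut.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
        cont(fut.get());  // non-blocking path: stay on this thread
    } else {
        // std::function requires copyable callables, so the move-only
        // future is held via shared_ptr for the posted task.
        auto shared = std::make_shared<std::future<T>>(std::move(fut));
        pool.post([shared, cont] { cont(shared->get()); });  // a pool thread absorbs the wait
    }
}
```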

 

Comment by Andrew Shuvalov (Inactive) [ 13/Jan/21 ]

Before discussing possible actions I would like to find the answers to the following questions:

  1. Before a request enters ServiceEntryPoint::handleRequest(), do we already process requests asynchronously enough that threads are not blocked?
  2. Within ServiceEntryPoint::handleRequest(), do we have many blocking code paths besides the tenant migration access blocker?

User isolation questions:

  1. Do we have a centralized point to account for request-related (pending) resource usage for every user? (A blocked, or even an active, thread would be one example of such a temporarily used resource.)

I plan to make a presentation on thread pool usage and related user isolation strategies to discuss how this could be addressed, but first I need to figure out what we already have.

Comment by Andrew Shuvalov (Inactive) [ 13/Jan/21 ]

The problems observed with possible refactoring:

  1. The fan-out was quite significant
  2. I had to use a hack class, InlineExecutor, to convert intermediate SemiFuture results back to ExecutorFuture for continuation (see the sketch after this list)
  3. We have blocking calls like handleRequest().get() that block the thread even when the actual processing is converted to asynchronous
  4. It also requires changing the handleRequest() method in src/mongo/transport/service_entry_point.h, which makes refactoring all derived classes necessary
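For reference, the essence of an inline executor is small enough to sketch generically; this is an illustration of the concept, not the actual hack class:

```cpp
#include <functional>

// schedule() runs the task immediately on the calling thread, so a
// continuation bound to it never actually changes threads. Convenient
// for gluing a SemiFuture-style result back into an executor-based
// chain, but it silently turns an asynchronous continuation back into
// inline, possibly blocking, code, which is why it is called a hack.
struct InlineExecutor {
    void schedule(std::function<void()> task) {
        task();  // no queue, no thread switch
    }
};
```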