[SERVER-50371] Make a version of TaskExecutor::scheduleRemoteCommand that returns a future Created: 18/Aug/20  Updated: 01/Oct/20  Resolved: 01/Oct/20

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Matthew Saltz (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-50658 Add support for CancelationTokens to ... Closed
Related
is related to SERVER-50342 Make version of Shard::runCommand tha... Open
is related to SERVER-50341 Make generic whenAll, whenAny, collec... Closed
Sprint: Sharding 2020-10-19
Participants:

 Description   

Currently, TaskExecutor only has a scheduleRemoteCommand interface that takes a callback and returns a callback handle.

As we are structuring more code as a future chain that runs on an executor with a limited number of threads (particularly Primary-Only Service instances), it will be very helpful to have a version of scheduleRemoteCommand that returns a Future.

This, combined with the generic whenAll/whenAny/collect utilities described in SERVER-50341, would be very useful in writing a Primary-Only Service which has steps that send a command to many remote nodes and wait for all of their responses.

Note that SERVER-50342, which is to add a version of Shard::runCommand that returns a Future, would help in sharding code, but would not help for code that runs in standalone replica sets, such as the Primary-Only Services being built for Multitenant Migrations for Serverless (PM-1791).

While this is not an immediately blocking work on PM-1791, since we can work around it by blocking the thread on the executor while waiting for a network response, it does block releasing PM-1791.



 Comments   
Comment by Matthew Saltz (Inactive) [ 01/Oct/20 ]

Doing this work as part of SERVER-50658

Comment by Matthew Saltz (Inactive) [ 24/Sep/20 ]

This particular ticket is really only about an API change. Over the past year or so we've been gradually transitioning from callback-style async programming to using futures, and this ticket is part of the push to really make the codebase futures-friendly.

Comment by Andrew Shuvalov (Inactive) [ 23/Sep/20 ]

Hi matthew.saltz, I was looking at this and let me pass on some trivial observations on this bug:

  1. The std::future interface is clearly missing here but otherwise we should not add more logic to this class, it has enough of responsibilities.
  2. What I would like is to make the scheduleRemoteCommand() to be able to return an immediate failure indicating that the internal queue is full and cannot schedule any more jobs. For this the '
    ThreadPoolTaskExecutor::scheduleIntoPool_inlock()' needs to be refactored from void to Status. 
     The reason for this is when we move to multi-tenant, there will be a need for proper user isolation. There should be no logic related to user isolation in the ThreadPoolTaskExecutor for the reason this logic could be quite complex and different depending on a particular task. The only thing the class that is handling user isolation (an AdmissionController if you will) needs to know is that the Executor is full. This will trigger the logic to penalize the most abusive users and make their jobs to be either canceled or penalized. One good way to do that could be a 'token bucket' that we can discuss separately. The ThreadPoolTaskExecutor should not know anything about users and job priorities, it should have a relatively short queue, much shorter than the per-user queues at the AdmissionController.
Comment by Matthew Saltz (Inactive) [ 23/Sep/20 ]

Assigning myself to this as it looks like it'll make sense to fold the work into SERVER-50658

Comment by Esha Maharishi (Inactive) [ 21/Aug/20 ]

Ah, thanks haley.connelly, that's a fairly concise workaround to simulate a Future-returning scheduleRemoteCommand. We can use this pattern to avoid blocking the thread.

However, I would still advocate for adding a Future-returning scheduleRemoteCommand sooner rather than later since I imagine most of the Primary-Only Services being built right now would benefit from it.

Comment by Haley Connelly [ 19/Aug/20 ]

May be worth noting, we essentially do this inside StatusWith<BSONObj> ReplicaSetNodeProcessInterface::_executeCommandOnPrimary and could extract it into its own method.

Generated at Thu Feb 08 05:22:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.