Core Server / SERVER-71381

ReservedServiceExecutor: actively recover from spawn failure

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Operating System: ALL
    • Sprint: Service Arch 2022-12-12

      ServiceExecutorReserved is designed to tolerate spawn failures.

      Spawn failures are transient errors: the OS can fail to spawn threads for a period of time and later regain the ability to spawn.

      In the pre-SERVER-70151 ServiceExecutorReserved, schedule() calls would place tasks on a single shared queue and attempt to spawn a thread, but did not depend on that spawn succeeding.

      If there is a period in which spawns fail, the reserved service executor can still hand incoming scheduled tasks to its established pool of reserved workers. When an idle worker starts a loop iteration by receiving a task, it spawns a new worker to replace itself if the worker count is below the reserve quota. When it completes a chain of tasks (now called a lease), it decides whether to exit or merely go idle, again considering the reserve quota. So workers reproduce only when embarking on a task chain.
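
      A minimal sketch of that worker policy, assuming a hypothetical ReservedPool class (the names, locking, and single-task leases are illustrative simplifications, not the actual ServiceExecutorReserved implementation):

      #include <condition_variable>
      #include <deque>
      #include <functional>
      #include <mutex>
      #include <thread>

      class ReservedPool {
      public:
          explicit ReservedPool(size_t reserve) : _reserve(reserve) {
              for (size_t i = 0; i < _reserve; ++i)
                  _trySpawn();  // best-effort: a failure here is tolerated
          }

          // schedule() only enqueues; it never depends on a spawn succeeding.
          void schedule(std::function<void()> task) {
              {
                  std::lock_guard lk(_mu);
                  _queue.push_back(std::move(task));
              }
              _cv.notify_one();
          }

      private:
          bool _trySpawn() {
              {
                  std::lock_guard lk(_mu);
                  ++_workers;  // optimistically count the new worker
              }
              try {
                  std::thread([this] { _workerLoop(); }).detach();
                  return true;
              } catch (const std::system_error&) {
                  std::lock_guard lk(_mu);
                  --_workers;  // the OS refused to spawn; roll back the count
                  return false;
              }
          }

          void _workerLoop() {
              for (;;) {
                  std::function<void()> task;
                  {
                      std::unique_lock lk(_mu);
                      _cv.wait(lk, [&] { return !_queue.empty(); });
                      task = std::move(_queue.front());
                      _queue.pop_front();
                  }

                  // Embarking on a task chain (a "lease"): spawn a replacement
                  // only now, and only if the pool is below its reserve quota.
                  bool belowQuota;
                  {
                      std::lock_guard lk(_mu);
                      belowQuota = _workers < _reserve;
                  }
                  if (belowQuota)
                      _trySpawn();

                  task();  // run the lease (simplified here to a single task)

                  // Lease finished: die if above quota, otherwise stay idle.
                  std::lock_guard lk(_mu);
                  if (_workers > _reserve) {
                      --_workers;
                      return;
                  }
              }
          }

          std::mutex _mu;
          std::condition_variable _cv;
          std::deque<std::function<void()>> _queue;
          size_t _workers = 0;
          const size_t _reserve;
      };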

      The problem:

      If spawns fail and the reserve is exhausted, tasks will queue up. Suppose the OS then recovers and spawns become possible again. The reserved service executor would only find out about it when a reserve thread finishes its task chain, goes idle, and spawns.

      Review of SERVER-70151 discovered this problem but fixing it was out of scope.

      Some kind of spawn retry loop, initiated when spawn failures occur, would probably mitigate the issue.
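
      One possible shape for that retry loop, as a sketch only (spawnRetryLoop and its backoff parameters are hypothetical, not a committed design; trySpawn stands in for whatever best-effort spawn primitive the executor already has):

      #include <algorithm>
      #include <chrono>
      #include <functional>
      #include <thread>

      // Invoked when a spawn failure is first observed. It keeps retrying on a
      // capped exponential backoff until a spawn succeeds, so the executor
      // actively discovers that the OS has recovered instead of waiting for a
      // reserve worker to finish its lease and go idle.
      void spawnRetryLoop(const std::function<bool()>& trySpawn) {
          auto delay = std::chrono::milliseconds(10);
          while (!trySpawn()) {
              std::this_thread::sleep_for(delay);
              delay = std::min(delay * 2, std::chrono::milliseconds(1000));
          }
      }

      Note that such a retry loop would have to run on a thread that already exists (for example, a reserve worker or a background monitor), since spawning a dedicated retry thread would be subject to the same failure it is trying to recover from.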

            Assignee: Billy Donahue (billy.donahue@mongodb.com)
            Reporter: Billy Donahue (billy.donahue@mongodb.com)
            Votes: 0
            Watchers: 2
