[SERVER-54510] Bind ClientStrand before destroying its opCtx Created: 12/Feb/21  Updated: 29/Oct/23  Resolved: 22/Jun/21

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 5.0.0-rc4, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Amirsaman Memaripour Assignee: George Wangensteen
Resolution: Fixed Votes: 0
Labels: post-rc0, servicearch-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-53993 Attach client strand before releasing... Closed
is related to SERVER-50141 ServiceStateMachineTest needs to be r... Closed
is related to SERVER-51015 Rewrite ServiceExecutor interface to ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: Service Arch 2021-06-14, Service Arch 2021-06-28
Participants:
Linked BF Score: 144
Story Points: 4

 Description   

We create a new opCtx to serve each client request (see here). This opCtx is later destroyed in the tail of a future-continuation constructed in ServiceStateMachine::Impl::startNewLoop (see here). The destruction of opCtx, however, may run on a thread other than the one that has the Client object attached, causing read-after-free race conditions.

Future<void> ServiceStateMachine::Impl::processMessage() {
    ...
    _opCtx = Client::getCurrent()->makeOperationContext();
    ...
}

.getAsync([this](Status status) {
    // We may or may not have an operation context, but it should definitely be gone now.
    _opCtx.reset();
    ...
});

Similarly, the opCtx used to run a command is killed after returning from handleRequest (see here). This may also run on a thread other than the one holding the client lock, resulting in data races. We should ensure the operation is only killed on a thread that owns its corresponding client object.

Acceptance criteria: ensure the opCtx is always killed/destructed while having its Client attached to the current thread (e.g., by binding the ClientStrand). The performance impact of the changes must also be evaluated (using sys-perf), as the change is on the critical path of command execution.



 Comments   
Comment by Vivian Ge (Inactive) [ 06/Oct/21 ]

Updating the fixversion since branching activities occurred yesterday. This ticket will be in rc0 when it’s been triggered. For more active release information, please keep an eye on #server-release. Thank you!

Comment by Githook User [ 23/Jun/21 ]

Author:

{'name': 'George Wangensteen', 'email': 'george.wangensteen@mongodb.com', 'username': 'gewa24'}

Message: SERVER-54510 Ensure client is bound to thread destroying opCtx in transport tests

(cherry picked from commit 89ed94739e87e5378f663c196a16790feb95abb3)
Branch: v5.0
https://github.com/mongodb/mongo/commit/0a1b34e2da06434d3fe7874575356c5be5cabaf9

Comment by Githook User [ 21/Jun/21 ]

Author:

{'name': 'George Wangensteen', 'email': 'george.wangensteen@mongodb.com', 'username': 'gewa24'}

Message: SERVER-54510 Ensure client is bound to thread destroying opCtx in transport tests
Branch: master
https://github.com/mongodb/mongo/commit/89ed94739e87e5378f663c196a16790feb95abb3

Comment by Benjamin Caimano (Inactive) [ 16/Feb/21 ]

This might be a bit tough to phrase since it runs into the classic async locking problem. I can see two options for how to do it:

  • Make ClientStrand::bind() increment a counter here to achieve reentrancy. While we normally look down on reentrancy, however we already are dealing with thread locals.
  • We can schedule the release onto the ServiceExecutor, but this might run into the ServiceExecutorSynchronous out-of-line scheduling issue. It might be time to do SERVER-51015 if this is the option we want to pursue.
Generated at Thu Feb 08 05:33:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.