[SERVER-37261] CurrentOp shouldn't read OpCtx decorations Created: 21/Sep/18 Updated: 08/Jan/24 Resolved: 08/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Vesselina Ratcheva (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Operating System: | ALL | ||||||||||||
| Sprint: | Repl 2018-10-22, Repl 2018-11-05, Repl 2018-11-19 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 2 | ||||||||||||
| Description |
|
The currentOp command loops over all Clients, and for any that have an associated OpCtx and OperationContextSession it reads the ReadConcernArgs decoration off the OpCtx. This isn't valid as there's no synchronization around writing to the decoration other than that it can only happen from the thread running the OpCtx. |
| Comments |
| Comment by Vesselina Ratcheva (Inactive) [ 08/Nov/18 ] |
|
Closing this, as it's completed. Feel free to file additional ticket(s) as you see fit. |
| Comment by Githook User [ 08/Nov/18 ] |
|
Author: {'name': 'Vesselina Ratcheva', 'email': 'vesselina.ratcheva@10gen.com', 'username': 'vessy-mongodb'}Message: |
| Comment by Tess Avitabile (Inactive) [ 07/Nov/18 ] |
|
Not at all, I think it's a good idea to think about how we can handle the concurrency of decorations better. |
| Comment by Kaloian Manassiev [ 07/Nov/18 ] |
|
Yes, yes - I didn't mean to derail this change now (already approved the CR). |
| Comment by Tess Avitabile (Inactive) [ 07/Nov/18 ] |
|
kaloian.manassiev, would it be alright with you if vesselina.ratcheva pushes her changes to take the client lock when setting the ReadConcernArgs decoration, in order to fix the failure? We can file a ticket with the proposed refactor. |
| Comment by Andy Schwerin [ 07/Nov/18 ] |
|
Good point, Tess. |
| Comment by Tess Avitabile (Inactive) [ 07/Nov/18 ] |
|
It would be preferable if the thread that owns the OperationContext does not need to hold the Client lock in order to read the decorations--it should only need the Client lock in order to modify them. |
| Comment by Andy Schwerin [ 07/Nov/18 ] |
|
The getter would only accept unique_lock<>, but yeah, I was thinking On Wed, Nov 7, 2018, 10:25 AM Kaloian Manassiev (Jira) <jira@mongodb.org> |
| Comment by Kaloian Manassiev [ 07/Nov/18 ] |
Actually I like this idea. So perhaps decorations that need to be synchronized on the client would derive from a separate "SynchronizedDecoration" class, whose get method also accepts a unique_lock<T>& of some parent type. Is this what you meant? |
| Comment by Andy Schwerin [ 07/Nov/18 ] |
|
I don’t think it makes sense for all decorations to have internal concurrency control and its incumbent overhead when only a few are accessed off-thread. Even if the overhead were free, I don’t think it’s wise to assume that separately synchronizing a bunch of individual decorations will lead to correct behavior - some decorations may change in concert with each other, e.g. I could imagine having the off-thread decorations somehow kept separately and in a way where they could only be accessed by threads that held the Client lock, though. The decorations that were intended to be accessed in that way would be accessed through getters that took const unique_lock<Client>& or similar instead of Client*, thus enforcing that the caller held the Client lock when accessing them. |
| Comment by Kaloian Manassiev [ 07/Nov/18 ] |
|
So I looked at the code review for this and while it is correct as implemented, using the client lock to protect access/modifications to the decorations of OperationContext is not a very future-proof way to solve this. There are probably 50 decorations on the OpCtx now and we managed to catch this one, but who knows what the issues are with the other. In my opinion, it should be the responsibility of the decorations to have internal concurrency control of their own if they can be read from outside the OperationContext's thread and the meaning of the client mutex should be to just ensure that the Client and the OperationContext object won't disappear from underneath whoever is reading it. Does it make sense to bake this mechanism in the Decoration class itself (or at least the one for OperationContext), where the get method returns a locked ScopedObject (with the correct operators so we don't have to retype every use)? |
| Comment by Siyuan Zhou [ 31/Oct/18 ] |
|
Even if we cache the read concern in TransactionParticipant, the access to it still has to be synchronized, maybe under the transaction participant's mutex. The client's mutex is designed to protect the concurrent access of operation context, so it's very reasonable to follow this pattern unless it doesn't work. |
| Comment by Siyuan Zhou [ 30/Oct/18 ] |
|
Discussed with mira.carey@mongodb.com, we already acquire the client lock when reporting currentOp. We should protect the two places where we set the read concern in service_entry_point_common.cpp with the client lock. We probably need to change the title too. |
| Comment by William Schultz (Inactive) [ 21/Sep/18 ] |
|
As per my comment on BF-10109, for this specific case it may be simpler to just store the ReadConcernArgs on the Session/TransactionParticipant object and keep it there, so we can always have access to them, instead of worrying about them going back and forth between the OperationContext and the Session/TransactionParticipant, on every stash/unstash event. That way we wouldn't have to worry about modifying the locking rules for the ReadConcernArgs decoration. |
| Comment by Andy Schwerin [ 21/Sep/18 ] |
|
If we want to read the read concern, we could start acquiring the client lock before setting it. I don’t know where we’d document that rule, but it’s the standard way we deal with data that’s read from currentOp. |