[SERVER-59673] Investigate better solutions for fixing the deadlock issue in profiling Created: 30/Aug/21  Updated: 29/Oct/23  Resolved: 02/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Wenbin Zhu Assignee: Jordi Olivares Provencio
Resolution: Fixed Votes: 0
Labels: techdebt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-71536 Investigate if non-replicated collect... Backlog
Problem/Incident
Related
related to SERVER-63143 Operation can be interrupted by maxTi... Closed
related to SERVER-71786 Move profiling writes into a profile ... Backlog
is related to SERVER-59226 Deadlock when stepping down with a pr... Closed
is related to SERVER-71440 Remove OpCtx::setIgnoreInterruptsExce... Closed
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2022-05-02, Execution Team 2022-05-16, Execution Team 2022-10-03, Execution Team 2022-10-17, Execution Team 2022-10-31, Execution Team 2022-11-14, Execution Team 2022-11-28, Execution Team 2022-12-12
Participants:
Linked BF Score: 145

 Description   

SERVER-59226 discovered a deadlock between the profiling operation and replication state changes. This ticket tracks investigating other proposed solutions that are more generic and involve the Execution team, in order to remove the layering violation introduced by the original fix.

One of the proposed solutions is to avoid acquiring the RSTL lock for profile(), or, more generally, for all non-replicated writes. We should first investigate whether RSTL acquisition can be safely removed for non-replicated writes and figure out how to remove it, probably by using a new RAII type.

Another solution is to do the profiling in a separate thread or a dedicated thread pool. This approach may need async work queues, and we need to investigate what to do when there are more profile requests than the queue can hold.

Also, as a follow-up, we might want to examine UninterruptibleLockGuard use cases and see whether they can be improved, since we have recently seen an increasing number of issues caused by it.
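The dedicated-thread idea with a bounded async work queue could be sketched roughly as below. Everything here (the ProfileWriter name, the ProfileEntry shape, the drop-on-full back-pressure policy) is illustrative and std-only, not the server's actual implementation:

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Illustrative profile entry; the real server would carry much more state.
struct ProfileEntry {
    std::string ns;    // namespace the profiled operation ran against
    long long micros;  // operation duration
};

// A dedicated writer thread consuming a bounded queue. When the queue is
// full, new entries are dropped rather than blocking the profiled operation.
class ProfileWriter {
public:
    explicit ProfileWriter(size_t capacity) : _capacity(capacity) {
        _worker = std::thread([this] { _run(); });
    }

    ~ProfileWriter() { shutdown(); }

    // Returns false if the entry was dropped because the queue was full.
    bool enqueue(ProfileEntry entry) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (_queue.size() >= _capacity)
                return false;  // back-pressure policy: drop, never block
            _queue.push_back(std::move(entry));
        }
        _cv.notify_one();
        return true;
    }

    // Drains remaining entries, then stops the worker. Idempotent.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            if (_shutdown)
                return;
            _shutdown = true;
        }
        _cv.notify_one();
        _worker.join();
    }

    size_t written() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _written.size();
    }

private:
    void _run() {
        std::unique_lock<std::mutex> lk(_mutex);
        for (;;) {
            _cv.wait(lk, [this] { return _shutdown || !_queue.empty(); });
            while (!_queue.empty()) {
                ProfileEntry e = std::move(_queue.front());
                _queue.pop_front();
                lk.unlock();
                // Real code would insert into system.profile here; this
                // sketch just records the completed write.
                lk.lock();
                _written.push_back(std::move(e));
            }
            if (_shutdown)
                return;
        }
    }

    mutable std::mutex _mutex;
    std::condition_variable _cv;
    std::deque<ProfileEntry> _queue;
    std::vector<ProfileEntry> _written;
    size_t _capacity;
    bool _shutdown = false;
    std::thread _worker;
};
```

The open question from the description maps onto the `enqueue` overflow branch: dropping is one possible policy; blocking or coalescing entries are others.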



 Comments   
Comment by Githook User [ 02/Dec/22 ]

Author:

Jordi Olivares Provencio <jordi.olivares-provencio@mongodb.com> (jordiolivares)

Message: SERVER-59673 Exclude profiling from taking RSTL lock
Branch: master
https://github.com/mongodb/mongo/commit/8debef920f2151e833d6ba524c13c4c7b89eeebf

Comment by Jordi Olivares Provencio [ 21/Sep/22 ]

One option that is quite simple to implement is to define a list of collections that we know are unaffected by replica set state changes and to skip the RSTL lock for them. Inside AutoGetCollection we can easily check for membership. Initially we would include only the system.profile collection, but the list could easily be expanded in the future.

This option also serves as documentation of which collections are involved in replication.
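A minimal sketch of that allow-list check, assuming a fixed set consulted at lock-acquisition time (the set contents, the `kRstlExemptCollections` name, and the helper name are all illustrative, not the real server code):

```cpp
#include <set>
#include <string>

namespace {
// Collections known to be independent of replica set state changes; a check
// like this could live where AutoGetCollection decides which locks to take.
const std::set<std::string> kRstlExemptCollections = {
    "system.profile",  // profiling writes are local-only, not replicated
    // future additions go here
};
}  // namespace

// Returns true if the RSTL acquisition can be skipped for this collection.
bool canSkipRSTL(const std::string& collName) {
    return kRstlExemptCollections.count(collName) > 0;
}
```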

Comment by Samyukta Lanka [ 20/Sep/21 ]

Even if we can't remove UninterruptibleLockGuard in the near future, we might want to audit its current uses to see if a similar bug is possible with other operations.

Comment by Wenbin Zhu [ 20/Sep/21 ]

Probably not part of this ticket, but maybe also consider filing another ticket to completely remove UninterruptibleLockGuard.

Comment by Andy Schwerin [ 14/Sep/21 ]

We don't know ahead of time which operations we will record the profile information for, and we don't want any lock acquisition costs for the profiling to show up in the profile. The times we care about are exactly when something is taking longer than it should, or when there are unexpected conflicts.


Comment by Louis Williams [ 14/Sep/21 ]

In this case, operations should know that they need to profile ahead of time, and they can acquire all the locks that they need up-front. There should be no consequence of holding a lock on the profile collection for an extended period of time, because there should be no conflicting operations on the profile collection, especially after SERVER-55034.

Comment by Andy Schwerin [ 14/Sep/21 ]

I agree that pattern exists, but I don't think this quite matches the pattern, louis.williams. The cleanup tasks here actually require a completely separate set of locks from the "do something". The "cleanup task" is writing the profile result, which is intentionally considered separate work from the operation itself.

Comment by Gregory Noma [ 14/Sep/21 ]

louis.williams just spitballing: maybe something like registering a callback when taking a lock which will run before the lock is released in those scenarios?
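The spitballed idea could look roughly like the sketch below: an RAII lock holder that lets callers register callbacks which run before the lock is released, so cleanup work (such as a profiling write) still executes under the lock even while an exception unwinds the stack. The `CallbackLockGuard` name and the use of a plain `std::mutex` are illustrative, not the server's lock manager:

```cpp
#include <functional>
#include <mutex>
#include <vector>

class CallbackLockGuard {
public:
    explicit CallbackLockGuard(std::mutex& m) : _lock(m) {}

    // Register work to run while the lock is still held.
    void onRelease(std::function<void()> cb) {
        _callbacks.push_back(std::move(cb));
    }

    ~CallbackLockGuard() {
        // Callbacks run in the destructor body, i.e. before the _lock member
        // is destroyed, so the mutex is still held here. Callbacks must not
        // throw, since this may run during stack unwinding.
        for (auto& cb : _callbacks) {
            cb();
        }
    }  // _lock released only after the callbacks have run

private:
    std::unique_lock<std::mutex> _lock;
    std::vector<std::function<void()>> _callbacks;
};
```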

Comment by Louis Williams [ 14/Sep/21 ]

This is just one example of a common, yet problematic pattern that we have throughout the server:

  • An operation acquires locks
  • The operation is interrupted/times out/conflicts with a state transition. An exception propagates such that the locks are released, but this operation needs to perform some cleanup task that requires a lock.
  • This lock is no longer available because the operation has been interrupted (i.e. cannot acquire more locks) or there is a conflicting operation (i.e. stepDown).

The solutions currently are: 1) block until the lock is available, ignoring interrupts (i.e. UninterruptibleLockGuard), or 2) skip the cleanup, which could leak resources. In the case of this ticket, that would mean not profiling anything.

We need this pattern more often:

  • Acquire locks
  • Do something
  • Perform cleanup with locks
  • Release locks
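The four-step ordering above can be sketched with a scope guard that runs the cleanup while the lock is still held, whether or not the "do something" step throws. This is a minimal std-only illustration under assumed names (`ScopeGuard`, `runWithCleanup`), not the server's types:

```cpp
#include <mutex>
#include <utility>

// Tiny ad-hoc scope guard: runs the stored callable on destruction.
template <typename F>
class ScopeGuard {
public:
    explicit ScopeGuard(F f) : _f(std::move(f)) {}
    ~ScopeGuard() { _f(); }  // cleanup must not throw
private:
    F _f;
};

// 1. Acquire locks; 2. do something; 3. cleanup with locks held; 4. release.
// Locals are destroyed in reverse construction order, so the guard (cleanup)
// runs before lk's destructor (release) -- even when work() throws.
template <typename Work, typename Cleanup>
void runWithCleanup(std::mutex& m, Work work, Cleanup cleanup) {
    std::lock_guard<std::mutex> lk(m);          // step 1
    ScopeGuard<Cleanup> g(std::move(cleanup));  // arms steps 3-then-4
    work();                                     // step 2 (may throw)
}
```

The design choice is that the cleanup is bound to the locked scope itself rather than left to an exception handler that would have to reacquire locks it may no longer be allowed to take.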
Generated at Thu Feb 08 05:47:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.