[SERVER-34726] Deadlock with locally stashed transaction resources during profiling Created: 27/Apr/18 Updated: 29/Oct/23 Resolved: 18/May/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.0-rc0 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Ian Boros | Assignee: | Tess Avitabile (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Sprint: | Repl 2018-05-21, Repl 2018-06-04 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Linked BF Score: | 63 | ||||||||||||||||
| Description |
|
If a dropDatabase is run concurrently with a query operation that's part of a transaction, a deadlock may occur if the query operation attempts to write profiling information. Consider the following situation: There are two threads, T1 and T2. T1 needs to have profiling enabled. Here's a timeline of what could happen: -T1 starts an operation as part of a transaction and acquires its first lock In profile() the OperationContext's transaction resources are stored in a temporary local variable, and the OperationContext is given a new RecoveryUnit and Locker. With the new Locker, profile() requests an IX lock on the system.profile collection. The new lock request is enqueued behind the dropDatabase's X lock and is not granted. This leaves us with the following lock cycle: The lock requested by profile() in T1 cannot be granted until the drop (from T2) has finished. The X lock requested by T2 cannot be granted until the locks in the transaction resources stored locally in T1 are released. The transaction resources stored in a local variable cannot be moved back to the OperationContext until profile() has completed. Since T1 is deadlocked and its transaction resources are not associated with the session, they cannot be freed (even by the transaction killer or session killer thread). james.wahlin and I believe that one way to fix this would be to stash the TxnResources back on the session instead of in a local variable when shouldDBProfile returns true. This way, the transaction killer or session killer would be able to free the TxnResources. |
| Comments |
| Comment by Githook User [ 18/May/18 ] |
|
Author: {'username': 'tessavitabile', 'name': 'Tess Avitabile', 'email': 'tess.avitabile@mongodb.com'}Message: |
| Comment by Tess Avitabile (Inactive) [ 18/May/18 ] |
|
alyson.cabral, for this ticket we will make lock acquisition for profiling transaction operations have 0-second timeout, just like all other lock acquisitions in transaction operations. This is because a hang in profiling would prevent the transaction from making progress and could lead to deadlock. If profiling fails, we log a warning but do not abort the transaction. I wanted to make you aware that some transaction operations may not be profiled in 4.0. We could do work in the future to ensure that profiling transaction operations always succeeds (for example, by not doing them in the same thread as the transaction operation). |
| Comment by Tess Avitabile (Inactive) [ 16/May/18 ] |
|
No, this would not affect operation logging. |
| Comment by Bruce Lucas (Inactive) [ 16/May/18 ] |
|
Does this affect only profiling, and not operation logging? Logging operations in progress on stepdown or shutdown (along with time taken so far) is important, but I think not profling an operation interrupted by shutdown or stepdown is fine - in fact I think arguably more correct as the operation hasn't completed so you don't know how long it would actually take if it had. |
| Comment by Tess Avitabile (Inactive) [ 16/May/18 ] |
|
bruce.lucas, as part of this work, we are planning to remove the UninterruptibleLockGuard from profiling. The UninterruptibleLockGuard was there to make sure that we still profile if we exceed maxTimeMS. As a replacement, we will temporarily suspend the maxTimeMS on the operation, so that we will make sure to profile even if the operation exceeded its maxTimeMS. But removing the UninterruptibleLockGuard would mean we would now fail to profile if a stepdown or shutdown occurred. Does this seem acceptable to you? |
| Comment by Spencer Brody (Inactive) [ 01/May/18 ] |
|
Oh hey, somehow I missed that line! Looks good then, thanks. |
| Comment by James Wahlin [ 01/May/18 ] |
|
Profiling is enabled in read_concern_snapshot_yielding.js on this line. I am not sure if we have a repro outside of the failure in BF-8848 (ian.boros - please confirm). |
| Comment by Spencer Brody (Inactive) [ 01/May/18 ] |
|
ian.boros, is there a repro for this? Did this fail in evergreen? The linked BF is on read_concern_snapshot_yielding.js, which doesn't have profiling enabled, so I don't see how it could trigger this issue? |
| Comment by Spencer Brody (Inactive) [ 27/Apr/18 ] |
|
I think another way to fix this would be to make sure that the lock acquisitions done as part of profiling follow has the same 0 second lock acquisition timeout that |
| Comment by Ian Boros [ 27/Apr/18 ] |
|
If the suggested approach is taken we should be sure not to set the Locker's thread ID to "unset" when putting the TxnResources back on the session. This will make the LockManager dump output more readable. We would also need to add checks after profile() is called to see if the TxnResources are gone. In this case, we would have to return an error to the user, indicating that the transaction had been aborted. |