[SERVER-35217] killSessions command attempts to kill a transaction while holding SessionCatalog::_mutex, which leads to deadlock
Created: 25/May/18  Updated: 29/Oct/23  Resolved: 08/Jun/18

Status: Closed
Project: Core Server
Component/s: Concurrency, Replication
Affects Version/s: 4.0.0-rc0
Fix Version/s: 4.0.0-rc5, 4.1.1

Type: Bug
Priority: Critical - P2
Reporter: Max Hirschhorn
Assignee: Tess Avitabile (Inactive)
Resolution: Fixed
Votes: 0
Labels: bkp, disabled-test, todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-35770 Running a multi-statement transaction... Closed
related to SERVER-42550 Complete TODO listed in SERVER-35217 Closed
is related to SERVER-34781 Abandoned cursor in a transaction can... Closed
is related to SERVER-34732 collection drop hangs in test of tran... Closed
is related to SERVER-34795 killSessions should kill transactions... Closed
is related to SERVER-34779 Check the dbhash periodically in a ne... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0
Sprint: Repl 2018-06-18
Participants:

 Description   

Attempting to acquire a LockManager lock while holding a mutex is prone to deadlock. The SessionCatalog::scanSessions() function locks the session catalog and executes an arbitrary function on any matching sessions. For the "killSessions" command, this means that while holding the SessionCatalog::_mutex, it calls CursorManager::killAllCursorsForTransaction(), which for a find cursor attempts to acquire the collection lock in CursorManager::withCursorManager(). Since the killAllExpiredTransactions() function also calls SessionCatalog::scanSessions(), it isn't possible to reap expired transactions while the "killSessions" command is running. It similarly isn't possible to commit or abort a transaction while the "killSessions" command is running because SessionCatalog::checkOutSession() must acquire the SessionCatalog::_mutex.
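The hazard described above can be illustrated with a small, self-contained sketch. It is not MongoDB code: ToySessionCatalog, TracedLock, scanHoldingMutex, and scanAfterSnapshot are hypothetical names invented for this illustration. The first scan runs the worker inside the critical section, which is the shape that lets a blocking worker stall every other user of the mutex. The second copies the matching ids under the mutex and runs the worker after releasing it, which is one generic way to avoid that shape. Note that the resolution recorded in the comments on this ticket took a different route (no longer killing client cursors when a transaction ends).

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical helper: a lock guard that records when the mutex is taken
// and released, so the ordering relative to the worker can be inspected.
struct TracedLock {
    TracedLock(std::mutex& m, std::vector<std::string>& trace)
        : _lk(m), _trace(trace) {
        _trace.push_back("lock");
    }
    ~TracedLock() { _trace.push_back("unlock"); }

    std::lock_guard<std::mutex> _lk;   // acquired before "lock" is recorded
    std::vector<std::string>& _trace;
};

using Worker = std::function<void(const std::string&)>;

// Hypothetical stand-in for a session catalog; not the real SessionCatalog.
class ToySessionCatalog {
public:
    void add(const std::string& id) {
        std::lock_guard<std::mutex> lk(_mutex);
        _sessionIds.push_back(id);
    }

    // Hazardous shape: the worker runs while _mutex is held. If the worker
    // blocks on another lock, everything else needing _mutex waits too.
    std::vector<std::string> scanHoldingMutex(const Worker& worker) {
        std::vector<std::string> trace;
        {
            TracedLock lk(_mutex, trace);
            for (const auto& id : _sessionIds) {
                trace.push_back("worker:" + id);
                worker(id);
            }
        }
        return trace;
    }

    // Alternative shape: snapshot the ids under the mutex, release it,
    // then run the worker outside the critical section.
    std::vector<std::string> scanAfterSnapshot(const Worker& worker) {
        std::vector<std::string> trace;
        std::vector<std::string> snapshot;
        {
            TracedLock lk(_mutex, trace);
            snapshot = _sessionIds;
        }
        for (const auto& id : snapshot) {
            trace.push_back("worker:" + id);
            worker(id);
        }
        return trace;
    }

private:
    std::mutex _mutex;
    std::vector<std::string> _sessionIds;
};
```

In the returned trace, the worker entry falls between "lock" and "unlock" for scanHoldingMutex and after "unlock" for scanAfterSnapshot. The snapshot shape trades the stall for a different concern: a session may change state between the snapshot and the worker call, so a real implementation would need to revalidate it.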



 Comments   
Comment by Githook User [ 08/Jun/18 ]

Author: Tess Avitabile (tessavitabile) <tess.avitabile@mongodb.com>

Message: SERVER-35217 Do not kill associated client cursors when transaction ends

(cherry picked from commit c0273c0a89e73ecfd7848fd4906a8e3c2d5886b9)
Branch: v4.0
https://github.com/mongodb/mongo/commit/ad984b21834dfd2c46ad1d32882d6461f20e9d08

Comment by Githook User [ 08/Jun/18 ]

Author: Tess Avitabile (tessavitabile) <tess.avitabile@mongodb.com>

Message: SERVER-35217 Do not kill associated client cursors when transaction ends
Branch: master
https://github.com/mongodb/mongo/commit/c0273c0a89e73ecfd7848fd4906a8e3c2d5886b9

Comment by Tess Avitabile (Inactive) [ 06/Jun/18 ]

To address this issue, we will no longer kill associated client cursors when a transaction aborts or commits. That is, we will revert the work done in SERVER-33690 to kill client cursors. There is no need to kill client cursors when a transaction aborts or commits, because drivers issue a killCursors command when a cursor is closed or goes out of scope. Drivers perform this cleanup regardless of whether the cursor was opened as part of a transaction. The killCursors command does not need to be sent with the same lsid or txnNumber as the command that created the cursor. If killCursors is not sent, we rely on our usual fallbacks of cursor timeout and session expiration to clean up the client cursor. There is no risk of using a cursor after the transaction ends, since cursors may only be iterated in the transaction in which they were created (SERVER-33367).
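As a rough illustration of the two rules this reasoning relies on, here is a toy model. It is not the server's implementation: TxnId, ToyCursorRegistry, and their methods are hypothetical names, and cursor timeout and session expiration are left out. The sketch assumes a cursor remembers the lsid and txnNumber it was created under, that iteration is rejected from any other transaction, and that killing a cursor does not check either value.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical identifier for a transaction: session id plus txn number.
struct TxnId {
    std::string lsid;
    std::int64_t txnNumber;

    bool operator==(const TxnId& other) const {
        return lsid == other.lsid && txnNumber == other.txnNumber;
    }
};

// Hypothetical registry of open client cursors; not the real CursorManager.
class ToyCursorRegistry {
public:
    // Opens a cursor and remembers which transaction created it.
    std::int64_t open(const TxnId& owner) {
        const std::int64_t id = _nextId++;
        _owners.emplace(id, owner);
        return id;
    }

    // Iteration is allowed only from the transaction that created the
    // cursor, so a leftover cursor cannot be used once that transaction ends.
    bool getMore(std::int64_t cursorId, const TxnId& caller) const {
        auto it = _owners.find(cursorId);
        return it != _owners.end() && it->second == caller;
    }

    // Killing a cursor does not look at lsid or txnNumber, so a driver can
    // clean up after the transaction has committed or aborted.
    bool killCursors(std::int64_t cursorId) {
        return _owners.erase(cursorId) == 1;
    }

private:
    std::int64_t _nextId = 1;
    std::map<std::int64_t, TxnId> _owners;
};
```

Under these assumptions, a cursor left open by a finished transaction is unusable from later transactions but still removable, which is why ending the transaction does not itself need to kill it.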

The decision was made with david.storch, shane.harvey, and jesse.

Generated at Thu Feb 08 04:39:11 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.