[SERVER-11302] Deadlock in invalidateUserCache Created: 22/Oct/13  Updated: 11/Jul/16  Resolved: 23/Oct/13

Status: Closed
Project: Core Server
Component/s: Security
Affects Version/s: 2.5.3
Fix Version/s: 2.5.4

Type: Bug Priority: Critical - P2
Reporter: Eric Milkie Assignee: Andy Schwerin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Found by running the parallel test suite.

Participants:

 Description   

AuthorizationManager::logOp currently acquires the logical "fetch mode" lock in the authorization manager, in order to invalidate the cache of user auth information. However, this method is called in contexts where the global or admin database write lock might be held. AuthorizationManager::acquireUser acquires the "fetch mode" lock, and then locks the "admin" database for read when used inside of mongod. This is a classic deadlock, which manifests sometimes in the parallel.js test.

The only reason to hold the "fetch mode" lock while invalidating the auth manager's user cache is to preserve a total ordering over operations that update that cache. One doesn't want an invalidation of the cache to come in while an update is engaged in a disk fetch, and then leave a user object in the cache that is no longer up-to-date.

Proposed solution: keep a 64-bit cache generation counter, incremented on every cache invalidation, and guarded by the CacheGuard construct. If the generation counter changes between the time that an acquireUser or related operation enters fetch mode and the time that it exits fetch mode, the new user object may be pre-invalidated, and omitted from the user cache. That object can still be returned to the caller of acquireUser, with the understanding that it will shortly be abandoned when the caller checks the user's validity.

Stack traces from a sample deadlock, below.

Thread 1:

 	[External Code]	
 	mongod.exe!boost::this_thread::interruptible_wait(void * handle_to_wait_for, boost::detail::timeout target_time) Line 469	C++
 	mongod.exe!boost::detail::basic_condition_variable::do_wait<boost::unique_lock<boost::mutex> >(boost::unique_lock<boost::mutex> & lock, boost::detail::timeout wait_until) Line 218	C++
 	mongod.exe!mongo::AuthorizationManager::CacheGuard::synchronizeWithFetchPhase() Line 172	C++
>	mongod.exe!mongo::AuthorizationManager::logOp(const char * op, const char * ns, const mongo::BSONObj & o, mongo::BSONObj * o2, bool * b, bool fromMigrateUnused, const mongo::BSONObj * fullObjUnused) Line 821	C++
 	mongod.exe!mongo::logOp(const char * opstr, const char * ns, const mongo::BSONObj & obj, mongo::BSONObj * patt, bool * b, bool fromMigrate, const mongo::BSONObj * fullObj) Line 371	C++
 	mongod.exe!mongo::Command::execCommand(mongo::Command * c, mongo::Client & client, int queryOptions, const char * cmdns, mongo::BSONObj & cmdObj, mongo::BSONObjBuilder & result, bool fromRepl) Line 2019	C++
 	mongod.exe!mongo::_runCommands(const char * ns, mongo::BSONObj & _cmdobj, mongo::_BufBuilder<mongo::TrivialAllocator> & b, mongo::BSONObjBuilder & anObjBuilder, bool fromRepl, int queryOptions) Line 2083	C++
 	mongod.exe!mongo::runCommands(const char * ns, mongo::BSONObj & jsobj, mongo::CurOp & curop, mongo::_BufBuilder<mongo::TrivialAllocator> & b, mongo::BSONObjBuilder & anObjBuilder, bool fromRepl, int queryOptions) Line 69	C++
 	mongod.exe!mongo::runQuery(mongo::Message & m, mongo::QueryMessage & q, mongo::CurOp & curop, mongo::Message & result) Line 1064	C++
 	mongod.exe!mongo::receivedQuery(mongo::Client & c, mongo::DbResponse & dbresponse, mongo::Message & m) Line 280	C++
 	mongod.exe!mongo::assembleResponse(mongo::Message & m, mongo::DbResponse & dbresponse, const mongo::HostAndPort & remote) Line 443	C++
 	mongod.exe!mongo::MyMessageHandler::process(mongo::Message & m, mongo::AbstractMessagingPort * port, mongo::LastError * le) Line 199	C++
 	mongod.exe!mongo::PortMessageServer::handleIncomingMsg(void * arg) Line 210	C++
 	mongod.exe!boost::`anonymous namespace'::thread_start_function(void * param) Line 185	C++
 	[External Code]	

Thread 2:

 	[External Code]	
 	mongod.exe!boost::this_thread::interruptible_wait(void * handle_to_wait_for, boost::detail::timeout target_time) Line 469	C++
 	mongod.exe!boost::detail::basic_condition_variable::do_wait<boost::mutex>(boost::mutex & lock, boost::detail::timeout wait_until) Line 218	C++
 	mongod.exe!mongo::QLock::lock_r() Line 184	C++
 	mongod.exe!mongo::Lock::DBRead::lockDB(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & ns) Line 616	C++
 	mongod.exe!mongo::Lock::DBRead::DBRead(const mongo::StringData & ns) Line 633	C++
 	mongod.exe!mongo::Client::ReadContext::ReadContext(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & ns, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & path) Line 284	C++
 	mongod.exe!mongo::AuthzManagerExternalStateMongod::_findUser(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & usersNamespace, const mongo::BSONObj & query, mongo::BSONObj * result) Line 229	C++
 	mongod.exe!mongo::AuthzManagerExternalStateMongod::getUserDescription(const mongo::UserName & userName, mongo::BSONObj * result) Line 117	C++
 	mongod.exe!mongo::AuthorizationManager::getUserDescription(const mongo::UserName & userName, mongo::BSONObj * result) Line 433	C++
 	mongod.exe!mongo::AuthorizationManager::acquireUser(const mongo::UserName & userName, mongo::User * * acquiredUser) Line 465	C++
 	mongod.exe!mongo::CmdAuthenticate::_authenticateCR(const mongo::UserName & user, const mongo::BSONObj & cmdObj) Line 212	C++
 	mongod.exe!mongo::CmdAuthenticate::_authenticate(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & mechanism, const mongo::UserName & user, const mongo::BSONObj & cmdObj) Line 156	C++
 	mongod.exe!mongo::CmdAuthenticate::run(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & dbname, mongo::BSONObj & cmdObj, int __formal, std::basic_string<char,std::char_traits<char>,std::allocator<char> > & errmsg, mongo::BSONObjBuilder & result, bool fromRepl) Line 127	C++
 	mongod.exe!mongo::_execCommand(mongo::Command * c, const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & dbname, mongo::BSONObj & cmdObj, int queryOptions, std::basic_string<char,std::char_traits<char>,std::allocator<char> > & errmsg, mongo::BSONObjBuilder & result, bool fromRepl) Line 1850	C++
 	mongod.exe!mongo::Command::execCommand(mongo::Command * c, mongo::Client & client, int queryOptions, const char * cmdns, mongo::BSONObj & cmdObj, mongo::BSONObjBuilder & result, bool fromRepl) Line 1986	C++
 	mongod.exe!mongo::_runCommands(const char * ns, mongo::BSONObj & _cmdobj, mongo::_BufBuilder<mongo::TrivialAllocator> & b, mongo::BSONObjBuilder & anObjBuilder, bool fromRepl, int queryOptions) Line 2083	C++
 	mongod.exe!mongo::runCommands(const char * ns, mongo::BSONObj & jsobj, mongo::CurOp & curop, mongo::_BufBuilder<mongo::TrivialAllocator> & b, mongo::BSONObjBuilder & anObjBuilder, bool fromRepl, int queryOptions) Line 69	C++
 	mongod.exe!mongo::runQuery(mongo::Message & m, mongo::QueryMessage & q, mongo::CurOp & curop, mongo::Message & result) Line 1064	C++
 	mongod.exe!mongo::receivedQuery(mongo::Client & c, mongo::DbResponse & dbresponse, mongo::Message & m) Line 280	C++
>	mongod.exe!mongo::assembleResponse(mongo::Message & m, mongo::DbResponse & dbresponse, const mongo::HostAndPort & remote) Line 443	C++
 	mongod.exe!mongo::MyMessageHandler::process(mongo::Message & m, mongo::AbstractMessagingPort * port, mongo::LastError * le) Line 199	C++
 	mongod.exe!mongo::PortMessageServer::handleIncomingMsg(void * arg) Line 210	C++
 	mongod.exe!boost::`anonymous namespace'::thread_start_function(void * param) Line 185	C++
 	[External Code]	



 Comments   
Comment by auto [ 23/Oct/13 ]

Author:

{u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@10gen.com'}

Message: SERVER-11302 Do not do auth checks when operating on behalf of a DBDirectClient.

When the server delegates work on behalf of a user to DBDirectClient, the server
should already have validated the user's privileges to take whatever logical
action the server is performing. DBDirectClient itself inherently represents
actions taken by the server, and so it doesn't make sense to do access-control
checks there.

This change eliminates a deadlock risk in access control checks inside
DBDirectClient, and resolves UDR bugs stemming from a DBDirectClient performing
an unexpected access-control check, such as SERVER-11310.
Branch: master
https://github.com/mongodb/mongo/commit/2b16aca77cbd4486d780e21636b7043733e12765

Comment by auto [ 23/Oct/13 ]

Author:

{u'username': u'andy10gen', u'name': u'Andy Schwerin', u'email': u'schwerin@10gen.com'}

Message: SERVER-11302 Remove deadlock between admin database write lock and _fetchPhaseIsReady in AuthMgr.

This is achieved in two parts. First, this patch changes the approach to
invalidation of the user info cache in the AuthorizationManager. Rather than
wait for _fetchPhaseIsReady, this patch invalidates the patch immediately and
bumps a cache generation counter. Methods that enter the fetch phase in
preparation for updating the cache check that the generation numbers before and
after the fetch phase are equal. If they are not, the fetched user document is
marked invalid and not inserted into the cache.

Second, the patch eliminates a self deadlock caused by recursive acquisition of
the global write lock when a DBDirectClient is used to update user data via the
AuthzManagerExternalStateMongoD interface.
Branch: master
https://github.com/mongodb/mongo/commit/4896c73dc63d71a1b5bcf66a69ddc651fda09ffd

Generated at Thu Feb 08 03:25:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.