Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-21401

db.currentOp(true) can cause use-after-free if a client connection is closed simultaneously

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • 3.2.0-rc3
    • Affects Version/s: None
    • Component/s: Diagnostics
    • None
    • Fully Compatible
    • ALL
    • QuInt C (11/23/15)

      There is a race on ClientBasic::_messagingPort that can be reproduced by running the currentOp command with the "$all: true" option at the same time that a client connection is closed.

      • When a client connection is closed, the associated server thread deletes the MessagingPort object that it owns, in handleIncomingMsg() (message_server_port.cpp). This deletion is performed outside of the lock on the Client object.
      • When the currentOp command is run, it calls Client::getRemote() on each client in the global clients list. Client::getRemote() delegates its implementation to AbstractMessagingPort::remote(), on the AbstractMessagingPort object that the Client keeps a reference to.
      • Client::getRemote() will cause a use-after-free if the MessagingPort associated with the client has already been deleted.

      This is a regression introduced in 3.1.4 by SERVER-18482 (d6e68685). It affects development releases 3.1.4 through 3.2.0-rc2, and does not affect any production releases.

      Reproduce with the following script:

      startParallelShell("while (true) { db.currentOp(true); }");
      while (true) { new Mongo(); gc(); }
      

      See the following gdb session:

      2015-11-10T17:05:08.579-0500 I NETWORK  [conn16] end connection 127.0.0.1:35065 (2 connections now open)
      2015-11-10T17:05:08.580-0500 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:35066 #17 (3 connections now open)
      
      Program received signal SIGSEGV, Segmentation fault.
      [Switching to Thread 0x7fffb955d700 (LWP 18093)]
      0x00007ffff4c792a5 in mongo::ClientBasic::getRemote (this=0x7fffb97c9730)
          at src/mongo/db/client_basic.h:65
      65	        return _messagingPort->remote();
      (gdb) bt
      #0  0x00007ffff4c792a5 in mongo::ClientBasic::getRemote (this=0x7fffb97c9730)
          at src/mongo/db/client_basic.h:65
      #1  0x00007fffedb0c79e in mongo::Client::reportState (this=0x7fffb97c9730, builder=...)
          at src/mongo/db/client.cpp:120
      #2  0x00007fffee008d6e in mongo::CurrentOpCommand::run (
          this=0x7fffee482a98 <mongo::currentOpCommand>, txn=0x7fffcdcd7600, db=..., cmdObj=...,
          options=0, errmsg=..., result=...) at src/mongo/db/commands/current_op.cpp:127
      #3  0x00007fffee0a688a in mongo::Command::run (this=0x7fffee482a98 <mongo::currentOpCommand>,
          txn=0x7fffcdcd7600, request=..., replyBuilder=0x7fffb955b4e0)
          at src/mongo/db/dbcommands.cpp:1408
      #4  0x00007fffee0a58ba in mongo::Command::execCommand (txn=0x7fffcdcd7600,
          command=0x7fffee482a98 <mongo::currentOpCommand>, request=..., replyBuilder=0x7fffb955b4e0)
          at src/mongo/db/dbcommands.cpp:1313
      #5  0x00007ffff0ace770 in mongo::runCommands (txn=0x7fffcdcd7600, request=...,
          replyBuilder=0x7fffb955b4e0) at src/mongo/db/commands.cpp:498
      #6  0x00007fffee0e8997 in mongo::(anonymous namespace)::receivedRpc (txn=0x7fffcdcd7600,
          client=..., dbResponse=..., message=...) at src/mongo/db/instance.cpp:293
      #7  0x00007fffee0e5dfd in mongo::assembleResponse (txn=0x7fffcdcd7600, m=..., dbresponse=...,
          remote=...) at src/mongo/db/instance.cpp:522
      #8  0x0000000000434013 in mongo::MyMessageHandler::process (this=0x7fffcdcdd2e0, m=..., port=
          0x7fffcd3e5ea0) at src/mongo/db/db.cpp:170
      #9  0x00007fffd668f6f5 in mongo::PortMessageServer::handleIncomingMsg (arg=0x7fffcd3e5ea0)
          at src/mongo/util/net/message_server_port.cpp:229
      #10 0x00007fffcec39182 in start_thread (arg=0x7fffb955d700) at pthread_create.c:312
      #11 0x00007fffce96630d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      (gdb) info threads
      [New Thread 0x7fffb94dc700 (LWP 18099)]
        Id   Target Id         Frame
        30   Thread 0x7fffb94dc700 (LWP 18099) "mongod" clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:81
        29   Thread 0x7fffb95de700 (LWP 18097) "mongod" __lll_lock_wait ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
      * 27   Thread 0x7fffb955d700 (LWP 18093) "mongod" 0x00007ffff4c792a5 in mongo::ClientBasic::getRemote (this=0x7fffb97c9730) at src/mongo/db/client_basic.h:65
        14   Thread 0x7fffb965f700 (LWP 18027) "mongod" 0x00007fffcec407eb in __libc_recv (fd=13,
          buf=0x7fffb965e588, n=16, flags=-1) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
        13   Thread 0x7fffbc270700 (LWP 17986) "mongod" pthread_cond_timedwait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
        12   Thread 0x7fffbca71700 (LWP 17985) "mongod" 0x00007fffcec40b9d in nanosleep ()
          at ../sysdeps/unix/syscall-template.S:81
        11   Thread 0x7fffbd272700 (LWP 17984) "mongod" pthread_cond_timedwait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
        10   Thread 0x7fffbda73700 (LWP 17983) "mongod" 0x00007fffcec40b9d in nanosleep ()
          at ../sysdeps/unix/syscall-template.S:81
        9    Thread 0x7fffbe274700 (LWP 17982) "mongod" 0x00007fffcec40b9d in nanosleep ()
          at ../sysdeps/unix/syscall-template.S:81
        8    Thread 0x7fffbea75700 (LWP 17981) "mongod" pthread_cond_timedwait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
        7    Thread 0x7fffc9276700 (LWP 17980) "mongod" pthread_cond_wait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
        6    Thread 0x7fffc9a77700 (LWP 17979) "mongod" pthread_cond_wait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
        5    Thread 0x7fffca67d700 (LWP 17978) "mongod" pthread_cond_timedwait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
        4    Thread 0x7fffcae7e700 (LWP 17976) "mongod" 0x00007fffcec40b9d in nanosleep ()
          at ../sysdeps/unix/syscall-template.S:81
        3    Thread 0x7fffcb67f700 (LWP 17975) "mongod" pthread_cond_wait@@GLIBC_2.3.2 ()
          at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
        2    Thread 0x7fffcd287700 (LWP 17974) "mongod" 0x00007fffcec410d1 in do_sigwait (
          sig=0x7fffcd2868cc, set=<optimized out>)
          at ../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:60
        1    Thread 0x7ffff7f17d00 (LWP 17967) "mongod" clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:81
      (gdb) thread 29
      [Switching to thread 29 (Thread 0x7fffb95de700 (LWP 18097))]
      #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
      135	../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
      (gdb) bt
      #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
      #1  0x00007fffcec3b657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
      #2  0x00007fffcec3b480 in __GI___pthread_mutex_lock (mutex=0x7ffff7e6a538)
          at ../nptl/pthread_mutex_lock.c:79
      #3  0x00007ffff72a7085 in __gthread_mutex_lock (__mutex=0x7ffff7e6a538)
          at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8/bits/gthr-default.h:748
      #4  0x00007ffff72a86c5 in std::mutex::lock (this=0x7ffff7e6a538)
          at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/mutex:134
      #5  0x00007ffff72a7623 in std::lock_guard<std::mutex>::lock_guard (this=0x7fffb95dd810, __m=...)
          at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/mutex:414
      #6  0x00007fffedb11109 in mongo::ServiceContext::ClientDeleter::operator() (this=0x7fffcdcdb3e0,
          client=0x7fffb97c9730) at src/mongo/db/service_context.cpp:165
      #7  0x00007fffedb0d016 in std::unique_ptr<mongo::Client, mongo::ServiceContext::ClientDeleter>::~unique_ptr (this=0x7fffcdcdb3e0)
          at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/bits/unique_ptr.h:184
      #8  0x00007fffedb0f9ee in boost::thread_specific_ptr<std::unique_ptr<mongo::Client, mongo::ServiceContext::ClientDeleter> >::delete_data::operator() (this=0x7ffff7e50d60, data=0x7fffcdcdb3e0)
          at src/third_party/boost-1.56.0/boost/thread/tss.hpp:42
      #9  0x00007fffd4b70cd7 in boost::detail::(anonymous namespace)::tls_destructor (data=0x7fffcd3e03e0)
          at src/third_party/boost-1.56.0/libs/thread/src/pthread/thread.cpp:110
      #10 0x00007fffcec38f82 in __nptl_deallocate_tsd () at pthread_create.c:158
      #11 0x00007fffcec39195 in start_thread (arg=0x7fffb95de700) at pthread_create.c:325
      #12 0x00007fffce96630d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      (gdb)
      

      And, see relevant output from valgrind --tool=helgrind:

      ==20184== Lock at 0x2E1CF148 was first observed
      ==20184==    at 0x4C32145: pthread_mutex_lock (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
      ==20184==    by 0x574F044: __gthread_mutex_lock(pthread_mutex_t*) (gthr-default.h:748)
      ==20184==    by 0x5750684: std::mutex::lock() (mutex:134)
      ==20184==    by 0x574F5E2: std::lock_guard<std::mutex>::lock_guard(std::mutex&) (mutex:414)
      ==20184==    by 0xEBC30A5: mongo::ServiceContextMongoD::registerKillOpListener(mongo::KillOpListenerInterface*) (service_context_d.cpp:287)
      ==20184==    by 0x42464E: _mongoInitializerFunction_CreateReplicationManager(mongo::InitializerContext*) (db.cpp:797)
      ==20184==    by 0x429783: std::_Function_handler<mongo::Status (mongo::InitializerContext*), mongo::Status (*)(mongo::InitializerContext*)>::_M_invoke(std::_Any_data const&, mongo::InitializerContext*) (functional:2056)
      ==20184==    by 0x1D6E6789: std::function<mongo::Status (mongo::InitializerContext*)>::operator()(mongo::InitializerContext*) const (functional:2464)
      ==20184==    by 0x1D6E5D6F: mongo::Initializer::execute(std::vector<std::string, std::allocator<std::string> > const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&) const (initializer.cpp:57)
      ==20184==    by 0x1D6E5FA3: mongo::runGlobalInitializers(std::vector<std::string, std::allocator<std::string> > const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&) (initializer.cpp:70)
      ==20184==    by 0x1D6E6345: mongo::runGlobalInitializers(int, char const* const*, char const* const*) (initializer.cpp:89)
      ==20184==    by 0x424197: mongoDbMain(int, char**, char**) (db.cpp:850)
      ==20184==
      ==20184== Lock at 0x35545A88 was first observed
      ==20184==    at 0x4C2F795: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
      ==20184==    by 0x238619D9: mongo::SpinLock::SpinLock() (spin_lock.cpp:56)
      ==20184==    by 0xEF1544D: mongo::Client::Client(std::string, mongo::ServiceContext*, mongo::AbstractMessagingPort*) (client.cpp:105)
      ==20184==    by 0xEF19C2F: mongo::ServiceContext::makeClient(std::string, mongo::AbstractMessagingPort*) (service_context.cpp:122)
      ==20184==    by 0xEF152FD: mongo::Client::initThread(char const*, mongo::ServiceContext*, mongo::AbstractMessagingPort*) (client.cpp:89)
      ==20184==    by 0xEF150BC: mongo::Client::initThread(char const*, mongo::AbstractMessagingPort*) (client.cpp:69)
      ==20184==    by 0x433E12: mongo::MyMessageHandler::connected(mongo::AbstractMessagingPort*) (db.cpp:157)
      ==20184==    by 0x2616EFD6: mongo::PortMessageServer::handleIncomingMsg(void*) (message_server_port.cpp:212)
      ==20184==    by 0x4C30E26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
      ==20184==    by 0x2D939181: start_thread (pthread_create.c:312)
      ==20184==    by 0x2DC4A30C: clone (clone.S:111)
      ==20184==
      ==20184== Possible data race during write of size 8 at 0x31C9BA40 by thread #99
      ==20184== Locks held: none
      ==20184==    at 0x265C6F3A: mongo::MessagingPort::~MessagingPort() (message_port.cpp:129)
      ==20184==    by 0x2616E0C4: mongo::(anonymous namespace)::MessagingPortWithHandler::~MessagingPortWithHandler() (message_server_port.cpp:73)
      ==20184==    by 0x2616E0E8: mongo::(anonymous namespace)::MessagingPortWithHandler::~MessagingPortWithHandler() (message_server_port.cpp:73)
      ==20184==    by 0x2616DCF1: std::default_delete<mongo::(anonymous namespace)::MessagingPortWithHandler>::operator()(mongo::(anonymous namespace)::MessagingPortWithHandler*) const (unique_ptr.h:67)
      ==20184==    by 0x2616DC55: std::unique_ptr<mongo::(anonymous namespace)::MessagingPortWithHandler, std::default_delete<mongo::(anonymous namespace)::MessagingPortWithHandler> >::~unique_ptr() (unique_ptr.h:184)
      ==20184==    by 0x2616F39F: mongo::PortMessageServer::handleIncomingMsg(void*) (message_server_port.cpp:261)
      ==20184==    by 0x4C30E26: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
      ==20184==    by 0x2D939181: start_thread (pthread_create.c:312)
      ==20184==    by 0x2DC4A30C: clone (clone.S:111)
      ==20184==
      ==20184== This conflicts with a previous read of size 8 by thread #97
      ==20184== Locks held: 2, at addresses 0x2E1CF148 0x35545A88
      ==20184==    at 0x7DA025B: mongo::ClientBasic::getRemote() const (client_basic.h:65)
      ==20184==    by 0xEF1575D: mongo::Client::reportState(mongo::BSONObjBuilder&) (client.cpp:120)
      ==20184==    by 0xEA6CD2D: mongo::CurrentOpCommand::run(mongo::OperationContext*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&) (current_op.cpp:127)
      ==20184==    by 0xEB0A849: mongo::Command::run(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*) (dbcommands.cpp:1408)
      ==20184==    by 0xEB09879: mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*) (dbcommands.cpp:1313)
      ==20184==    by 0xBF5072F: mongo::runCommands(mongo::OperationContext*, mongo::rpc::RequestInterface const&, mongo::rpc::ReplyBuilderInterface*) (commands.cpp:498)
      ==20184==    by 0xEB4C956: mongo::(anonymous namespace)::receivedRpc(mongo::OperationContext*, mongo::Client&, mongo::DbResponse&, mongo::Message&) (instance.cpp:293)
      ==20184==    by 0xEB49DBC: mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&) (instance.cpp:522)
      ==20184==
      ==20184== Address 0x31C9BA40 is 0 bytes inside a block of size 72 alloc'd
      ==20184==    at 0x4C2C460: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
      ==20184==    by 0x2616E529: mongo::PortMessageServer::accepted(std::shared_ptr<mongo::Socket>, long long) (message_server_port.cpp:109)
      ==20184==    by 0x2616EBEE: non-virtual thunk to mongo::PortMessageServer::accepted(std::shared_ptr<mongo::Socket>, long long) (message_server_port.cpp:165)
      ==20184==    by 0x265BFCFA: mongo::Listener::initAndListen() (listen.cpp:352)
      ==20184==    by 0x2616E44B: mongo::PortMessageServer::run() (message_server_port.cpp:176)
      ==20184==    by 0x423EC2: mongo::_initAndListen(int) (db.cpp:635)
      ==20184==    by 0x421B02: mongo::initAndListen(int) (db.cpp:640)
      ==20184==    by 0x424385: mongoDbMain(int, char**, char**) (db.cpp:874)
      ==20184==    by 0x424029: main (db.cpp:685)
      

            Assignee:
            rassi J Rassi
            Reporter:
            rassi J Rassi
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: