Core Server / SERVER-7516

exhaustReceiveMore() doesn't propagate error on recv()


      The recv() call in the code below swallows SocketExceptions and returns a success/failure value, but exhaustReceiveMore() ignores the failure case:

          void DBClientCursor::exhaustReceiveMore() {
              verify( cursorId && batch.pos == batch.nReturned );
              verify( !haveLimit );
              auto_ptr<Message> response(new Message());
              verify( _client );
              if ( _client->recv(*response) ) {
                  batch.m = response;
                  dataReceived();
              }
          }
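
One way the failure could be surfaced is sketched below. This is a minimal model under stated assumptions, not the actual fix: `FakeClient`, `FakeCursor`, and `drain` are hypothetical stand-ins for `DBClientConnection`, `DBClientCursor`, and the loop in `query()`, and throwing `std::runtime_error` is just one possible way to report the error.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real types; all names are illustrative only.
struct Message { std::string payload; };

struct FakeClient {
    bool healthy = false;  // false simulates a dead socket

    // Mirrors the reported behaviour: failure is signalled by the return value.
    bool recv(Message& m) {
        if (!healthy) return false;
        m.payload = "batch";
        return true;
    }
};

struct FakeCursor {
    FakeClient* client;
    long long cursorId;  // non-zero means more batches are expected
    std::unique_ptr<Message> batch;

    explicit FakeCursor(FakeClient* c) : client(c), cursorId(42) {}

    // Unlike the snippet above, a failed recv() is propagated to the caller.
    void exhaustReceiveMore() {
        assert(client != nullptr);
        std::unique_ptr<Message> response(new Message());
        if (!client->recv(*response)) {
            throw std::runtime_error("recv failed while reading exhaust batch");
        }
        batch = std::move(response);
        cursorId = 0;  // assumption for the demo: this was the final batch
    }
};

// Loops until the cursor id becomes zero and returns the number of batches
// read. A recv failure now exits the loop via the exception.
int drain(FakeCursor& c) {
    int batches = 0;
    while (c.cursorId != 0) {
        c.exhaustReceiveMore();
        ++batches;
    }
    return batches;
}
```

With a healthy client, `drain` reads one batch and returns; with a dead one, it throws on the first iteration rather than looping.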
      

      DBClientConnection::query() calls exhaustReceiveMore() in a while loop that only breaks if c->getCursorId() == 0. Because a failed recv() never changes the cursor ID, this seems to lead to an infinite loop. The logs show nothing unless verbose output is enabled, at which point they show constant socket exceptions. The GDB backtrace during this state is as follows:

      Thread 112 (Thread 0x7f8f1f9c0700 (LWP 15045)):
      #0  0x00007f8f540bed60 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
      No symbol table info available.
      #1  0x00007f8f540c0020 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
      No symbol table info available.
      #2  0x00007f8f540c092e in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
      No symbol table info available.
      #3  0x00007f8f54605041 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      No symbol table info available.
      #4  0x0000000000aed72d in mongo::Socket::recv(char*, int) ()
      No symbol table info available.
      #5  0x0000000000ae4fec in mongo::MessagingPort::recv(mongo::Message&) ()
      No symbol table info available.
      #6  0x0000000000590e65 in mongo::DBClientConnection::recv(mongo::Message&) ()
      No symbol table info available.
      #7  0x00000000005cc1b1 in mongo::DBClientCursor::exhaustReceiveMore() ()
      No symbol table info available.
      #8  0x00000000005943f8 in mongo::DBClientConnection::query(boost::function<void (mongo::DBClientCursorBatchIterator&)>, std::string const&, mongo::Query, mongo::BSONObj const*, int) ()
      No symbol table info available.
      #9  0x0000000000672cea in mongo::Cloner::copy(char const*, char const*, bool, bool, bool, bool, bool, bool, mongo::Query) ()
      No symbol table info available.
      #10 0x000000000067408b in mongo::Cloner::go(char const*, mongo::CloneOptions const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> >&, std::string&, int*) ()
      No symbol table info available.
      #11 0x0000000000676de3 in mongo::cloneFrom(std::string const&, mongo::CloneOptions const&, std::string&, int*, std::set<std::string, std::less<std::string>, std::allocator<std::string> >*) ()
      No symbol table info available.
      #12 0x000000000097eb99 in ?? ()
      No symbol table info available.
      #13 0x000000000097ef80 in mongo::ReplSetImpl::_syncDoInitialSync_clone(char const*, std::list<std::string, std::allocator<std::string> > const&, bool) ()
      No symbol table info available.
      #14 0x00000000009821f5 in mongo::ReplSetImpl::_syncDoInitialSync() ()
      No symbol table info available.
      #15 0x00000000009833c7 in mongo::ReplSetImpl::syncDoInitialSync() ()
      No symbol table info available.
      #16 0x000000000099edc8 in mongo::ReplSetImpl::_syncThread() ()
      No symbol table info available.
      #17 0x000000000099ee6a in mongo::ReplSetImpl::syncThread() ()
      No symbol table info available.
      #18 0x000000000099f27a in mongo::startSyncThread() ()
      No symbol table info available.
      #19 0x0000000000b3ec79 in ?? ()
      No symbol table info available.
      #20 0x00007f8f54a61efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      No symbol table info available.
      #21 0x00007f8f53df389d in clone () from /lib/x86_64-linux-gnu/libc.so.6
      No symbol table info available.
      #22 0x0000000000000000 in ?? ()
      No symbol table info available.
      
      ... After 'catch catch' ...
      
      Catchpoint 1 (exception caught), 0x00007f8f54603f70 in __cxa_begin_catch () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      (gdb) bt
      #0  0x00007f8f54603f70 in __cxa_begin_catch () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      #1  0x0000000000ae55e6 in mongo::MessagingPort::recv(mongo::Message&) ()
      #2  0x0000000000590e65 in mongo::DBClientConnection::recv(mongo::Message&) ()
      #3  0x00000000005cc1b1 in mongo::DBClientCursor::exhaustReceiveMore() ()
      #4  0x00000000005943f8 in mongo::DBClientConnection::query(boost::function<void (mongo::DBClientCursorBatchIterator&)>, std::string const&, mongo::Query, mongo::BSONObj const*, int) ()
      #5  0x0000000000672cea in mongo::Cloner::copy(char const*, char const*, bool, bool, bool, bool, bool, bool, mongo::Query) ()
      #6  0x000000000067408b in mongo::Cloner::go(char const*, mongo::CloneOptions const&, std::set<std::string, std::less<std::string>, std::allocator<std::string> >&, std::string&, int*) ()
      #7  0x0000000000676de3 in mongo::cloneFrom(std::string const&, mongo::CloneOptions const&, std::string&, int*, std::set<std::string, std::less<std::string>, std::allocator<std::string> >*) ()
      #8  0x000000000097eb99 in ?? ()
      #9  0x000000000097ef80 in mongo::ReplSetImpl::_syncDoInitialSync_clone(char const*, std::list<std::string, std::allocator<std::string> > const&, bool) ()
      #10 0x00000000009821f5 in mongo::ReplSetImpl::_syncDoInitialSync() ()
      #11 0x00000000009833c7 in mongo::ReplSetImpl::syncDoInitialSync() ()
      #12 0x000000000099edc8 in mongo::ReplSetImpl::_syncThread() ()
      #13 0x000000000099ee6a in mongo::ReplSetImpl::syncThread() ()
      #14 0x000000000099f27a in mongo::startSyncThread() ()
      #15 0x0000000000b3ec79 in ?? ()
      #16 0x00007f8f54a61efc in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      #17 0x00007f8f53df389d in clone () from /lib/x86_64-linux-gnu/libc.so.6
      #18 0x0000000000000000 in ?? ()
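
The two backtraces suggest a layering in which the lowest level throws and the level above catches and returns a status. The sketch below models that pattern to show why a caller that ignores the status never leaves its loop. All names (`socketRecv`, `portRecv`, `spin`) are hypothetical, and the `maxIterations` cap is an addition so that the demonstration terminates; the loop described in the report has no such cap.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical model of the layering suggested by the backtraces.
struct SocketError : std::runtime_error {
    SocketError() : std::runtime_error("socket exception") {}
};

// Lowest layer: throws when the connection is broken
// (compare Socket::recv under __cxa_throw in the first backtrace).
void socketRecv(bool connected) {
    if (!connected) throw SocketError();
}

// Middle layer: catches the exception and reports failure as a bool
// (compare MessagingPort::recv under __cxa_begin_catch in the second).
bool portRecv(bool connected) {
    try {
        socketRecv(connected);
        return true;
    } catch (const SocketError&) {
        return false;
    }
}

// Caller that only exits when the cursor id reaches zero. On failure the
// cursor id never changes, so only the demo-only cap ends the loop.
// Returns the number of iterations performed.
int spin(bool connected, int maxIterations) {
    assert(maxIterations > 0);
    long long cursorId = 42;
    int iterations = 0;
    while (cursorId != 0 && iterations < maxIterations) {
        if (portRecv(connected)) {
            cursorId = 0;  // assumption for the demo: last batch received
        }
        ++iterations;
    }
    return iterations;
}
```

With `connected == false`, `spin` always runs until the cap, which corresponds to the busy loop of repeated socket exceptions seen in the verbose logs.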
      

            Assignee: Eric Milkie (milkie@mongodb.com)
            Reporter: Ben Becker (benjamin.becker)