Core Server / SERVER-17316

rc7 many threads "stuck" in pthread_cond_timedwait

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.0.0-rc8
    • Affects Version/s: 3.0.0-rc7
    • Component/s: Storage
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Steps to Reproduce:

      benchrun.py (mongo-perf) update workloads on Linux or on Windows

      in or near Update.MmsIncShallow1
      in or near Update.IncFewSmallDoc
      in or near Update.v3.IncWithIndex


      Just documenting this for our Project Manager; no symptoms have been seen in RC8 or in RC9-pre.

      mongod running WiredTiger becomes unresponsive, or perhaps just slow enough to appear unresponsive. Multiple write operations appear "stuck", with many threads repeatedly waiting in pthread_cond_timedwait, apparently related to the WT cache.

      I attached to the Linux process with gdb and found that it is spawning threads rapidly, with many threads waiting on a condition variable, apparently a WT cache wait.

      (gdb) info threads
        15 Thread 0x7fdd10f36700 (LWP 6554)  0x00007fdd17d174b5 in sigwait ()
         from /lib64/libpthread.so.0
        14 Thread 0x7fdd10535700 (LWP 6555)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        13 Thread 0x7fdd0fb34700 (LWP 6556)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        12 Thread 0x7fdd0f133700 (LWP 6557)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        11 Thread 0x7fdd0e732700 (LWP 6558)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        10 Thread 0x7fdd0dd31700 (LWP 6559)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        9 Thread 0x7fdd0d330700 (LWP 6560)  0x00007fdd17d135bc in pthread_cond_wait@@GLIBC_2.3.2 ()
         from /lib64/libpthread.so.0
        8 Thread 0x7fdd0c92f700 (LWP 6561)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        7 Thread 0x7fdd0bf2e700 (LWP 6562)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        6 Thread 0x7fdd0b52d700 (LWP 6563)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        5 Thread 0x7fdd0ab2c700 (LWP 6564)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        4 Thread 0x7fdd0a02a700 (LWP 6592)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        3 Thread 0x7fdd09821700 (LWP 6693)  0x000000000131c582 in ?? ()
        2 Thread 0x7fdd08e20700 (LWP 6694)  0x000000000131c56b in ?? ()
      * 1 Thread 0x7fdd18138b60 (LWP 6553)  0x00007fdd16ea95d3 in select () from /lib64/libc.so.6
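
      To make the "many threads in pthread_cond_timedwait" observation easier to compare between samples, the listing above can be tallied by top frame. The following is a minimal sketch, not part of the original report: the function name tally_top_frames and the regular expression are assumptions written against the row format shown in this particular gdb output, and other gdb versions may print rows differently.

```python
import re
from collections import Counter

# Matches the frame part of an `info threads` row as printed above, e.g.
#   "14 Thread 0x7fdd10535700 (LWP 6555)  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from ..."
# Assumption: every row has the form "(LWP n)  0xADDR in FUNCTION (".
ROW = re.compile(r"\(LWP \d+\)\s+0x[0-9a-fA-F]+ in (\S+) \(")


def tally_top_frames(info_threads_output: str) -> Counter:
    """Count threads by the function in their innermost frame."""
    counts = Counter()
    for line in info_threads_output.splitlines():
        match = ROW.search(line)
        if match:
            # Drop the glibc symbol-version suffix, e.g. "@@GLIBC_2.3.2".
            counts[match.group(1).split("@@")[0]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Read a saved listing from stdin and print the most common frames first.
    for function, count in tally_top_frames(sys.stdin.read()).most_common():
        print(f"{count:4d}  {function}")
```

      Wrapped continuation lines such as "from /lib64/libpthread.so.0" do not match the pattern and are skipped. A listing can be captured without the interactive pager by running gdb in batch mode, for example gdb -p <pid> -batch -ex "info threads".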
      
      
      
      (gdb) bt
      #0  0x00007fdd17d1398e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
      #1  0x000000000133468f in __wt_cond_wait ()
      #2  0x000000000131c80c in __wt_cache_wait ()
      #3  0x00000000012cd8ca in __wt_btcur_search ()
      #4  0x00000000013078f3 in ?? ()
      #5  0x0000000000d55b31 in mongo::WiredTigerRecordStore::findRecord(mongo::OperationContext*, mongo::RecordId const&, mongo::RecordData*) const ()
      #6  0x0000000000cc9df1 in mongo::KVCatalog::_findEntry(mongo::OperationContext*, mongo::StringData const&, mongo::RecordId*) const ()
      #7  0x0000000000cca050 in mongo::KVCatalog::getMetaData(mongo::OperationContext*, mongo::StringData const&) ()
      #8  0x0000000000cceba5 in mongo::KVCollectionCatalogEntry::_getMetaData(mongo::OperationContext*) const ()
      #9  0x0000000000cae9c6 in mongo::BSONCollectionCatalogEntry::getTotalIndexCount(mongo::OperationContext*) const ()
      #10 0x00000000009bc0f1 in mongo::CmdDrop::run(mongo::OperationContext*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj&, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, mongo::BSONObjBuilder&, bool) ()
      #11 0x00000000009b71a4 in mongo::_execCommand(mongo::OperationContext*, mongo::Command*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj&, int, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, mongo::BSONObjBuilder&, bool) ()
      #12 0x00000000009b80e3 in mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) ()
      #13 0x00000000009b8cdb in mongo::_runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) ()
      #14 0x0000000000b87f95 in mongo::runQuery(mongo::OperationContext*, mongo::Message&, mongo::QueryMessage&, mongo::NamespaceString const&, mongo::CurOp&, mongo::Message&, bool) ()
      #15 0x0000000000a99f88 in mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&, bool) ()
      #16 0x00000000007e6730 in mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) ()
      #17 0x0000000000ef8aab in mongo::PortMessageServer::handleIncomingMsg(void*) ()
      #18 0x00007fdd17d0f9d1 in start_thread () from /lib64/libpthread.so.0
      #19 0x00007fdd16eb0b5d in clone () from /lib64/libc.so.6
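
      The backtrace shows the thread reaching pthread_cond_timedwait through __wt_cache_wait and __wt_cond_wait. A thread that waits with a timeout, wakes, rechecks a condition, and waits again will be caught in the timed wait on almost every sample, which would be consistent with the "repeatedly waiting" symptom. The sketch below illustrates only that general pattern; it is not WiredTiger code, and CacheGate, reserve, release, and all parameters are hypothetical names chosen for the example.

```python
import threading


class CacheGate:
    """Hypothetical byte-budget gate; illustrates a timed-wait loop only."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.cond = threading.Condition()
        self.timed_out_waits = 0  # how many waits expired without a notify

    def reserve(self, nbytes: int, poll_seconds: float = 0.01,
                max_polls: int = 100) -> bool:
        """Block until nbytes fit in the budget; give up after max_polls waits."""
        with self.cond:
            polls = 0
            while self.used + nbytes > self.capacity:
                if polls == max_polls:
                    return False
                # Timed wait: returns False if the timeout expired first.
                # A sampler would usually observe the thread parked here.
                if not self.cond.wait(timeout=poll_seconds):
                    self.timed_out_waits += 1
                polls += 1
            self.used += nbytes
            return True

    def release(self, nbytes: int) -> None:
        """Return nbytes to the budget and wake all waiters to recheck."""
        with self.cond:
            self.used -= nbytes
            self.cond.notify_all()
```

      If nothing ever calls release (in this analogy, if space is never freed), every caller of reserve cycles through the timed wait until it gives up, so a process in that state looks slow or hung rather than crashed.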
      

            Assignee: Unassigned
            Reporter: Quentin Conner (quentin.conner)
            Votes: 0
            Watchers: 7
