[SERVER-17317] Deadlock or starvation in WiredTiger from benchmark with 10 insert threads, 1 query thread Created: 18/Feb/15  Updated: 20/Mar/15  Resolved: 10/Mar/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.0-rc8
Fix Version/s: 3.0.0-rc9

Type: Bug Priority: Major - P3
Reporter: Mark Callaghan Assignee: Michael Cahill (Inactive)
Resolution: Done Votes: 0
Labels: wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Java Source File jmongoiibench.java     Java Source File jmongoiibench.java     File mongo.conf     File o.s.1     File run.simple.bash    
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

I have been using iibench-mongodb with 10 insert threads and 0 query threads. Today I tried it with 10 insert threads and 1 query thread, WiredTiger with zlib and no oplog (because turning on the oplog gets https://jira.mongodb.org/browse/SERVER-17141).

The test was OK for the first 50M inserts (~1800 seconds). Then it goes into a state where over the next 1800 seconds there are only inserts and queries for two 10-second intervals over the next 1800 seconds.

Thread[Thread-2,5,main]50,394,364 inserts : 1,750 seconds : cum ips=28,795.55 : int ips=10,256.89 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=326.81 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,516,553 inserts : 1,760 seconds : cum ips=28,701.23 : int ips=12,209.13 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=324.94 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,608,966 inserts : 1,770 seconds : cum ips=28,591.15 : int ips=9,232.99 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=323.10 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,725,511 inserts : 1,780 seconds : cum ips=28,495.89 : int ips=11,646.35 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=321.27 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,801,222 inserts : 1,790 seconds : cum ips=28,380.46 : int ips=7,641.40 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=319.49 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,900,884 inserts : 1,800 seconds : cum ips=28,278.05 : int ips=9,959.23 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=317.70 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,313 inserts : 1,810 seconds : cum ips=28,124.16 : int ips=442.59 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=315.94 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,313 inserts : 1,820 seconds : cum ips=27,969.52 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=314.20 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,314 inserts : 1,830 seconds : cum ips=27,816.58 : int ips=0.10 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=312.47 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,314 inserts : 1,840 seconds : cum ips=27,665.28 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=310.77 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,314 inserts : 1,850 seconds : cum ips=27,515.64 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=309.08 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]50,905,314 inserts : 1,860 seconds : cum ips=27,367.61 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=307.41 : int avg qps=0.00 : exceptions=0
... long period with no inserts, no queries
Thread[Thread-2,5,main]50,905,314 inserts : 2,800 seconds : cum ips=18,179.83 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=203.95 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]50,905,314 inserts : 2,810 seconds : cum ips=18,115.72 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=203.23 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]50,905,314 inserts : 2,820 seconds : cum ips=18,051.41 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=202.50 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]50,905,314 inserts : 2,830 seconds : cum ips=17,987.57 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=201.79 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]50,905,314 inserts : 2,840 seconds : cum ips=17,924.18 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=201.07 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,014,662 inserts : 2,850 seconds : cum ips=17,899.59 : int ips=10,922.78 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=200.37 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 2,860 seconds : cum ips=17,837.70 : int ips=214.93 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=199.66 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 2,870 seconds : cum ips=17,775.50 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=198.96 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 2,880 seconds : cum ips=17,713.74 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=198.27 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 2,890 seconds : cum ips=17,652.40 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=197.58 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 2,900 seconds : cum ips=17,591.48 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=196.90 : int avg qps=0.00 : exceptions=10
... another long period with no queries, no inserts
Thread[Thread-2,5,main]51,016,813 inserts : 3,250 seconds : cum ips=15,697.20 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=175.65 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 3,260 seconds : cum ips=15,649.01 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=175.11 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 3,270 seconds : cum ips=15,601.11 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=174.57 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 3,280 seconds : cum ips=15,553.51 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=174.04 : int avg qps=0.00 : exceptions=10
Thread[Thread-2,5,main]51,016,813 inserts : 3,290 seconds : cum ips=15,506.21 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=173.51 : int avg qps=0.00 : exceptions=10
... and still no queries, no inserts

Prior to the two 10-second intervals when there was a brief amount of throughput, the common thread stacks are:

     14 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerRecordStore::insertRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      4 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexStandard::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexUnique::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 sigwait,mongo::(anonymous,boost::(anonymous,start_thread,clone
      1 select,__wt_sleep,__evict_pass,__evict_server,start_thread,clone
      1 select,mongo::Listener::initAndListen,_initAndListen,mongo::initAndListen,mongoDbMain,main
      1 recv,mongo::Socket::_recv,mongo::Socket::unsafe_recv,mongo::Socket::recv,mongo::MessagingPort::recv,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_wait@@GLIBC_2.3.2,wait<boost::unique_lock<boost::timed_mutex>,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::findRecord,mongo::KVCatalog::_findEntry,mongo::KVCatalog::getMetaData,mongo::KVCollectionCatalogEntry::_getMetaData,mongo::BSONCollectionCatalogEntry::getAllIndexes,mongo::TTLMonitor::getTTLIndexesForDB,mongo::TTLMonitor::doTTLPass,mongo::TTLMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerSizeStorer::storeInto,mongo::WiredTigerKVEngine::syncSizeInfo,mongo::WiredTigerKVEngine::haveDropsQueued,mongo::WiredTigerSessionCache::releaseSession,mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit,mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_close_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::RangeDeleter::doWork,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,boost::condition_variable_any::timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::(anonymous,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 nanosleep,mongo::sleepsecs,mongo::ClientCursorMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 free,__wt_free_int,__free_skip_list,__free_skip_array,__free_page_row_leaf,__wt_page_out,__wt_ref_out,__wt_split_rewrite,__evict_page_dirty_update,__wt_evict,__wt_evict_page,__wt_page_release_evict,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 __evict_lru_pages,__evict_worker,start_thread,clone
      1 __evict_list_clear,__evict_get_ref,__wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone
      1 __evict_get_ref,__wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone
      1 deflate_slow,deflate,zlib_compress_raw,__rec_split_raw_worker,__rec_split_raw,__rec_row_leaf_insert,__rec_row_leaf,__wt_reconcile,__sync_file,__wt_cache_op,__checkpoint_write_leaves,__checkpoint_apply,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone

Now, after the 2 10-second intervals with throughput the thread stacks are listed here. I also ran "db.serverStatus()" and that hung so one of the threads below is from it. But it hung after producing output that I will upload.

      5 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerRecordStore::insertRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      3 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexUnique::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexStandard::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 __wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone
      1 sigwait,mongo::(anonymous,boost::(anonymous,start_thread,clone
      1 select,mongo::Listener::initAndListen,_initAndListen,mongo::initAndListen,mongoDbMain,main
      1 recv,mongo::Socket::_recv,mongo::Socket::unsafe_recv,mongo::Socket::recv,mongo::MessagingPort::recv,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_wait@@GLIBC_2.3.2,wait<boost::unique_lock<boost::timed_mutex>,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::findRecord,mongo::KVCatalog::_findEntry,mongo::KVCatalog::getMetaData,mongo::KVCollectionCatalogEntry::_getMetaData,mongo::BSONCollectionCatalogEntry::getAllIndexes,mongo::TTLMonitor::getTTLIndexesForDB,mongo::TTLMonitor::doTTLPass,mongo::TTLMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerSizeStorer::storeInto,mongo::WiredTigerKVEngine::syncSizeInfo,mongo::WiredTigerKVEngine::haveDropsQueued,mongo::WiredTigerSessionCache::releaseSession,mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit,mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_close_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::RangeDeleter::doWork,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,boost::condition_variable_any::timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::(anonymous,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 nanosleep,mongo::sleepsecs,mongo::ClientCursorMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 __lll_lock_wait,pthread_mutex_lock,??,??,??
      1 __evict_worker,start_thread,clone
      1 deflate_slow,deflate,zlib_compress_raw,__rec_split_raw_worker,__rec_split_raw,__rec_row_leaf_insert,__rec_row_leaf,__wt_reconcile,__sync_file,__wt_cache_op,__checkpoint_worker,__wt_checkpoint,__checkpoint_apply,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone



 Comments   
Comment by Mark Callaghan [ 24/Feb/15 ]

I have not seen this in rc9, however there are still periods of 3-5 minutes while running iibench-mongodb where throughput collapses (from 6000/second to less than 500/second). That is covered by https://jira.mongodb.org/browse/SERVER-16575

Comment by Asya Kamsky [ 18/Feb/15 ]

mdcallag I will try to reproduce on latest 3.0 (which is close to what rc9 will be). If I cannot but I can reproduce in rc8 then it may have been addressed by one of the fixes mentioned in SERVER-16958

Comment by Mark Callaghan [ 18/Feb/15 ]

Grepping for "evict".

From current PMP stack dump.

      1 __wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__evict_server,start_thread,clone
      1 __evict_worker,start_thread,clone

From a few minutes back

      2 __evict_has_work,__evict_worker,start_thread,clone
      1 __wt_tree_walk,__evict_walk_file,__evict_walk,__evict_lru_walk,__evict_pass,__evict_server,start_thread,clone
      1 __wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone

From before that

      2 __evict_has_work,__evict_worker,start_thread,clone
      1 __wt_row_modify,__split_multi_inmem,__wt_split_rewrite,__evict_page_dirty_update,__wt_evict,__wt_evict_page,__wt_page_release_evict,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 select,__wt_sleep,__evict_pass,__evict_server,start_thread,clone
      1 __evict_lru_pages,__evict_worker,start_thread,clone

Top-N CPU sources from flat perf profile:

    39.49%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __evict_worker                         
    22.27%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __wt_evict_lru_page                    
    14.83%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] deflate_slow                           
     7.98%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] longest_match                          
     4.10%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] compress_block                         
     2.52%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __rec_row_leaf_insert                  
     2.08%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] build_tree                             
     1.68%  mongod.30rc8.je  libc-2.20.so               [.] __memmove_avx_unaligned                
     1.33%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __rec_cell_build_leaf_key              
     0.71%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __rec_row_leaf                         
     0.64%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] adler32                 

And harder to read hierarchical perf:

    39.59%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __evict_worker                                            
            |
            --- __evict_worker
                __conn_async_new_op
                0x48705f8b48530036
 
    21.88%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __wt_evict_lru_page                                       
            |
            --- __wt_evict_lru_page
               |          
               |--99.96%-- __conn_async_new_op
               |          0x48705f8b48530036
                --0.04%-- [...]
 
    14.89%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] deflate_slow                                              
            |
            --- deflate_slow
               |          
               |--0.16%-- 0x0
               |          0x695f0700000847c7
               |          
               |--0.01%-- 0x1238
               |          0x6161616161616161
               |          
               |--0.01%-- 0x71
               |          0x0
               |          0x695f0700000847c7
                --99.83%-- [...]
 
     7.90%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] longest_match                                             
            |
            --- longest_match
 
     4.05%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] compress_block                                            
            |
            --- compress_block
               |          
               |--11.26%-- 0xb0b0a0a09090808
                --88.74%-- [...]
 
     2.56%  mongod.30rc8.je  mongod.30rc8.jem.70g.zlib  [.] __rec_row_leaf_insert                                     
            |
            --- __rec_row_leaf_insert
               |          
               |--99.94%-- 0x0
               |          __conn_async_new_op
               |          0x48705f8b48530036
                --0.06%-- [...]
 

Comment by Asya Kamsky [ 18/Feb/15 ]

Hi mdcallag,

I this it's possible this may be the same or similar to what we see in SERVER-16958 where there's an attempted eviction during checkpoint which gets things stuck forever (but eventually finishes, as in hours later).

This was a typical stack trace (for me):

Thread 728 (Thread 0x7f41e6f0d700 (LWP 19977)):
#0  memset () at ../sysdeps/x86_64/memset.S:94
#1  0x00000000013181d5 in __wt_realloc ()
#2  0x0000000001325bab in ?? ()
#3  0x000000000132b29c in ?? ()
#4  0x0000000001330de1 in ?? ()
#5  0x000000000133173a in __wt_reconcile ()
#6  0x0000000001302252 in ?? ()
#7  0x0000000001302a56 in __wt_evict ()
#8  0x000000000130044d in __wt_evict_page ()
#9  0x00000000012be5d2 in __wt_page_in_func ()
#10 0x00000000012d0717 in __wt_tree_walk ()
#11 0x00000000012ca6b0 in __wt_cache_op ()
#12 0x000000000135168a in __wt_txn_checkpoint ()
#13 0x0000000001345bc6 in ?? ()
#14 0x00000000012e0bcc in ?? ()
#15 0x00007f41ea9d6182 in start_thread (arg=0x7f41e6f0d700) at pthread_create.c:312
#16 0x00007f41e9ad6fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Do you see any of the threads in your stall with this sequence (txn_checkpoint and evict_page)?

Comment by Mark Callaghan [ 18/Feb/15 ]

Test is now at 2000 seconds, still no progress. All user threads stuck in __wt_cache_full_check

      6 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerRecordStore::insertRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexUnique::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexStandard::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 __evict_has_work,__evict_worker,start_thread,clone
      1 __wt_tree_walk,__evict_walk_file,__evict_walk,__evict_lru_walk,__evict_pass,__evict_server,start_thread,clone
      1 __wt_evict_lru_page,__evict_lru_pages,__evict_worker,start_thread,clone
      1 sigwait,mongo::(anonymous,boost::(anonymous,start_thread,clone
      1 select,mongo::Listener::initAndListen,_initAndListen,mongo::initAndListen,mongoDbMain,main
      1 recv,mongo::Socket::_recv,mongo::Socket::unsafe_recv,mongo::Socket::recv,mongo::MessagingPort::recv,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_wait@@GLIBC_2.3.2,wait<boost::unique_lock<boost::timed_mutex>,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_close_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::RangeDeleter::doWork,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,boost::condition_variable_any::timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::(anonymous,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 nanosleep,mongo::sleepsecs,mongo::TTLMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 nanosleep,mongo::sleepsecs,mongo::ClientCursorMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 longest_match,deflate_slow,deflate,zlib_compress_raw,__rec_split_raw_worker,__rec_split_raw,__rec_row_leaf_insert,__rec_row_leaf,__wt_reconcile,__sync_file,__wt_cache_op,__checkpoint_worker,__wt_checkpoint,__checkpoint_apply,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
      1 __lll_lock_wait,pthread_mutex_lock,??,??,??

Comment by Mark Callaghan [ 18/Feb/15 ]

Changed client to survive exceptions in the query thread

Comment by Mark Callaghan [ 18/Feb/15 ]

Restarted test after enhancing the query thread code in jmongoiibench.java to survive exceptions. Maybe not fixed yet but queries stopped while inserted continued and then the exceptions occurs because the timeout is 10 minutes. Alas, I did not get a thread stack during that time.

But after this occurs I think the queries are still in progress and here is a stack:

      1 select,__wt_sleep,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone

Thread[Thread-2,5,main]10,120,188 inserts : 370 seconds : cum ips=27,349.57 : int ips=24,218.66 : cum avg qry=1.94 : int avg qry=1.36 : cum avg qps=514.82 : int avg qps=733.29 : exceptions=0
Thread[Thread-2,5,main]10,357,819 inserts : 380 seconds : cum ips=27,254.48 : int ips=23,739.36 : cum avg qry=1.92 : int avg qry=1.33 : cum avg qps=521.11 : int avg qps=748.45 : exceptions=0
Thread[Thread-2,5,main]10,630,438 inserts : 390 seconds : cum ips=27,253.90 : int ips=27,231.94 : cum avg qry=1.89 : int avg qry=1.21 : cum avg qps=529.00 : int avg qps=822.50 : exceptions=0
Thread[Thread-2,5,main]10,959,904 inserts : 400 seconds : cum ips=27,395.51 : int ips=32,913.69 : cum avg qry=1.87 : int avg qry=1.36 : cum avg qps=534.10 : int avg qps=729.07 : exceptions=0
Thread[Thread-2,5,main]11,323,256 inserts : 410 seconds : cum ips=27,612.58 : int ips=36,284.40 : cum avg qry=1.84 : int avg qry=1.09 : cum avg qps=543.70 : int avg qps=919.61 : exceptions=0
Thread[Thread-2,5,main]11,658,675 inserts : 420 seconds : cum ips=27,752.74 : int ips=33,491.66 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=530.49 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]11,990,546 inserts : 430 seconds : cum ips=27,884.80 : int ips=33,481.74 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=518.04 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]12,330,182 inserts : 440 seconds : cum ips=28,022.31 : int ips=33,929.67 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=506.04 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]12,663,933 inserts : 450 seconds : cum ips=28,140.64 : int ips=33,341.76 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=494.59 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]13,001,875 inserts : 460 seconds : cum ips=28,262.92 : int ips=33,760.44 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=483.64 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]13,336,393 inserts : 470 seconds : cum ips=28,372.71 : int ips=33,418.38 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=473.17 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]13,676,733 inserts : 480 seconds : cum ips=28,490.11 : int ips=34,003.40 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=463.14 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]14,018,052 inserts : 490 seconds : cum ips=28,604.59 : int ips=34,094.40 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=453.53 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]14,359,875 inserts : 500 seconds : cum ips=28,715.61 : int ips=34,151.56 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=444.31 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]14,699,392 inserts : 510 seconds : cum ips=28,817.65 : int ips=33,914.39 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=435.46 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]15,040,812 inserts : 520 seconds : cum ips=28,919.52 : int ips=34,111.30 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=426.95 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]15,377,435 inserts : 530 seconds : cum ips=29,013.97 : int ips=33,971.44 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=418.85 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]15,723,912 inserts : 540 seconds : cum ips=29,117.65 : int ips=34,606.17 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=410.97 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]16,066,390 inserts : 550 seconds : cum ips=29,210.40 : int ips=34,213.59 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=403.38 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]16,405,927 inserts : 560 seconds : cum ips=29,294.57 : int ips=33,919.78 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=396.07 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]16,749,783 inserts : 570 seconds : cum ips=29,383.37 : int ips=34,351.25 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=389.02 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]17,091,061 inserts : 580 seconds : cum ips=29,464.70 : int ips=34,097.11 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=382.22 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]17,434,325 inserts : 590 seconds : cum ips=29,546.60 : int ips=34,292.11 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=375.65 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]17,779,673 inserts : 600 seconds : cum ips=29,629.28 : int ips=34,503.75 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=369.30 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]18,125,615 inserts : 610 seconds : cum ips=29,710.13 : int ips=34,556.19 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=363.16 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]18,466,384 inserts : 620 seconds : cum ips=29,780.12 : int ips=34,046.26 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=357.23 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]18,804,717 inserts : 630 seconds : cum ips=29,848.66 : int ips=34,137.12 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=351.54 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]19,147,197 inserts : 640 seconds : cum ips=29,916.89 : int ips=34,210.37 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=345.97 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]19,488,574 inserts : 650 seconds : cum ips=29,981.36 : int ips=34,103.60 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=340.58 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]19,832,846 inserts : 660 seconds : cum ips=30,048.17 : int ips=34,385.94 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=335.35 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]20,178,837 inserts : 670 seconds : cum ips=30,115.69 : int ips=34,567.99 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=330.28 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]20,524,728 inserts : 680 seconds : cum ips=30,181.07 : int ips=34,558.00 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=325.37 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]20,866,496 inserts : 690 seconds : cum ips=30,238.63 : int ips=34,149.48 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=320.59 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]21,206,852 inserts : 700 seconds : cum ips=30,292.43 : int ips=34,001.60 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=315.96 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]21,550,101 inserts : 710 seconds : cum ips=30,348.75 : int ips=34,287.18 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=311.45 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]21,884,651 inserts : 720 seconds : cum ips=30,391.51 : int ips=33,424.92 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=307.08 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]22,220,905 inserts : 730 seconds : cum ips=30,439.55 : int ips=33,930.78 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=302.86 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]22,554,870 inserts : 740 seconds : cum ips=30,479.02 : int ips=33,356.47 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=298.72 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]22,894,923 inserts : 750 seconds : cum ips=30,525.67 : int ips=33,974.72 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=294.70 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]23,233,159 inserts : 760 seconds : cum ips=30,568.70 : int ips=33,793.19 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=290.77 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]23,571,247 inserts : 770 seconds : cum ips=30,610.38 : int ips=33,775.02 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=286.96 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]23,907,336 inserts : 780 seconds : cum ips=30,648.35 : int ips=33,568.62 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=283.24 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]24,245,431 inserts : 790 seconds : cum ips=30,688.01 : int ips=33,779.10 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=279.61 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]24,584,620 inserts : 800 seconds : cum ips=30,728.05 : int ips=33,888.40 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=276.08 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]24,949,241 inserts : 810 seconds : cum ips=30,798.45 : int ips=36,425.67 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=272.64 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]25,306,381 inserts : 820 seconds : cum ips=30,858.05 : int ips=35,681.89 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=269.28 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]25,671,578 inserts : 830 seconds : cum ips=30,929.57 : int ips=36,847.64 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=266.03 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]26,039,339 inserts : 840 seconds : cum ips=30,998.77 : int ips=36,735.69 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=262.83 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]26,403,993 inserts : 850 seconds : cum ips=31,062.75 : int ips=36,432.61 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=259.71 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]26,765,384 inserts : 860 seconds : cum ips=31,121.42 : int ips=36,103.00 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=256.66 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]27,130,136 inserts : 870 seconds : cum ips=31,182.59 : int ips=36,438.76 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=253.68 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]27,485,462 inserts : 880 seconds : cum ips=31,231.71 : int ips=35,500.65 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=250.77 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]27,848,494 inserts : 890 seconds : cum ips=31,288.37 : int ips=36,270.56 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=247.92 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]28,212,501 inserts : 900 seconds : cum ips=31,344.75 : int ips=36,357.07 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=245.14 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]28,581,093 inserts : 910 seconds : cum ips=31,405.00 : int ips=36,822.38 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=242.42 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]28,944,839 inserts : 920 seconds : cum ips=31,458.64 : int ips=36,334.63 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=239.76 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]29,303,894 inserts : 930 seconds : cum ips=31,509.50 : int ips=36,231.58 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=237.18 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]29,670,143 inserts : 940 seconds : cum ips=31,563.61 : int ips=36,591.97 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=234.64 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]30,037,271 inserts : 950 seconds : cum ips=31,617.45 : int ips=36,672.46 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=232.15 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]30,398,023 inserts : 960 seconds : cum ips=31,663.52 : int ips=36,035.56 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=229.70 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]30,765,447 inserts : 970 seconds : cum ips=31,715.55 : int ips=36,705.69 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=227.32 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]31,127,250 inserts : 980 seconds : cum ips=31,760.81 : int ips=36,147.77 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=224.98 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]31,490,285 inserts : 990 seconds : cum ips=31,806.41 : int ips=36,270.86 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=222.68 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]31,850,298 inserts : 1,000 seconds : cum ips=31,848.07 : int ips=35,968.93 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=220.44 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]32,213,027 inserts : 1,010 seconds : cum ips=31,891.56 : int ips=36,236.66 : cum avg qry=1.84 : int avg qry=0.00 : cum avg qps=218.24 : int avg qps=0.00 : exceptions=0
Thread[Thread-13,5,main]Query thread 0 : EXCEPTION
com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database iibench
        at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:300)
        at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:271)
        at com.mongodb.DBCollectionImpl.find(DBCollectionImpl.java:84)
        at com.mongodb.DBCollectionImpl.find(DBCollectionImpl.java:66)
        at com.mongodb.DBCursor._check(DBCursor.java:458)
        at com.mongodb.DBCursor._hasNext(DBCursor.java:546)
        at com.mongodb.DBCursor.hasNext(DBCursor.java:571)
        at jmongoiibench$MyQuery.run(jmongoiibench.java:582)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:150)
        at java.net.SocketInputStream.read(SocketInputStream.java:121)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at org.bson.io.Bits.readFully(Bits.java:48)
        at org.bson.io.Bits.readFully(Bits.java:35)
        at org.bson.io.Bits.readFully(Bits.java:30)
        at com.mongodb.Response.<init>(Response.java:42)
        at com.mongodb.DBPort$1.execute(DBPort.java:141)

And now queries and inserts have stopped

Thread[Thread-2,5,main]34,564,066 inserts : 1,080 seconds : cum ips=32,002.34 : int ips=32,825.77 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=204.00 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]34,894,819 inserts : 1,090 seconds : cum ips=32,011.92 : int ips=33,045.56 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=202.11 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]35,228,018 inserts : 1,100 seconds : cum ips=32,023.49 : int ips=33,283.29 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=200.26 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]35,555,308 inserts : 1,110 seconds : cum ips=32,029.59 : int ips=32,699.57 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=198.44 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]35,884,490 inserts : 1,120 seconds : cum ips=32,037.21 : int ips=32,882.03 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=196.65 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,103,365 inserts : 1,130 seconds : cum ips=31,947.14 : int ips=21,867.82 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=194.90 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,238,358 inserts : 1,140 seconds : cum ips=31,787.87 : int ips=13,623.27 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=193.19 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,395 inserts : 1,150 seconds : cum ips=31,566.05 : int ips=6,298.66 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=191.50 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,395 inserts : 1,160 seconds : cum ips=31,293.71 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=189.84 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,395 inserts : 1,170 seconds : cum ips=31,026.04 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=188.20 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,395 inserts : 1,180 seconds : cum ips=30,762.90 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=186.60 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,396 inserts : 1,190 seconds : cum ips=30,504.20 : int ips=0.10 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=185.02 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,200 seconds : cum ips=30,249.80 : int ips=0.10 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=183.46 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,210 seconds : cum ips=29,999.62 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=181.94 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,220 seconds : cum ips=29,753.54 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=180.44 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,230 seconds : cum ips=29,511.48 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=178.96 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,240 seconds : cum ips=29,273.31 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=177.50 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,250 seconds : cum ips=29,038.93 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=176.07 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,260 seconds : cum ips=28,810.61 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=174.68 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,270 seconds : cum ips=28,583.57 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=173.30 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,280 seconds : cum ips=28,360.09 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=171.93 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,290 seconds : cum ips=28,140.07 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=170.59 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,300 seconds : cum ips=27,923.42 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=169.27 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,310 seconds : cum ips=27,710.10 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=167.97 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,320 seconds : cum ips=27,500.02 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=166.69 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,330 seconds : cum ips=27,293.07 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=165.43 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,340 seconds : cum ips=27,089.26 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=164.18 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,350 seconds : cum ips=26,888.45 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=162.96 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,360 seconds : cum ips=26,690.59 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=161.75 : int avg qps=0.00 : exceptions=0
Thread[Thread-2,5,main]36,301,397 inserts : 1,370 seconds : cum ips=26,495.63 : int ips=0.00 : cum avg qry=4.58 : int avg qry=0.00 : cum avg qps=160.57 : int avg qps=0.00 : exceptions=0

And thread stacks again

      8 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerIndexStandard::_insert,mongo::WiredTigerIndex::insert,mongo::BtreeBasedAccessMethod::insert,mongo::IndexCatalog::_indexRecord,mongo::IndexCatalog::indexRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_insert,__curfile_insert,mongo::WiredTigerRecordStore::insertRecord,mongo::Collection::_insertDocument,mongo::Collection::insertDocument,singleInsert,insertOne,mongo::WriteBatchExecutor::execOneInsert,mongo::WriteBatchExecutor::execInserts,mongo::WriteBatchExecutor::bulkExecute,mongo::WriteBatchExecutor::executeBatch,mongo::WriteCmd::run,mongo::_execCommand,mongo::Command::execCommand,mongo::_runCommands,runCommands,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      2 __evict_has_work,__evict_worker,start_thread,clone
      1 __wt_row_modify,__split_multi_inmem,__wt_split_rewrite,__evict_page_dirty_update,__wt_evict,__wt_evict_page,__wt_page_release_evict,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 sigwait,mongo::(anonymous,boost::(anonymous,start_thread,clone
      1 select,__wt_sleep,__wt_page_in_func,__wt_page_swap_func,__wt_row_search,__cursor_row_search,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::dataFor,mongo::Collection::docFor,mongo::FetchStage::work,mongo::ProjectionStage::work,mongo::MultiPlanStage::workAllPlans,mongo::MultiPlanStage::pickBestPlan,mongo::SubplanStage::choosePlanForSubqueries,mongo::SubplanStage::pickBestPlan,mongo::PlanExecutor::pickBestPlan,mongo::PlanExecutor::make,mongo::PlanExecutor::make,mongo::getExecutor,mongo::runQuery,receivedQuery,mongo::assembleResponse,mongo::MyMessageHandler::process,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 select,__wt_sleep,__evict_pass,__evict_server,start_thread,clone
      1 select,mongo::Listener::initAndListen,_initAndListen,mongo::initAndListen,mongoDbMain,main
      1 recv,mongo::Socket::_recv,mongo::Socket::unsafe_recv,mongo::Socket::recv,mongo::MessagingPort::recv,mongo::PortMessageServer::handleIncomingMsg,start_thread,clone
      1 pthread_cond_wait@@GLIBC_2.3.2,wait<boost::unique_lock<boost::timed_mutex>,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__wt_cache_wait,__wt_cache_full_check,__cursor_enter,__curfile_enter,__cursor_func_init,__wt_btcur_search,__curfile_search,mongo::WiredTigerRecordStore::findRecord,mongo::KVCatalog::_findEntry,mongo::KVCatalog::getMetaData,mongo::KVCollectionCatalogEntry::_getMetaData,mongo::BSONCollectionCatalogEntry::getAllIndexes,mongo::TTLMonitor::getTTLIndexesForDB,mongo::TTLMonitor::doTTLPass,mongo::TTLMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__sweep_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,__log_close_server,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::RangeDeleter::doWork,boost::(anonymous,start_thread,clone
      1 pthread_cond_timedwait@@GLIBC_2.3.2,boost::condition_variable_any::timed_wait<boost::unique_lock<boost::timed_mutex>,timed_wait<boost::unique_lock<boost::timed_mutex>,,timed_wait<boost::unique_lock<boost::timed_mutex>,,mongo::(anonymous,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 nanosleep,mongo::sleepsecs,mongo::ClientCursorMonitor::run,mongo::BackgroundJob::jobBody,boost::(anonymous,start_thread,clone
      1 fill_window,deflate_slow,deflate,zlib_compress_raw,__rec_split_raw_worker,__rec_split_raw,__rec_row_leaf_insert,__rec_row_leaf,__wt_reconcile,__sync_file,__wt_cache_op,__checkpoint_write_leaves,__checkpoint_apply,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
      1 __evict_lru_pages,__evict_worker,start_thread,clone

Comment by Mark Callaghan [ 18/Feb/15 ]

And eventually inserts return but not queries. I don't think the iibench-mongo client is reconnecting for the query thread. But from the thread stacks posted previously, the writers were definitely stalled in WiredTiger code

Thread[Thread-2,5,main]51,016,813 inserts : 4,140 seconds : cum ips=12,322.79 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=137.83 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,016,813 inserts : 4,150 seconds : cum ips=12,293.08 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=137.50 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,016,813 inserts : 4,160 seconds : cum ips=12,263.51 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=137.16 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,016,813 inserts : 4,170 seconds : cum ips=12,234.07 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=136.83 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,016,813 inserts : 4,180 seconds : cum ips=12,204.79 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=136.51 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,016,813 inserts : 4,190 seconds : cum ips=12,175.64 : int ips=0.00 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=136.18 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,074,762 inserts : 4,200 seconds : cum ips=12,160.42 : int ips=5,790.27 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=135.85 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,426,513 inserts : 4,210 seconds : cum ips=12,215.06 : int ips=35,136.45 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=135.53 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]51,680,965 inserts : 4,220 seconds : cum ips=12,246.66 : int ips=25,671.11 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=135.21 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]52,001,587 inserts : 4,230 seconds : cum ips=12,293.48 : int ips=32,030.17 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=134.89 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]52,364,419 inserts : 4,240 seconds : cum ips=12,350.03 : int ips=36,243.33 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=134.57 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]52,632,717 inserts : 4,250 seconds : cum ips=12,384.07 : int ips=26,805.67 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=134.25 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]52,957,736 inserts : 4,260 seconds : cum ips=12,431.27 : int ips=32,469.43 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=133.94 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]53,323,767 inserts : 4,270 seconds : cum ips=12,487.84 : int ips=36,559.23 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=133.62 : int avg qps=0.00 : exceptions=30
Thread[Thread-2,5,main]53,669,890 inserts : 4,280 seconds : cum ips=12,539.49 : int ips=34,570.82 : cum avg qry=2.11 : int avg qry=0.00 : cum avg qps=133.31 : int avg qps=0.00 : exceptions=30

Comment by Mark Callaghan [ 18/Feb/15 ]

Grep for TODO_SET_JAVA_PATH in run.simple.bash and edit it.

mongod command line is:
numactl --interleave=all bin/mongod --config /foo/mongo.conf --storageEngine wiredTiger --wiredTigerCacheSizeGB 70

Stalls happen after 50M docs inserted which is ~1800 seconds. Database was empty at test start.

o.s.1 has output from db.serverStatus() after test was running for ~3000 seconds. That command hung, but fortunately printed output first.

The server in 3.0 rc8 linked with jemalloc

Generated at Thu Feb 08 03:44:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.