Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-17021

Significant WT Performance Regression on "modified" FSM workload

    • Fully Compatible
    • Linux
    • Hide

      The findAndModify_update_grow.js FSM workload spawns several threads. Each thread inserts a single document into a collection and then, using the findAndModify command, attempts to trigger a document move by growing (doubling) the size of the inserted document using the $set and $mul update operators.

      The "modified" jstests/concurrency/fsm_workloads/findAndModify_update_grow.js uses 20 threads and 50 steps (iterations = 50) and attempts to quadruple the size of the document.

      My experiments were done on a rhel55-test spawnhost (8GB RAM and 8GB disk space).

      On mmapv1, running the attached fsm_all.js takes about 2 mins but for WT it kept running for 30+ mins (until I shut it down).

      WT invocation:

      python buildscripts/smoke.py --nopreallocj --with-cleanbb --mongod ~/mongodb-linux-64/bin/mongod --mongo ~/mongodb-linux-64/bin/mongo --continue-on-failure --storageEngine=wiredTiger --wiredTigerEngineConfig=cache_size=1G concurrency 
      
      Show
      The findAndModify_update_grow.js FSM workload spawns several threads. Each thread inserts a single document into a collection and then, using the findAndModify command, attempts to trigger a document move by growing (doubling) the size of the inserted document using the $set and $mul update operators. The "modified" jstests/concurrency/fsm_workloads/findAndModify_update_grow.js uses 20 threads and 50 steps (iterations = 50) and attempts to quadruple the size of the document. My experiments were done on a rhel55-test spawnhost (8GB RAM and 8GB disk space). On mmapv1, running the attached fsm_all.js takes about 2 mins but for WT it kept running for 30+ mins (until I shut it down). WT invocation: python buildscripts/smoke.py --nopreallocj --with-cleanbb --mongod ~/mongodb-linux-64/bin/mongod --mongo ~/mongodb-linux-64/bin/mongo -- continue -on-failure --storageEngine=wiredTiger --wiredTigerEngineConfig=cache_size=1G concurrency
    • 0

      A modified version of

      jstests/concurrency/fsm_workloads/findAndModify_update_grow.js

      (see attachments) seems to be stalling significantly on IO, specifically when waiting to read WT overflow items from disk.

      Here is an abridged version of a poor man's profiler output (see: http://poormansprofiler.org/). Full profiler output is attached.

      (# times this stack occurred, thread stack)
      41 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,??,start_thread,clone
      39 __wt_readlock,__wt_ovfl_read,__wt_kv_return,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*), [...]
      35 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      30 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::getNext(),mongo::CollectionScan::work(unsigned,mongo::SortStage::work(unsigned,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::CmdFindAndModify::runImpl(mongo::OperationContext*,,mongo::CmdFindAndModify::run(mongo::OperationContext*,[...]
      27 __wt_readlock,__wt_ovfl_read,__wt_kv_return,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      23 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      22 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::getNext(),[...]
      10 select,mongo::Listener::initAndListen(),mongo::webServerThread(mongo::AdminAccess,??,start_thread,clone
      10 select,mongo::Listener::initAndListen(),mongo::initAndListen(int),main
      10 sched_yield,__wt_page_in_func,__wt_row_search,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      10 recv,mongo::Socket::_recv(char*,,mongo::Socket::unsafe_recv(char*,,mongo::Socket::recv(char*,,mongo::MessagingPort::recv(mongo::Message&),mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
      10 pthread_cond_wait@@GLIBC_2.3.2,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread(),??,start_thread,clone
      10 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::RangeDeleter::doWork(),??,start_thread,clone
      10 pthread_cond_timedwait@@GLIBC_2.3.2,bool,??,mongo::BackgroundJob::jobBody(),??,start_thread,clone
      10 nanosleep,mongo::sleepsecs(int),mongo::TTLMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
      10 nanosleep,mongo::sleepsecs(int),mongo::SnapshotThread::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
      10 nanosleep,mongo::sleepsecs(int),mongo::ClientCursorMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
      10 do_sigwait,sigwait,??,??,start_thread,clone
      6 sched_yield,__wt_page_in_func,__wt_tree_walk,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      3 __wt_writelock,__wt_ovfl_discard,__wt_ovfl_track_wrapup,??,__wt_reconcile,__wt_cache_op,__wt_txn_checkpoint,??,??,start_thread,clone
      2 __wt_writelock,__wt_ovfl_track_wrapup,??,__wt_reconcile,__wt_cache_op,__wt_checkpoint,??,__wt_txn_checkpoint,??,??,start_thread,clone
      2 sched_yield,__wt_page_in_func,__wt_tree_walk,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
      2 sched_yield,__wt_page_in_func,__wt_row_search,__wt_btcur_insert,??,mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*,,mongo::Collection::_insertDocument(mongo::OperationContext*,,mongo::Collection::insertDocument(mongo::OperationContext*, [...]
      

        1. profileroutput.txt
          11 kB
        2. fsm_all.js
          1 kB
        3. findAndModify_update_grow.js
          4 kB

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            daniel.alabi@10gen.com Daniel Alabi
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: