[SERVER-17021] Significant WT Performance Regression on "modified" FSM workload Created: 23/Jan/15  Updated: 18/Sep/15  Resolved: 28/Jan/15

Status: Closed
Project: Core Server
Component/s: Concurrency, Storage
Affects Version/s: 2.8.0-rc5
Fix Version/s: 3.0.0-rc7

Type: Bug Priority: Major - P3
Reporter: Daniel Alabi Assignee: Keith Bostic (Inactive)
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File findAndModify_update_grow.js     File fsm_all.js     Text File profileroutput.txt    
Issue Links:
Related
Backwards Compatibility: Fully Compatible
Operating System: Linux
Steps To Reproduce:

The findAndModify_update_grow.js FSM workload spawns several threads. Each thread inserts a single document into a collection and then, using the findAndModify command, attempts to trigger a document move by growing (doubling) the size of the inserted document using the $set and $mul update operators.

The "modified" jstests/concurrency/fsm_workloads/findAndModify_update_grow.js uses 20 threads and 50 steps (iterations = 50) and attempts to quadruple the size of the document.

My experiments were done on a rhel55-test spawnhost (8GB RAM and 8GB disk space).

On mmapv1, running the attached fsm_all.js takes about 2 mins but for WT it kept running for 30+ mins (until I shut it down).

WT invocation:

python buildscripts/smoke.py --nopreallocj --with-cleanbb --mongod ~/mongodb-linux-64/bin/mongod --mongo ~/mongodb-linux-64/bin/mongo --continue-on-failure --storageEngine=wiredTiger --wiredTigerEngineConfig=cache_size=1G concurrency 

Participants:
Linked BF Score: 0

 Description   

A modified version of

jstests/concurrency/fsm_workloads/findAndModify_update_grow.js

(see attachments) seems to be stalling significantly on IO, specifically when waiting to read WT overflow items from disk.

Here is an abridged version of a poor man's profiler output (see: http://poormansprofiler.org/). Full profiler output is attached.

(# times this stack occurred, thread stack)
41 pthread_cond_timedwait@@GLIBC_2.3.2,__wt_cond_wait,??,start_thread,clone
39 __wt_readlock,__wt_ovfl_read,__wt_kv_return,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*), [...]
35 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
30 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::getNext(),mongo::CollectionScan::work(unsigned,mongo::SortStage::work(unsigned,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::CmdFindAndModify::runImpl(mongo::OperationContext*,,mongo::CmdFindAndModify::run(mongo::OperationContext*,[...]
27 __wt_readlock,__wt_ovfl_read,__wt_kv_return,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
23 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
22 __wt_readlock,__wt_ovfl_read,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::getNext(),[...]
10 select,mongo::Listener::initAndListen(),mongo::webServerThread(mongo::AdminAccess,??,start_thread,clone
10 select,mongo::Listener::initAndListen(),mongo::initAndListen(int),main
10 sched_yield,__wt_page_in_func,__wt_row_search,__wt_btcur_search_near,??,mongo::WiredTigerRecordStore::Iterator::_locate(mongo::RecordId,mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
10 recv,mongo::Socket::_recv(char*,,mongo::Socket::unsafe_recv(char*,,mongo::Socket::recv(char*,,mongo::MessagingPort::recv(mongo::Message&),mongo::PortMessageServer::handleIncomingMsg(void*),start_thread,clone
10 pthread_cond_wait@@GLIBC_2.3.2,mongo::DeadlineMonitor<mongo::V8Scope>::deadlineMonitorThread(),??,start_thread,clone
10 pthread_cond_timedwait@@GLIBC_2.3.2,mongo::RangeDeleter::doWork(),??,start_thread,clone
10 pthread_cond_timedwait@@GLIBC_2.3.2,bool,??,mongo::BackgroundJob::jobBody(),??,start_thread,clone
10 nanosleep,mongo::sleepsecs(int),mongo::TTLMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
10 nanosleep,mongo::sleepsecs(int),mongo::SnapshotThread::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
10 nanosleep,mongo::sleepsecs(int),mongo::ClientCursorMonitor::run(),mongo::BackgroundJob::jobBody(),??,start_thread,clone
10 do_sigwait,sigwait,??,??,start_thread,clone
6 sched_yield,__wt_page_in_func,__wt_tree_walk,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
3 __wt_writelock,__wt_ovfl_discard,__wt_ovfl_track_wrapup,??,__wt_reconcile,__wt_cache_op,__wt_txn_checkpoint,??,??,start_thread,clone
2 __wt_writelock,__wt_ovfl_track_wrapup,??,__wt_reconcile,__wt_cache_op,__wt_checkpoint,??,__wt_txn_checkpoint,??,??,start_thread,clone
2 sched_yield,__wt_page_in_func,__wt_tree_walk,__wt_btcur_next,??,mongo::WiredTigerRecordStore::Iterator::_getNext(),mongo::WiredTigerRecordStore::Iterator::restoreState(mongo::OperationContext*),mongo::CollectionScan::restoreState(mongo::OperationContext*),[...]
2 sched_yield,__wt_page_in_func,__wt_row_search,__wt_btcur_insert,??,mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*,,mongo::Collection::_insertDocument(mongo::OperationContext*,,mongo::Collection::insertDocument(mongo::OperationContext*, [...]



 Comments   
Comment by Githook User [ 27/Jan/15 ]

Author:

{u'username': u'alabid', u'name': u'alabid', u'email': u'alabidan@gmail.com'}

Message: SERVER-17021 Blacklist findAndModify_update_grow.js to prevent spurious MCI timeouts

Closes #917

Signed-off-by: Benety Goh <benety@mongodb.com>
(cherry picked from commit 6fdd15e3e68c53d555ddd40402aefd156507b6a8)
Branch: v3.0
https://github.com/mongodb/mongo/commit/ef2ab17fb0c0ab8ed8c3ac6fb963fff80dc21978

Comment by Githook User [ 27/Jan/15 ]

Author:

{u'username': u'alabid', u'name': u'alabid', u'email': u'alabidan@gmail.com'}

Message: SERVER-17021 Blacklist findAndModify_update_grow.js to prevent spurious MCI timeouts

Closes #917

Signed-off-by: Benety Goh <benety@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/6fdd15e3e68c53d555ddd40402aefd156507b6a8

Generated at Thu Feb 08 03:43:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.