[SERVER-33800] Hang opening 64K cursors on a single table Created: 10/Mar/18  Updated: 20/Jun/18  Resolved: 26/Mar/18

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.4.9, 3.4.13, 3.6.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Michael Cahill (Inactive)
Resolution: Duplicate Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File stacks.txt    
Issue Links:
Duplicate
duplicates WT-3972 Allow more than 64K cursors to be ope... Closed
Operating System: ALL

 Description   

The following script runs queries on 10 threads and quickly hangs after printing "starting queries":

db=/ssd/db
uri="mongodb://localhost:27017/test?replicaSet=rs"
 
function clean {
    killall -9 -w mongod
    rm -rf $db
    mkdir -p $db/{r0,r1,r2}
}
 
function start {
    killall -w mongod
    opts=""
    mongod --dbpath $db/r0 --port 27017 --logpath $db/r0.log --logappend --replSet rs --fork $opts
    mongod --dbpath $db/r1 --port 27117 --logpath $db/r1.log --logappend --replSet rs --fork $opts
    mongod --dbpath $db/r2 --port 27217 --logpath $db/r2.log --logappend --replSet rs --fork $opts
}
 
function initrs {
    mongo --quiet --eval 'rs.initiate({_id: "rs", members: [{_id: 0, host: "127.0.0.1:27017"}]})'
    mongo --quiet --eval 'rs.add("127.0.0.1:27117")'
    mongo --quiet --eval 'rs.add("127.0.0.1:27217")'
}
 
function waitrs {
    mongo $uri --quiet --eval "db.version()"
}
 
function setup {
    mongo $uri --eval '
        db.c.drop()
        db.c.createIndex({_id: 1, x: 1})
        for (var i = 0; i < 10000; i++)
            db.c.insert({_id: i, x: i})
    '
}
 
function run {
    threads=10
    for i in $(seq $threads); do
        mongo $uri --eval '
            terms = []
            for (var i = 0; i < 10000; i++)
                terms.push({_id: i, x: i})
            print("starting queries")
            for (var i = 0; ; i++) {
                count = db.c.aggregate([{$match: {$or: terms}}]).itcount()
                print(new Date().toISOString(), i, count)
            }
        ' &
    done
    wait
}
 
clean; start; initrs; waitrs
setup
run

Stack traces attached. It appears that all threads inside WiredTiger are either idle or hung in __wt_readlock:

  10 __wt_cond_wait_signal,__wt_readlock,__wt_session_lock_dhandle,__wt_session_get_btree,__wt_session_get_btree_ckpt,__wt_curfile_open,__session_open_cursor_int,__wt_curtable_open,__session_open_cursor_int,__session_open_cursor,mongo::WiredTigerSession::getCursor(std::__cxx11::basic_string<char,,mongo::WiredTigerCursor::WiredTigerCursor(std::__cxx11::basic_string<char,,mongo::WiredTigerRecordStore::getCursor(mongo::OperationContext*,,mongo::Collection::getCursor(mongo::OperationContext*,,mongo::FetchStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::OrStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::DocumentSourceCursor::loadBatch(),mongo::DocumentSourceCursor::getNext(),mongo::Pipeline::getNext(),mongo::PipelineProxyStage::getNextBson(),mongo::PipelineProxyStage::doWork(unsigned,mongo::PlanStage::work(unsigned,mongo::PlanExecutor::getNextImpl(mongo::Snapshotted<mongo::BSONObj>*,,mongo::PlanExecutor::getNext(mongo::BSONObj*,,mongo::GetMoreCmd::generateBatch(mongo::ClientCursor*,,mongo::GetMoreCmd::runParsed(mongo::OperationContext*,,mongo::GetMoreCmd::run(mongo::OperationContext*,,mongo::Command::run(mongo::OperationContext*,,mongo::Command::execCommand(mongo::OperationContext*,,mongo::runCommands(mongo::OperationContext*,,mongo::assembleResponse(mongo::OperationContext*,,mongo::ServiceEntryPointMongod::_sessionLoop(std::shared_ptr<mongo::transport::Session>,std::_Function_handler<void,mongo::(anonymous,start_thread,clone
   4 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__wt_cond_auto_wait,__wt_evict_thread_run,__wt_thread_run,start_thread,clone
   1 __wt_cond_wait_signal,__wt_readlock,__wt_session_lock_dhandle,__wt_session_get_btree,__conn_btree_apply_internal,__wt_conn_btree_apply,__checkpoint_apply_all,__checkpoint_prepare,__txn_checkpoint_wrapper,__wt_txn_checkpoint,__session_checkpoint,__ckpt_server,start_thread,clone
   1 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__wt_cond_auto_wait,__log_wrlsn_server,start_thread,clone
   1 __wt_cond_wait_signal,__wt_cond_auto_wait_signal,__log_server,start_thread,clone
   1 __wt_cond_wait_signal,__sweep_server,start_thread,clone
   1 __wt_cond_wait_signal,__log_file_server,start_thread,clone
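
For reference, a collapsed view like the one above can be produced by dumping all thread stacks with gdb and counting the unique call chains. A rough sketch (not necessarily how the attached stacks.txt was generated):

gdb -p "$(pgrep -x mongod | head -1)" -batch -ex 'thread apply all bt' 2>/dev/null |
awk '/^Thread/ { if (s) print s; s = "" }      # new thread: flush the previous chain
     /^#/      { f = ($3 == "in") ? $4 : $2    # frame line: grab the function name
                 s = s (s ? "," : "") f }      # append, comma-separated, innermost first
     END       { if (s) print s }' |
sort | uniq -c | sort -rn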



 Comments   
Comment by Michael Cahill (Inactive) [ 26/Mar/18 ]

The underlying issue has been fixed in WT-3972. There are no further changes to MongoDB currently under consideration for this issue.

Comment by Michael Cahill (Inactive) [ 11/Mar/18 ]

We can fix this specific case in WT-3972. I'm going to leave this ticket open for now because we don't have test coverage of workloads with more than 64K cursors on a single collection / index. Query plans with this behavior are likely to be suboptimal even if they don't hang.

Comment by Keith Bostic (Inactive) [ 11/Mar/18 ]

What's happening here is that this workload opens 64K cursors on a single data handle; __wt_readlock then blocks when attempting to open more, waiting for the WT_RWLOCK.readers_active field to drain. This means a single thread can block itself, or multiple threads can get into an undetected deadlock.
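
If that's right, a single connection ought to be able to block itself with the setup above by issuing one query whose $or has more than 64K terms. An untested sketch (it assumes the plan holds one index cursor open per $or term, and 70000 is just a value comfortably above 64K):

mongo $uri --eval '
    // build a $or with more terms than the 64K limit
    terms = []
    for (var i = 0; i < 70000; i++)
        terms.push({_id: i, x: i})
    print("starting single query")
    // expected to hang in __wt_readlock rather than print a count
    print(db.c.aggregate([{$match: {$or: terms}}]).itcount())
'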

Comment by Bruce Lucas (Inactive) [ 11/Mar/18 ]

Tried some experiments varying:

  • number of threads: the more threads, the more quickly and the more reliably it gets stuck, suggesting some kind of race or contention is involved
  • find vs aggregate: aggregate seemed more likely to get stuck; possibly a difference in yield frequency or in the number of cursors opened between the two?

In the cases that got stuck, ftdc data shows the number of open WT cursors rose quickly to just over 64K, at which point the workload hung. When it didn't get stuck, the number of open WT cursors rose to something below 64K and then varied up and down (e.g. between ~30K and ~45K) with a period of a few seconds. Why are we opening so many WT cursors given that we cache cursors? Is there some table or counter that overflows at 64K?
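
For anyone re-running this, the counter ftdc samples can also be watched live from the shell while the repro runs (a small sketch, assuming the wiredTiger.session section of serverStatus, which is what ftdc collects):

mongo $uri --quiet --eval '
    // poll the WT-level open cursor / session counts once a second
    while (true) {
        var s = db.serverStatus().wiredTiger.session
        print(new Date().toISOString(),
            "open cursors:", s["open cursor count"],
            "open sessions:", s["open session count"])
        sleep(1000)
    }
'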
