[SERVER-22879] Bench.cpp should fail and report error on connection failure Created: 26/Feb/16  Updated: 06/Dec/22  Resolved: 05/Nov/21

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Daly Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Won't Fix Votes: 0
Labels: tig-benchrun
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Assigned Teams:
Server Tooling & Methods
Operating System: ALL
Participants:
Linked BF Score: 0

 Description   

Bench.cpp uses BenchRunWorkerStateGuard to synchronize worker threads in benchRun after they establish connections to the mongod. If a connection attempt fails, the worker throws before it constructs the guard, so it is never counted as started and the remaining workers wait forever for it.

The exception handling code should decrement the counter used by BenchRunWorkerStateGuard so the other threads don't hang, and should signal a failure.
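
A minimal, self-contained sketch of the behavior requested above. It does not use the real BenchRunState or BenchRunWorkerStateGuard; StartBarrier, onWorkerFailedToStart, and runWorkers are hypothetical names introduced only for illustration. It assumes the hang comes from a shared count of unstarted workers that must reach zero before anyone proceeds.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared benchRun state; not the real BenchRunState.
class StartBarrier {
public:
    explicit StartBarrier(unsigned numWorkers) : _unstarted(numWorkers) {}

    // Called by a worker that connected successfully.
    void onWorkerStarted() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_unstarted > 0);
        --_unstarted;
        _cv.notify_all();
    }

    // Called by a worker that failed before it could start.
    void onWorkerFailedToStart() {
        std::lock_guard<std::mutex> lk(_mutex);
        assert(_unstarted > 0);
        --_unstarted;
        ++_failed;
        _cv.notify_all();
    }

    // Blocks until every worker has started or failed; returns the failure count.
    unsigned waitForAllWorkers() {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait(lk, [this] { return _unstarted == 0; });
        return _failed;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    unsigned _unstarted;
    unsigned _failed = 0;
};

// Runs numWorkers threads; the first numFailing simulate a connection failure.
unsigned runWorkers(unsigned numWorkers, unsigned numFailing) {
    StartBarrier barrier(numWorkers);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < numWorkers; ++i) {
        threads.emplace_back([&barrier, i, numFailing] {
            try {
                if (i < numFailing) {
                    throw std::runtime_error("simulated connection failure");
                }
                barrier.onWorkerStarted();
                // A real worker would generate load after this wait returns.
                barrier.waitForAllWorkers();
            } catch (const std::exception&) {
                // Without this call the count never reaches zero and waiters hang.
                barrier.onWorkerFailedToStart();
            }
        });
    }
    unsigned failed = barrier.waitForAllWorkers();
    for (auto& t : threads) {
        t.join();
    }
    return failed;
}
```

Calling runWorkers(4, 1) returns once all four threads have either started or failed, and reports one failure. In this sketch the failure path can only run before the worker is counted as started, so each worker decrements the counter exactly once.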



 Comments   
Comment by Chibuikem Amaechi [ 13/Jan/18 ]

Hi Everyone,

My proposed change is to invoke BenchRunState::onWorkerFinished() in each catch block of BenchRunWorker::run(). This would effectively remove the failed thread from the count and thus prevent the other worker threads from hanging.

Example:

mongo/src/mongo/shell/bench.cpp

void BenchRunWorker::run() {
    try {
        std::unique_ptr<DBClientBase> conn(_config->createConnection());
        if (!_config->username.empty()) {
            string errmsg;
            if (!conn->auth("admin", _config->username, _config->password, errmsg)) {
                uasserted(15932, "Authenticating to connection for benchThread failed: " + errmsg);
            }
        }
        BenchRunWorkerStateGuard _workerStateGuard(_brState);
        generateLoadOnConnection(conn.get());
    } catch (DBException& e) {
        _brState.onWorkerFinished();
        error() << "DBException not handled in benchRun thread" << causedBy(e) << endl;
    } catch (std::exception& e) {
        _brState.onWorkerFinished();
        error() << "std::exception not handled in benchRun thread" << causedBy(e) << endl;
    } catch (...) {
        _brState.onWorkerFinished();
        error() << "Unknown exception not handled in benchRun thread." << endl;
    }
}

I'm not sure how best to signal the thread failure.

Please share your thoughts.
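
On the open question of signaling the failure, one possible approach (a sketch only, not taken from bench.cpp) is for each worker to record its exception in shared state and for the thread that joins the workers to rethrow it. FirstError and runAndReport are hypothetical names; the sketch relies only on the standard std::exception_ptr facilities.

```cpp
#include <cassert>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: remembers the first exception thrown by any worker.
class FirstError {
public:
    void capture(std::exception_ptr e) {
        assert(e);  // callers pass std::current_exception() from a catch block
        std::lock_guard<std::mutex> lk(_mutex);
        if (!_error) {
            _error = e;
        }
    }

    // Rethrows the stored exception on the calling thread, if there is one.
    void rethrowIfSet() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_error) {
            std::rethrow_exception(_error);
        }
    }

private:
    std::mutex _mutex;
    std::exception_ptr _error;
};

// Returns "" on success, or the message of the first worker failure.
std::string runAndReport(unsigned numWorkers, unsigned failingIndex) {
    FirstError firstError;
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < numWorkers; ++i) {
        threads.emplace_back([&firstError, i, failingIndex] {
            try {
                if (i == failingIndex) {
                    throw std::runtime_error("simulated connection failure");
                }
            } catch (...) {
                // Hand the exception to the coordinating thread instead of only logging it.
                firstError.capture(std::current_exception());
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    try {
        firstError.rethrowIfSet();
    } catch (const std::exception& e) {
        return e.what();
    }
    return "";
}
```

With this shape the caller that started the workers sees the failure directly, rather than having to find it in the log output.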
