Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 3.3.1
Affects Version/s: 3.1.9
Component/s: Shell, Testing Infrastructure
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Build C (11/20/15), Build D (12/11/15), Build E (01/08/16), Build F (01/29/16)
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

_stopMongoProgram() returns the exit code of the process, and MongoRunner.stopMongod() and MongoRunner.stopMongos() propagate that value to their callers. However, neither the tests nor the testing infrastructure actually check it. It would be nice to assert that the exit code is zero in situations where the MongoDB processes aren't expected to crash. Doing so is complicated by the fact that the shell sends the process a SIGKILL if it takes longer than a minute to shut down, in which case a non-zero exit code reflects the forced kill rather than a crash.

Tests that start their own MongoDB deployments currently do not fail when LeakSanitizer reports that there were memory leaks.

int killDb(int port, ProcessId _pid, int signal, const BSONObj& opt) {
    ProcessId pid;
    int exitCode = 0;
    if (port > 0) {
        if (!registry.isPortRegistered(port)) {
            log() << "No db started on port: " << port << endl;
            return 0;
        }
        pid = registry.pidForPort(port);
    } else {
        pid = _pid;
    }

    kill_wrapper(pid, signal, port, opt);

    int i = 0;
    for (; i < 130; ++i) {
        if (i == 60) {
            log() << "process on port " << port << ", with pid " << pid
                  << " not terminated, sending sigkill" << endl;
            kill_wrapper(pid, SIGKILL, port, opt);
        }
        if (wait_for_pid(pid, false, &exitCode))
            break;
        sleepmillis(1000);
    }
    if (i == 130) {
        log() << "failed to terminate process on port " << port << ", with pid " << pid << endl;
        verify("Failed to terminate process" == 0);
    }

    registry.deleteProgram(pid);
    // FIXME I think the intention here is to do an extra sleep only when SIGKILL is sent to the
    // child process. We may want to change the 4 below to 29, since values of i greater than that
    // indicate we sent a SIGKILL.
    if (i > 4 || signal == SIGKILL) {
        sleepmillis(4000);  // allow operating system to reclaim resources
    }

    return exitCode;
}
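
To make the distinction above concrete, here is a minimal sketch of the decision a caller could make once it knows whether the shutdown helper escalated to SIGKILL. The `ShutdownResult` struct and `isUnexpectedExit()` function are hypothetical names used only for illustration; they are not part of the shell or of killDb(). The sketch also assumes a POSIX `<signal.h>` that defines SIGKILL.

```cpp
#include <signal.h>  // SIGKILL, SIGTERM (POSIX assumed)

// Hypothetical record of how a shutdown attempt went; not part of the MongoDB shell.
struct ShutdownResult {
    int exitCode;          // value reported by the wait call
    bool escalatedToKill;  // true if the helper sent SIGKILL after the timeout
};

// Returns true when a test harness should treat the shutdown as a failure.
// Assumption: after a forced kill the exit code says nothing about a crash,
// so it is not judged here and would have to be reported some other way.
bool isUnexpectedExit(const ShutdownResult& result, int requestedSignal) {
    if (result.escalatedToKill || requestedSignal == SIGKILL) {
        return false;  // non-zero exit is expected after SIGKILL
    }
    return result.exitCode != 0;  // clean shutdown must exit with zero
}
```

With something along these lines, a non-zero exit code (for example, from a sanitizer that reports leaks at process exit) would fail the test only when the process was given the chance to shut down on its own.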

is duplicated by: SERVER-4959 Check exit code when killing servers in tests (Closed)

Assignee: Jonathan Reams
Reporter: Max Hirschhorn
Participants: Githook User, Jonathan Reams, Max Hirschhorn
Votes: 0
Watchers: 6

Created: Oct 08 2015 11:24:03 PM UTC
Updated: Jan 12 2017 02:53:14 AM UTC
Resolved: Jan 19 2016 10:02:21 PM UTC
Confidence Status Last Update: 01/Dec/15 7:25 PM
