When a Python test dumps core during PR testing or other Evergreen test runs, the problem would be easier to triage if the Evergreen logs contained a stack trace from the failed job(s).
WT-7656, for example, was generated by a Python test that segfaulted. The only output in the logs was the following generic failure description:
[2021/06/07 06:08:15.948] ======================================================================
[2021/06/07 06:08:15.948] ERROR: test_tiered05.test_tiered05.test_tiered (subunit.RemotedTestCase)
[2021/06/07 06:08:15.948] test_tiered05.test_tiered05.test_tiered
[2021/06/07 06:08:15.948] ----------------------------------------------------------------------
[2021/06/07 06:08:15.948] testtools.testresult.real._StringException: lost connection during test 'test_tiered05.test_tiered05.test_tiered'
[2021/06/07 06:08:15.948] ----------------------------------------------------------------------
Most of the time required to identify and diagnose the failure was spent downloading the artifacts, finding the core file, and getting gdb connected.
When a Python test fails, it seems it would be easy to automatically use the name of the test (e.g., test_tiered05.test_tiered05.test_tiered in the example above) to find the test directory, check there for a core file (i.e., dump_*.core), and extract the stack traces from it. A nice feature of our Python tests is that they rarely have more than a handful of threads, so we could dump all of the stacks. Or, if it is simpler, even just the stack of the active thread would be helpful.
I expect that segfaults are fairly rare among Python test failures, but this would be useful when they do happen.
- is depended on by: WT-8328 Dump and upload backtraces when python and test format fails in evergreen testing (Closed)