[SERVER-50282] Provide a debugging setup script for spawnhosts that load artifacts with coredumps Created: 19/May/20  Updated: 29/Oct/23  Resolved: 20/Aug/20

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.7.0

Type: Improvement Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Daniel Gottlieb (Inactive)
Resolution: Fixed Votes: 0
Labels: cli
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-50298 Remove spawnhost related debugging se... Backlog
Related
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2020-08-24
Participants:
Linked BF Score: 6
Story Points: 0

 Description   

Overlong filenames truncating important properties turned out to be a bug. This ticket has been repurposed to provide a script that unpacks the files needed to inspect a coredump on a spawnhost. It assumes the bug will eventually be fixed (the bug only affects a subset of cases).

Original Description
The only time I spawn a host with data files from a test failure is when there's an available core dump that I want to load in GDB. I have a script that programmatically unpacks everything into the appropriate directory. Whether or not server engineers use a script to set up their gdb usage, I believe spawning a host to investigate a core dump is a common use-case.
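
For illustration only, the kind of setup described above could be sketched as follows. This is not the script referred to in this ticket; the function names, and the assumption that every artifact is a trusted tar archive, are hypothetical.

```python
import tarfile
from pathlib import Path


def unpack_archives(archive_paths, dest_dir):
    """Extract every archive into dest_dir and return dest_dir as a Path.

    Assumption: the archives are trusted task artifacts, so no member
    filtering is applied before extraction.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for archive_path in archive_paths:
        # "r:*" lets tarfile detect the compression (gzip for .tgz files).
        with tarfile.open(archive_path, mode="r:*") as archive:
            archive.extractall(dest)
    return dest


def gdb_command(binary_path, core_path):
    """Build the argument list for `gdb <program> <core>`."""
    return ["gdb", str(binary_path), str(core_path)]
```

The command list is returned rather than executed so that a caller can inspect it or pass it to subprocess.run once the binary, debug symbols, and core file are in place.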

Unfortunately, when filenames are long, important properties such as the keyword coredump can be trimmed[1]. What makes this difficult is that it not only breaks my script (acceptable, since this sort of scripting isn't supported or built on some established agreement), but it also breaks my ability to do the equivalent work by hand.

Doing a tar -tf <archive> AFAIK is a complete filescan. At that point it's faster to just download the coredumps by hand. This arguably defeats the purpose of spawning a host with artifacts loaded.
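
The cost comes from the format: a tar archive has no central index, so member names can only be found by reading headers in order, and for a .tgz that means decompressing the stream up to each header. A minimal sketch of that scan in Python follows. The function name and the default keyword are assumptions for illustration, not part of any existing tool.

```python
import tarfile


def archive_has_coredump(archive_path, keyword="core"):
    """Return True as soon as a member name contains the keyword.

    The archive is opened in stream mode ("r|gz"), so it is read front to
    back exactly once. An archive with no matching member is read in full.
    """
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            if keyword in member.name:
                return True
    return False
```

The early return only helps when a match sits near the front of the archive. In the worst case this does the same work as tar -tf.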

I don't know what a feasible solution here is. There's probably a reason why filenames are long (for uniqueness? though IMO, unreadable). Some ideas:

  • Have evergreen fetch generate shorter filenames that still identify the contents of the archive (at the expense of labeling the variant/task id, which AFAIK only becomes a problem if a user fetches artifacts for multiple tasks into the same directory).
    • If this breaks backwards compatibility for established use-cases, consider adding a flag to fetch, e.g.: evergreen fetch -t <task> --artifacts --shortnames. This would let users who spawn a host and load data opt in to short filenames.
  • Add environment variables containing absolute paths to interesting artifacts for users sshing into the instance. Scripts can hook into these without needing to rely on filename patterns. E.g:
    • BIN_ARCHIVE for the archive containing mongod
    • DBG_ARCHIVE for the archive containing mongod.debug
    • COREDUMP_ARCHIVE for the archive containing all coredumps
    • SRC_DIR for the mongodb repository path (fetch --sources)
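
A script hooking into such variables might look like the sketch below. The variables are only a proposal in this ticket and are not known to be set on any spawnhost. The helper name and the fallback glob pattern are hypothetical.

```python
import glob
import os


def find_artifact(env_var, fallback_pattern, environ=os.environ):
    """Return the path named by env_var, else the first glob match, else None.

    Assumption: env_var is one of the proposed names (e.g. COREDUMP_ARCHIVE).
    The fallback pattern is the fragile filename matching that the proposal
    is meant to replace.
    """
    path = environ.get(env_var)
    if path:
        return path
    matches = sorted(glob.glob(fallback_pattern))
    return matches[0] if matches else None
```

For example, find_artifact("COREDUMP_ARCHIVE", "/data/mci/artifacts-*/*coredump*.tgz") would prefer the variable and fall back to pattern matching only when it is unset.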

[1]

[root@ip-10-122-8-102 me]# ll /data/mci/artifacts-patch-1419_linux-64-debug_*
/data/mci/artifacts-patch-1419_linux-64-debug_compile:
total 2503040
-rw-r--r-- 1 root root     136935 May 19 01:45 config-mongodb_mongo_v4.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14.log
-rw-r--r-- 1 root root 2473743732 May 19 01:46 debugsymbols-mongodb_mongo_v4.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14.tgz
-rw-r--r-- 1 root root   84980170 May 19 01:45 mongo-mongodb_mongo_v4.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14.tgz
-rw-r--r-- 1 root root    3536789 May 19 01:45 mongodb_mongo_v4.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14.tgz
-rw-r--r-- 1 root root       1097 May 19 01:45 pip-requirements-mongodb_mongo_v4.4_linux_64_debug_compile_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.txt
-rw-r--r-- 1 root root     699562 May 19 01:45 scons-cache-mongodb_mongo_v4.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.log
/data/mci/artifacts-patch-1419_linux-64-debug_jsCore:
total 120
-rw-r--r-- 1 root root 80088 May 19 01:45 Running-Tests-from-Evergreen-Tasks-Locally
-rw-r--r-- 1 root root  1446 May 19 01:45 mongo-diskstats-mongodb_mongo_v4.4_linux_64_debug_jsCore_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.tgz
-rw-r--r-- 1 root root 29980 May 19 01:45 mongo-system-resource-info-mongodb_mongo_v4.4_linux_64_debug_jsCore_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.tgz
-rw-r--r-- 1 root root  1097 May 19 01:45 pip-requirements-mongodb_mongo_v4.4_linux_64_debug_jsCore_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.txt
/data/mci/artifacts-patch-1419_linux-64-debug_retryable_writes_jscore_stepdown_passthrough:
total 1204712
-rw-r--r-- 1 root root      80088 May 19 01:45 Running-Tests-from-Evergreen-Tasks-Locally
-rw-r--r-- 1 root root 1212594087 May 19 01:46 m.4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-retryable_writes_jscore_stepdown_passthrough-0.tgz
-rw-r--r-- 1 root root      10900 May 19 01:45 m.4_linux_64_debug_retryable_writes_jscore_stepdown_passthrough_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.tgz
-rw-r--r-- 1 root root   20560324 May 19 01:45 m_(1).4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-retryable_writes_jscore_stepdown_passthrough-0.tgz
-rw-r--r-- 1 root root     262274 May 19 01:45 m_(1).4_linux_64_debug_retryable_writes_jscore_stepdown_passthrough_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.tgz
-rw-r--r-- 1 root root     101019 May 19 01:45 m_(2).4_linux_64_debug_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-retryable_writes_jscore_stepdown_passthrough-0.tgz
-rw-r--r-- 1 root root       1097 May 19 01:45 p.4_linux_64_debug_retryable_writes_jscore_stepdown_passthrough_patch_1d5d11155689d29bb7de42ccb5a5f4b3c7247469_5ebf0cd932f4170aad0ca35f_20_05_15_21_43_14-0.txt



 Comments   
Comment by Githook User [ 20/Aug/20 ]

Author: Daniel Gottlieb <daniel.gottlieb@mongodb.com> (dgottlieb)

Message: SERVER-50282: Generate debugging setup script for spawnhosts that load artifacts with coredumps.
Branch: master
https://github.com/mongodb/mongo/commit/7ab67b9007176b6d80f0e460803c5ffc0737ae2e

Comment by Daniel Gottlieb (Inactive) [ 12/Aug/20 ]

I've moved this into a server ticket. As part of my investigation, I found the original problem of trimming the beginning of filenames was actually a bug. Given that, I was able to focus this ticket into just providing a script that can set up a spawnhost experience for loading gdb against a coredump. I was able to achieve that with a patch that only touches the server repository.

That said, the usability/discoverability of the patch could be improved, likely requiring help from evergreen, but that can be discussed and ticketed out separately.

Comment by Brooke Miller [ 06/Aug/20 ]

We are timeboxing this investigation for Dan to 5 days.

Comment by Daniel Gottlieb (Inactive) [ 06/Aug/20 ]

I'm going to take this ticket for the next sprint to investigate a different angle where this can be tackled more on the mongodb side. If the investigation proves fruitful, I'll move this ticket into the SERVER project. There might still be some asks from Evergreen (to be filed separately), but I expect they'd be server agnostic.

Comment by Chaya Malik [ 09/Jul/20 ]

Since the file names are defined in the s3_put command in the project YAML and it's not Evergreen generating the file name, it is hard to shorten it in a meaningful way. We can attempt to shorten it on fetch, but it would be hard for us to parse out the unimportant parts in a generic way. Adding environment variables containing absolute paths such as COREDUMP_ARCHIVE would mean adding task-specific information in the Evergreen code base, as opposed to keeping Evergreen general. It seems like there is no ideal solution to this problem.
