[SERVER-52532] Investigate if code during signal handling should use quickExitWithoutLogging Created: 02/Nov/20 Updated: 23/Feb/23 Resolved: 10/Feb/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Kevin Pulo | Assignee: | Billy Donahue |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | re-triaged-ticket, servicearch-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Service Arch
|
||||||||
| Sprint: | Service Arch 2023-02-20 | ||||||||
| Participants: | |||||||||
| Description |
|
[*]: Technically, it added code to the callers of quickExit(), by renaming the original quickExit() to quickExitWithoutLogging(), and making a new inline quickExit() that calls quickExitWithoutLogging(). |
| Comments |
| Comment by Billy Donahue [ 10/Feb/23 ] |
|
This is probably best handled as part of a project to fix up our signal handlers and their unsafe behaviors in a more thorough way. We're talking about maybe having a sidecar process that just handles log message and/or signal diagnostics. The idea that our signal handlers are doing complicated work on their own is IMO not really feasible long-term. It's not even safe to get a backtrace, because unwinding can require opening debuginfo sections that are compressed, and so there's allocations involved etc. It's really best if the signal handlers just set a flag and await roadside assistance. |
| Comment by Billy Donahue [ 10/Feb/23 ] |
I did some research to try to answer these concerns. 1/ Yes, loading the atomic with the default .load memory order is safe and consistent. No problem there. 2/ The logging is not really OK according to spec, but we do it all over these signal handlers already, and tassert isn't making that situation worse than it already is. It's just a little more of the same. the MallocFreeStreamToLog has truncation disabled because it's dumping a stack trace and other potentially large data accumulated in the stream, so we're trying to avoid truncation of that precious diagnostic data. I also don't think the signal handers are calling quickExit. Maybe they used to when this was written? There's one quickExit call which happens when we catch recursive logging but that's unusual. That would occur only if we were inside a log statement and a synchronous signal was generated during the logging. I think we can keep this one closed. We do know that the signal handlers don't make a very big effort to be sync-signal-safe. I think the allocator and C library are making it possible for use to be sloppy about certain technically unsafe things we're doing in the handlers. The handlers should probably just be setting bits, signalling an outside process to take action, and halting. Let the outside process do the rest. |
| Comment by Kevin Pulo [ 02/Nov/20 ] |
|
The main things of concern are:
|