[GODRIVER-198] Ensure panics are handled when running functions in goroutines Created: 17/Jan/18 Updated: 28/Oct/23 Resolved: 07/May/18 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | Error Handling |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Kristofer Brandow (Inactive) | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Stitch | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Description |
|
In the mongo/collection.go file there are a number of places where we invoke a function and do not handle the case of a panic occurring. Since any unhandled panic will crash the process, we need to handle these panics, even if we throw it out. Additionally, we might want to change these methods so they are always concurrent and use channels to send the results back to the caller. |
| Comments |
| Comment by Jeffrey Yemin [ 23/Apr/18 ] | ||||||||||||||||
|
Kris thinks this has been done in scope of | ||||||||||||||||
| Comment by David Golden [ 23/Jan/18 ] | ||||||||||||||||
|
I've looked at about a dozen articles on goroutine panic and recovery in Go. Only one that I've seen makes a decent case for deferring a recovery within a goroutine: Golang pearl: It’s dangerous to go alone!. Specifically, while the http package will recover panics from handlers, it can't recover panics from goroutines started by handlers and we don't want the whole http server going down because of a panic in one request. Most other articles stress that one should recover explicit panics from within the same package, e.g. Panic and Recover from the Go wiki. We use very few explicit panics and we might want to reconsider more idiomatic ways of propagating errors (I'll open a separate ticket for that). The Golang pearl article gives this example:
That doesn't work for us unless we introduce a logger (or allow users to do dependency injection) – we may choose to do that later when we get to APM, but I think that's premature to consider now. However, the broader point that we shouldn't let a runtime panic in a goroutine take down a server is compelling. For collection.go, with unack'd writes, we should introduce recovery that throws away panics since the whole point is to fire-and-forget and not wait to see if an error occurs. There are several more goroutine launches in SDAM code and we'll need to look carefully at those one-by-one to determine what to do in the case of a panic. For example, if a server monitor panics, how do we detect it or recover from it? Should we consider the client wedged and error on every server selection? We should see what other drivers do to catch exceptions in monitoring code. Given the scope of this work to find a good recovery solution, I'm comfortable deferring it to the beta release series. | ||||||||||||||||
| Comment by David Golden [ 21/Jan/18 ] | ||||||||||||||||
|
User provided functions? Or internal ones? If user functions panic, that's not really our concern. And for our functions, we should write them so they don't panic. Can you give some examples, please? |