> That said, crashing the whole webserver because of one misbehaving request is ...

camdencheek · on Jan 11, 2023

> panics aren't "goroutine scoped" in terms of their potential impact

I'm with ya there. However, there are also many classes of logic errors that are not goroutine-scoped. And there are many panics that do not have impact outside of the goroutine's scope. In my experience, this is true of most panics.

In practice, panics happen. They are (almost) always indicative of a bug, and almost always mean there is something that needs to be fixed. However, if a subsystem of my application is broken and panicking, there's a pretty good chance that reporting the panic without crashing the process will provide a better end user experience than just blowing up.

Yes, that means I'm accepting the risk that my application is left in an inconsistent state, but coupled with good observability/reporting, that's a tradeoff I'm willing to make.

(bonus: this is especially true when propagating panics allow me to capture more debugging information to fix the panics faster)
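A minimal sketch of what capturing that extra debugging information could look like, assuming the panic is caught in the goroutine that raised it and handed back to the parent. The `caught` type and the `goSafe` helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// caught holds what a recovered panic carried plus a stack trace taken
// at the point of recovery. The type and its fields are hypothetical.
type caught struct {
	value interface{}
	stack []byte
}

// goSafe runs f in a new goroutine. If f panics, the panic value and a
// stack trace are sent on the returned channel instead of crashing the
// process; if f returns normally, nil is sent.
func goSafe(f func()) <-chan *caught {
	done := make(chan *caught, 1)
	go func() {
		defer func() {
			if v := recover(); v != nil {
				// Deferred functions run while the panicking frames are
				// still on the stack, so the trace includes the origin.
				done <- &caught{value: v, stack: debug.Stack()}
				return
			}
			done <- nil
		}()
		f()
	}()
	return done
}

func main() {
	c := <-goSafe(func() {
		var m map[string]int
		m["k"] = 1 // assignment to an entry in a nil map panics
	})
	if c != nil {
		fmt.Printf("recovered: %v\n%s", c.value, c.stack)
	}
}
```

The parent can then decide whether to log the trace, re-panic with it attached, or shut down.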

preseinger · on Jan 11, 2023

> In practice, panics happen.

I guess this is the crux of the issue. I don't think this is true, or needs to be true. It certainly hasn't been my experience. I think assuming panics are normal will take you down some paths that make it basically impossible to write reliable software. But, to each their own.

> I'm accepting the risk that my application is left in an inconsistent state,

Inconsistent state makes it impossible to reason about your program's execution or outcomes. An account value that previously had balance = 0 may now have balance = 1000. Is this acceptable risk?

Groxx · on Jan 12, 2023

Since defers run during panics for exactly this reason, no. You can in fact guarantee that is not the case.

Runtime-safety "panics" in Go, like concurrently modifying and iterating a map that can lead to other memory being corrupted, tend to abort the whole process immediately and not be suppress-able panics.

preseinger · on Jan 12, 2023

> Runtime-safety "panics" in Go, like concurrently modifying and iterating a map that can lead to other memory being corrupted, tend to abort the whole process immediately and not be suppress-able panics.

https://go.dev/doc/effective_go#panic

> The usual way to report an error to a caller is to return an error as an extra return value. . . . But what if the error is unrecoverable? Sometimes the program simply cannot continue. For this purpose, there is a built-in function panic that in effect creates a run-time error that will stop the program

Panics express unrecoverable failures. This is plainly stated in the language documentation. There are exceptions to this rule, but they are exceptional.

Groxx · on Jan 13, 2023

That's a style decision, not a correctness issue. You are claiming it is a correctness issue.

preseinger · on Jan 13, 2023

It is absolutely a correctness issue. Panics do not provide safety guarantees that generalize enough that it is safe to arbitrarily recover from them. The statement in the previous sentence is not a subjective opinion, it's a statement of fact. I'm not sure how else to convey this information.

Groxx · on Jan 13, 2023

Panics do not violate any runtime guarantees, and defers run in the presence of panics.

All safety guarantees that are possible without panics are also possible with them.
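As a rough sketch of the "defers run in the presence of panics" argument, here is one way a deferred function could roll back a partial update so the balance from the earlier example is never left half-modified. The `Account` type, `Apply`, and the `validate` callback are assumptions made up for this illustration, and the sketch only covers state owned by this one struct.

```go
package main

import (
	"fmt"
	"sync"
)

// Account is a hypothetical type guarding a balance with a mutex.
type Account struct {
	mu      sync.Mutex
	balance int
}

// Apply adds delta to the balance. If validate panics, the deferred
// rollback restores the old balance and the mutex is still released;
// the panic itself keeps propagating to the caller.
func (a *Account) Apply(delta int, validate func(newBalance int)) {
	a.mu.Lock()
	defer a.mu.Unlock() // registered first, so it runs last

	old, committed := a.balance, false
	defer func() {
		if !committed {
			a.balance = old // undo the partial update
		}
	}()

	a.balance += delta
	validate(a.balance) // hypothetical step that may panic
	committed = true
}

// Balance returns the current balance.
func (a *Account) Balance() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.balance
}

// panicked runs f and reports whether it panicked, swallowing the panic.
func panicked(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	acct := &Account{}
	failed := panicked(func() {
		acct.Apply(1000, func(int) { panic("validation bug") })
	})
	fmt.Println(failed, acct.Balance())
}
```

Whether every piece of code between the panic and the recover is written this carefully is exactly what the two sides here disagree about.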

preseinger · on Jan 20, 2023

When some bit of code invokes `panic` it is saying that there is an error which is unrecoverable, and the default expectation is that the process will terminate. There is no way to assert that panics do not violate runtime or memory model expectations. They can.

jasonhansel · on Jan 12, 2023

> An account value that previously had balance = 0 may now have balance = 1000. Is this acceptable risk?

Your entire web app process crashes due to a panic every time a request triggers an extremely rare edge case. A hacker discovers this and uses it to conduct a DoS attack. Is this acceptable risk?

preseinger · on Jan 12, 2023

Yes, definitely preferable! Denial of service is definitely better than invalid state, right?

ikiris · on Jan 12, 2023

Why the heck are you writing web apps that panic?

unscaled · on Jan 12, 2023

This is equivalent to asking "Why the heck are you writing code with bugs?"

Sure, if we could write code without bugs, we wouldn't need to suppress panics. But since we do tend to write code with bugs, and some of them are bugs that can be detected by the runtime, we get panics.

If you hate panics, you can do better than Go and go for a language with a stronger type system, where you won't get nil pointer panics or interface conversion panics, but even an almost onerously typesafe language like Haskell still panics on some bogus operations such as division by zero or trying to read the head of an empty list. Perhaps Idris really has no runtime errors, but it is quite niche.

aetherane · on Jan 12, 2023

It is pretty easy to have accidental panics in Go, for instance due to a runtime assertion that unexpectedly failed

preseinger · on Jan 12, 2023

Runtime assertions without defensive checks are programmer errors that are not difficult to spot in code review and should not be expected to make it to deployed code.

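    // RED FLAG: the single-value form panics if y does not hold a T
    x := y.(T)

    // good: the two-value form never panics
    x, ok := y.(T)
    if !ok {
        return fmt.Errorf("unexpected type %T", y)
    }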

jasonhansel · on Jan 12, 2023

Because people make mistakes?

philosopher1234 · on Jan 12, 2023

Classic Go programmer. This is why I use rust B)

(joke)

unscaled · on Jan 12, 2023

Joking aside, you could clearly plot the probability of running into a runtime error by programming language.

Of course, a language with fewer runtime errors is a far cry from being a panacea. Avoiding runtime errors is not the same as avoiding all categories of bugs. And while I personally prefer stronger type systems, they definitely come with increasing levels of cognitive cost.

But I still feel that the type-safety vs. runtime trade-off is more often ignored, underestimated or undersold than it is being hyped. Yes, certain languages (cough Rust cough) are being hyped, but not the conscious choice of balancing programmer learning curve with runtime type-safety.

And while on the topic of Rust, it's probably not the best choice for a language that sees fewer runtime panics. Especially since unwrapping an error is always the easiest way to handle an error, and thus quite common. But lazy error unwrapping aside, Rust does avoid null dereference exceptions, type casting exceptions and most types of race conditions that can be quite prevalent with Go [1].

[1]: https://songlh.github.io/paper/go-study.pdf

groestl · on Jan 12, 2023

> assuming panics are normal will take you down some paths that make it basically impossible to write reliable software

Na, citation needed. Assuming "panics are normal" is just extrapolating from "errors are normal". It makes reliable software more reliable.

bsaul · on Jan 12, 2023

it's pretty obvious that it could influence new developers into the wrong direction though. Saying things like "ha, let's not bother checking this, at worst it'll just panic and i'll simply abort the request".

Which would definitely impact the quality of the software overall in a bad way.

groestl · on Jan 12, 2023

I'd not be so sure. Accepting that everything that can fail will fail shaped me as a young developer, and "Exceptional C++" had a huge influence on me. Now my approach for new code I review is this:

* Make sure you support properly unrolling the stack

* Keep a clean failure boundary, probably somewhere on top of your loop

* Fastidiously check your preconditions

* Fail brutally if they're not met

* Improve from there

preseinger · on Jan 12, 2023

Right, all of these are good points, but the problem is that the "failure boundary" of a panic is the entire process. You can't constrain it, or assume that it's scoped to a single goroutine. Errors do not have this property.

groestl · on Jan 15, 2023

> the "failure boundary" of a panic is the entire process.

This is trivially falsifiable by panicking yourself and immediately recovering. Neither failure domain nor failure boundary need to align with the entire process.
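For readers who want to see the "panic yourself and immediately recover" case concretely, a minimal sketch follows. The function name `contained` is made up for the example, and it shows only this one deliberately contained panic, not panics in general.

```go
package main

import "fmt"

// contained panics and recovers inside the same call, so the failure
// never leaves this function.
func contained() (recovered interface{}) {
	defer func() { recovered = recover() }()
	panic("contained failure")
}

func main() {
	fmt.Println("recovered:", contained())
	fmt.Println("process still running")
}
```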

preseinger · on Jan 17, 2023

The impact of a specific panic does not extrapolate to the impact of all panics, and my claim is not falsified by such an example. Panics are defined by the language to represent unrecoverable errors.

    func (x *Thing) Method() {
        x.somethingThatPanics()
        x.somethingThatAssumesTheAboveDidntPanic()
    }

Recovering from a panic thrown by Method invalidates the state of the Thing which threw that panic. If that Thing is shared among concurrent actors, the entire program state is invalidated.

groestl · on Jan 17, 2023

> If that Thing is shared among concurrent actors

You're adding preconditions to your claim.

> the entire program state is invalidated.

No, the state represented by a connected graph of variables accessible by the concurrent actors is tainted. This is hardly "the entire program state". Often it's just a few cache entries.

Also, see my first, most important, bullet point:

"* Make sure you support properly unrolling the stack."

Which means a request to Thing errored out, but it never enters an invalid state. If you fail at that, all bets are off. But then you're dealing with a mediocre codebase anyway.

And finally, let me rewrite your example to something that I see much more often in real-life code, which is problematic _even without concurrent actors_ because somethingThatAssumesTheAboveDidntReturnAnError might do horrible things all by itself:

    func (x *Thing) Method() {
        x.somethingThatReturnsAnError()
        x.somethingThatAssumesTheAboveDidntReturnAnError()
    }

preseinger · on Jan 18, 2023

I'm not sure how to respond to this. You seem to believe that panics express problems which are constrained to the call stack which instantiated the panic. This isn't true. But I'm not sure how to express this to you in a way that will convince you. So I guess we're at a stalemate.

groestl · on Jan 18, 2023

> You seem to believe that panics express problems which are constrained to the call stack which instantiated the panic.

Not inherently, but it's your job as a developer to make sure this is the case, that's what:

"* Make sure you support properly unrolling the stack."

means.

In case you're dealing with unknown code it's your job to find out what the connected graph of potentially tainted objects is and discard them. That's what "keep a clean failure boundary" means.

If you can't, because you don't want to (short lived process, prototypes) or are unable to (hairy ball of code), tearing down the process is indeed the only option and a sane fallback choice made by the language designers. But it's not necessarily a hallmark of robust software.

I hope I had a final shot to clear up what I meant, thanks for the discussion anyway.

preseinger · on Jan 20, 2023

In Go when some code writes `panic` it is expressing an error condition which should not be intercepted by callers and is expected to terminate the process. A panic is not an error, and panics should not be recovered as if they were errors.

preseinger · on Jan 12, 2023

Panics are categorically different than errors. Errors are normal, panics are not normal.

yakaccount4 · on Jan 12, 2023

> I guess this is the crux of the issue. I don't think this is true, or needs to be true. It certainly hasn't been my experience. I think assuming panics are normal will take you down some paths that make it basically impossible to write reliable software. But, to each their own.

I'd rather have the control to log the panic on a service rather than it forcibly dying and taking down any other connections with it. Kube will just spin up a new one anyway, which just introduces a downtime gap that doesn't need to exist.
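A sketch of what "log the panic rather than dying" might look like for an HTTP service, assuming a per-request recovery wrapper. `recoverMiddleware`, `statusFor`, `newDemoHandler`, and the `/ok` and `/boom` routes are hypothetical names for this example; only the `net/http` and `net/http/httptest` calls are real.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// recoverMiddleware logs a panic raised by next and answers with a 500
// instead of letting the panic travel further up the stack.
func recoverMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if v := recover(); v != nil {
				log.Printf("panic serving %s: %v", r.URL.Path, v)
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// newDemoHandler builds a tiny mux with one healthy and one buggy route.
func newDemoHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "fine")
	})
	mux.HandleFunc("/boom", func(w http.ResponseWriter, r *http.Request) {
		var m map[string]int
		m["x"] = 1 // assignment to an entry in a nil map panics
	})
	return recoverMiddleware(mux)
}

// statusFor sends one in-memory GET request to h and returns the status.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	h := newDemoHandler()
	fmt.Println(statusFor(h, "/boom"), statusFor(h, "/ok"))
}
```

Note that the standard `net/http` server already recovers panics from handlers on its own and logs a stack trace, so a wrapper like this mostly changes what the client sees and where the report goes.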

preseinger · on Jan 12, 2023

I don't think I'm effectively communicating the impact of handling a panic and continuing program execution. A panic that comes from a memory model violation (as one example) can change the value of anything in the memory space of the program. If the program continues, that change will go undetected, and can have results that make the program completely nondeterministic. This isn't a doom and gloom, sky-is-falling prognostication, it's literally what is defined by the spec and memory model of the language.

yakaccount4 · on Jan 13, 2023

> A panic that comes from a memory model violation (as one example) can change the value of anything in the memory space of the program ... This isn't a doom and gloom, sky-is-falling prognostication, it's literally what is defined by the spec and memory model of the language.

I do not think you are correct. Go has a class of unrecoverable panics for this specific reason. Go also runs deferred functions after a recoverable panic, so the notion that it's unsafe to handle it, or continue execution after, doesn't hold at all - it is literally a first-class feature of the language.

I have not seen an instance of a recoverable panic that is raised _after_ such a fatal operation. If you have an example of such, I would love to see it.

preseinger · on Jan 13, 2023

What are unrecoverable panics vs. recoverable panics? Where is that distinction defined?

yakaccount4 · on Jan 13, 2023

There seems to not be any standard list of unrecoverable panics/aborts, but this Stackoverflow post [1] has a list of a few.

As far as the user/developers are concerned, it doesn't matter too much, since you have no option to recover them, but it would be nice if it was explained whether defers are still run. I'm assuming they are not.

1. https://stackoverflow.com/questions/57486620/are-all-runtime...

preseinger · on Jan 13, 2023

If there is no way for callers to reliably distinguish recoverable panics from unrecoverable panics, then this distinction doesn't really exist, does it? Panics are panics.

yakaccount4 · on Jan 13, 2023

I'm not sure what point you are trying to make anymore.

Of course you cannot distinguish between unrecoverable and recoverable panics, because by definition an unrecoverable panic is not recoverable. There is no caller to distinguish between it - it is killed.

preseinger · on Jan 13, 2023

Oh. You're using the word panic to describe a superset of actual panics and other even more serious errors. Those things you call unrecoverable panics are not actually panics.

The point I'm trying to make is that panics are not errors by another name, and they are not safe to recover from in general.

BreakfastB0b · on Jan 11, 2023

I would agree if it weren’t super easy to cause a panic in go.

Index slice out of bounds? panic. Close a channel twice? Panic. Incorrect type assertion? Panic. Dereference nil pointer? Panic.

I would argue that all of these examples which are the most common in my experience are “goroutine scoped” because the goroutine was aborted before they potentially modified the application state in an unknown way.

It’s not like in C or C++, where an out-of-bounds access has now put the entire application into an unknown state.
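To make the four cases listed above concrete, here is a sketch that triggers each one and recovers it. The helper names (`try`, `at`, `deref`, `asInt`, `closeTwice`) are invented for the example; it shows only that these panics can be recovered, not whether doing so is wise.

```go
package main

import "fmt"

// try runs f and returns the recovered panic value as text, or "" if
// f returned normally.
func try(f func()) (msg string) {
	defer func() {
		if v := recover(); v != nil {
			msg = fmt.Sprint(v)
		}
	}()
	f()
	return ""
}

// at indexes a slice; an out-of-range i panics at run time.
func at(s []int, i int) int { return s[i] }

// deref reads through a pointer; a nil p panics.
func deref(p *int) int { return *p }

// asInt uses the single-value type assertion, which panics on mismatch.
func asInt(v interface{}) int { return v.(int) }

// closeTwice closes the same channel twice; the second close panics.
func closeTwice(ch chan struct{}) {
	close(ch)
	close(ch)
}

func main() {
	fmt.Println(try(func() { at([]int{1, 2, 3}, 5) }))
	fmt.Println(try(func() { closeTwice(make(chan struct{})) }))
	fmt.Println(try(func() { asInt("not an int") }))
	fmt.Println(try(func() { deref(nil) }))
}
```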

preseinger · on Jan 11, 2023

> Index slice out of bounds? panic. Close a channel twice? Panic. Incorrect type assertion? Panic. Dereference nil pointer? Panic.

These are all really bad things which should never survive to production code. It is not difficult to detect and prevent them.

> I would argue that all of these examples which are the most common in my experience are “goroutine scoped” because the goroutine was aborted before they potentially modified the application state in an unknown way.

What makes you think that terminating the goroutine that triggered these panics prevents them from impacting the process state?

> It’s like not in C, or C++ where out of bounds access has now put the entire application into an unknown state.

What makes you think this is the case? Panics have unknowable impact, and many panics (e.g. data races) absolutely do put the program into an unknown state.

Groxx · on Jan 12, 2023

>These are all really bad things which should never survive to production code. It is not difficult to detect and prevent them.

This is equivalent to saying "out of bounds memory writes are not difficult to detect and prevent in C code". Like actually equivalent (possibly worse), not just "well if you squint they look similar".

Of course it's not hard most of the time. Being perfect is beyond hard though. And if you're not perfect, you might open the door to anything in C, or cluster-destroying rolling crashes in Go.

Sometimes shutting down every piece of your software if that happens is the correct choice, and sometimes it's so far beyond reasonable that it's ludicrous to argue in favor of "every panic is an abort".

groestl · on Jan 12, 2023

> Sometimes shutting down every piece of your software if that happens is the correct choice, and sometimes it's so far beyond reasonable that it's ludicrous to argue in favor of "every panic is an abort".

Very much this. And even for the same project: in some cases, I'm a fan of employing a quite strict error handling policy in dev environments (crash and burn) and using a more lenient approach in prod (elevated log level). In my experience, this can result in a robust product. Most importantly, this means the decision is not even made by the application programmer, sometimes it's a config thing.
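One possible shape for such a config-driven policy, as a sketch only: `Policy`, `Crash`, `LogAndContinue`, and `Guard` are hypothetical names, and how the policy value gets loaded (flag, environment, config file) is left out.

```go
package main

import (
	"fmt"
	"log"
)

// Policy selects what happens when guarded code panics.
type Policy int

const (
	Crash          Policy = iota // let the panic propagate (e.g. dev)
	LogAndContinue               // log it and return an error (e.g. prod)
)

// Guard runs f under the given policy. With Crash no recover is
// installed, so the original panic and its stack trace are untouched.
func Guard(p Policy, f func()) (err error) {
	if p == LogAndContinue {
		defer func() {
			if v := recover(); v != nil {
				log.Printf("recovered panic: %v", v)
				err = fmt.Errorf("recovered panic: %v", v)
			}
		}()
	}
	f()
	return nil
}

func main() {
	err := Guard(LogAndContinue, func() { panic("bad precondition") })
	fmt.Println("continuing after:", err)
}
```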

preseinger · on Jan 12, 2023

Go has much stronger memory safety guarantees than C does. They aren't really comparable.

erik_seaberg · on Jan 12, 2023

x == y can panic if interface values contain incomparable fields in unexported nested structs, how would I check for that? Should we let it become a query of death and bet thousands of peers’ jobs on it never happening?

mappu · on Jan 12, 2023

Is this really the case? Can you link to anything or an example on go.dev/play/ ?

I can find a mention of "cmp.Equal" having that behavior, but that's just a third-party package panic.

assbuttbuttass · on Jan 12, 2023

It's true, but you'd never really write code like this

https://go.dev/play/p/r9NkQb6bQTx

erik_seaberg · on Jan 12, 2023

The problem also affects structs that happen to have a private map or cache or callback anywhere within.

https://go.dev/play/p/uP-vjpvuhku

preseinger · on Jan 12, 2023

Obviously `interface{}` values are not comparable?

erik_seaberg · on Jan 13, 2023

The comparison is explicitly allowed in the language spec, there’s no warning for doing it, and it often works depending on the types. It’s a data-dependent runtime error, which is usually hard to guarantee test coverage for.
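A short sketch of that data-dependent behavior. The `settings` struct and the `safeEqual` wrapper are hypothetical; the point is that the same `a == b` expression compiles for any interface values and only panics for some dynamic types.

```go
package main

import "fmt"

// settings has an unexported map field, which makes the struct type
// incomparable even though that is not visible at the call site.
type settings struct {
	name  string
	cache map[string]int
}

// safeEqual compares two interface values with ==, turning the
// run-time panic for incomparable dynamic types into an error.
func safeEqual(a, b interface{}) (eq bool, err error) {
	defer func() {
		if v := recover(); v != nil {
			eq, err = false, fmt.Errorf("comparison panicked: %v", v)
		}
	}()
	return a == b, nil
}

func main() {
	fmt.Println(safeEqual(1, 1))     // same comparable type
	fmt.Println(safeEqual(1, "one")) // different types: false, no panic
	fmt.Println(safeEqual(settings{name: "a"}, settings{name: "a"}))
}
```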

preseinger · on Jan 12, 2023

Link to an example? I don't think this is true, unless you're playing stupid games with your code, which wouldn't pass code review.

ikiris · on Jan 12, 2023

don't do that?

this kind of thing is why deepcompare exists to begin with

aetherane · on Jan 12, 2023

It seems unrealistic to assume that everyone on a reasonably sized team knows all of the subtle edge cases to avoid and never makes mistakes

erik_seaberg · on Jan 12, 2023

I can nag everyone to use reflect.DeepEqual and live with some false negatives, but maps always use k1 == k2.

mappu · on Jan 23, 2023

This is days later, sorry, but - you can't use an interface as a map key, so this shouldn't apply, right?

erik_seaberg · on Jan 27, 2023

https://go.dev/ref/spec#Map_types says that is allowed.

> If the key type is an interface type, these comparison operators must be defined for the dynamic key values; failure will cause a run-time panic.
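The quoted rule can be seen with a map keyed by `interface{}`, sketched below. The `put` helper is a made-up name; the insert compiles for any key and only fails at run time when the key's dynamic type cannot be hashed.

```go
package main

import "fmt"

// put inserts into a map keyed by interface{}, converting the run-time
// panic for an unhashable key into an error.
func put(m map[interface{}]string, k interface{}, v string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("insert panicked: %v", r)
		}
	}()
	m[k] = v
	return nil
}

func main() {
	m := map[interface{}]string{}
	fmt.Println(put(m, "id", "ok"))          // string key is fine
	fmt.Println(put(m, 42, "ok"))            // int key is fine
	fmt.Println(put(m, []int{1, 2}, "boom")) // slice key panics
}
```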

Groxx · on Jan 12, 2023

There are also significant performance and behavior differences between the two.

They are not inter-changeable, nor can one replace the other.

ikiris · on Jan 12, 2023

more specifically, it's really strange to hear of people doing equivalence checks on objects with structure. What are you expecting that comparison to do? I doubt it is doing what you think is happening, and is indeed risky of panics.

unscaled · on Jan 12, 2023

> (e.g. data races) absolutely do put the program into an unknown state.

Data races do not necessarily result in panics.

Many (perhaps even most?) data races would not result in panic but just in garbled, missing, duplicate, out-of-order or otherwise incorrect data.

Data-race-induced panics are generally a side effect of a data race, not a direct protection against one. They can often be inconsistent: e.g. a data race in a slice that contains a binary data format could garble a variable-length string prefix and produce an index-out-of-bounds panic. Or it could prematurely consume a shared pointer and overwrite it with nil, only to have the nil pointer dereferenced by another goroutine. These kinds of panics are unpredictable.

If your application has shared global state (in-memory or even a database), it may become inconsistent due to data races. But whether data-race-induced panics indicate irreversibly corrupted global state that requires (and can be fixed with) an application restart is a case-by-case thing.

Let's say your application has some shared state that got corrupted and the corruption triggered a panic down the line.

If your shared state is persisted in a database or some other distributed mechanism and that state got corrupted: restarting the application won't help you.

If your shared state is scoped at the HTTP request level (or whichever boundary you choose for suppressing your panics): you don't need to restart the application. The request is already terminated, along with its shared state.

Which leaves us with in-memory global state. This kind of state is generally minimized in the type of microservice and network infrastructure applications that Go is often used for.

A very small percentage of your panics will indicate corruption of such state. Will you be willing to risk service downtime in order to protect against the small possibility that the service has run into a state where its shared in-memory data became corrupted?


flippinburgers · on Jan 12, 2023

I tend to agree with you that these are relatively easy things to detect. I see no reason for the downvotes.

Production systems should have relatively robust testing whose coverage can be increased over time. When something panics, the cause of the panic should be fixed so that the panic never happens again. Over time panics shouldn't be happening.

Then again the systems that I have relied on I have written on my own without other hands in the pot so maybe I just don't have to deal with the reality of other programmers phoning things in.

jasonhansel · on Jan 12, 2023

If your programming language handles very common errors by crashing the entire application, and if preventing these crashes is actively discouraged, then that suggests a flaw in the language itself.

This would be fine for a low-level language like C where you need to allow SEGFAULTs, but designing it into a high-level language makes no sense.

loeg · on Jan 12, 2023

Go panics should not be used for very common errors.

erik_seaberg · on Jan 12, 2023

A lot of very common operations can panic: division, dereferencing a pointer, invoking an interface method, indexing/slicing an array/slice/string, asserting the type of an interface, and converting a slice to pointer to array. It’s possible to check, but I’ve never seen a tool that verifies you never use any of these without checking. You also have to check for nil channels, though they block forever (maybe consuming a goroutine) rather than panicking.

And there are some operations where you cannot check in advance whether a panic will happen: comparing interfaces (underlying values might not be fully comparable), indexing a map (could blow up during any concurrent write), sending to a channel (might be closed), and closing a channel.
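The channel case can be sketched as follows, assuming a hypothetical `trySend` wrapper. There is no non-destructive way to ask a channel whether it is closed before sending, so the only way to observe the failure from the sender's side is the panic itself.

```go
package main

import "fmt"

// trySend sends v on ch and converts the "send on closed channel"
// panic into an error. It still blocks if the channel is full or
// unbuffered with no receiver.
func trySend(ch chan<- int, v int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("send failed: %v", r)
		}
	}()
	ch <- v
	return nil
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // open channel: send succeeds
	<-ch
	close(ch)
	fmt.Println(trySend(ch, 2)) // closed channel: the send panics
}
```

The usual advice is to structure the code so that only the sending side ever closes the channel, which avoids needing a wrapper like this at all.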

jaitsu · on Jan 12, 2023

You can recover from a panic though, so if you are implementing something that may panic you should have some sensible defer/recover in there if you can't afford to have your process crash.

preseinger · on Jan 12, 2023

Division by zero, dereferencing a nil pointer, invoking methods on a nil interface, invalid indexing of an array, unchecked type assertions -- these are not common operations! These are always easily detectable programmer errors.