
> If there is anything you can still accomplish, don't panic, carry on.

This is often poor advice, because continuing in a corrupt state can be worse (and harder to debug) than a clean panic. E.g. it's better to crash than to overwrite a save file with corrupt data, or transfer money to the wrong account, or show one user data belonging to another user.
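To make the save file case concrete, here is a minimal sketch in Python. The `save_state` function and the "level must be a non-negative integer" invariant are made up for illustration; the point is only that a failed check raises before the old file is touched.

```python
import json
import os
import tempfile


def save_state(path: str, state: dict) -> None:
    """Overwrite the save file only if the state passes a sanity check."""
    # Hypothetical invariant: a valid state has a non-negative integer "level".
    level = state.get("level")
    if not isinstance(level, int) or level < 0:
        # Fail fast: the existing save file is left untouched.
        raise ValueError(f"refusing to save corrupt state: {state!r}")

    # Write to a temporary file in the same directory, then atomically
    # replace the target, so a crash mid-write cannot leave a partial file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
    os.replace(tmp_path, path)
```

Calling `save_state` with a bad state raises `ValueError` and the previous save survives, which is usually easier to debug than a silently overwritten file.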



If we are talking microservices, I agree: take down the service. I have written a lot of safety-critical stuff, and the advice for a given address space (virtual or physical) has been to throw on inputs that are out of range or otherwise bad. This makes you fail fast and notice stuff, but it makes the conglomerate of address spaces as a whole horribly unreliable, with everything coming down all the time. As we have become less monolithic, with little persistent data shared between concerns, it has become possible to kill and restart only the offending thread, service, hardware, etc. The result is degraded functionality with an alert that it has become so. Tesla shouldn't have just stopped; they should have alerted the user that maintenance was necessary and given up on logging those errors to the now-failed flash.
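A rough sketch of that "restart only the offending part, then degrade with an alert" idea, in Python. Everything here (`Supervisor`, `log_to_flash`, `drive`) is a hypothetical name for illustration, not how any real vehicle or service framework does it.

```python
from typing import Callable, List, Set


class Supervisor:
    """Restarts a failing component a bounded number of times, then disables it."""

    def __init__(self, max_restarts: int = 3) -> None:
        self.max_restarts = max_restarts
        self.alerts: List[str] = []      # messages that would be surfaced to the user
        self.degraded: Set[str] = set()  # components that have been given up on

    def run(self, name: str, component: Callable[[], None]) -> bool:
        """Return True if the component finished, False if it was disabled."""
        for attempt in range(1, self.max_restarts + 1):
            try:
                component()
                return True
            except Exception as exc:
                # The failure stays contained to this component.
                self.alerts.append(f"{name}: attempt {attempt} failed: {exc}")
        # Give up on this component only; everything else keeps running.
        self.degraded.add(name)
        self.alerts.append(f"{name}: disabled, maintenance required")
        return False


def log_to_flash() -> None:
    # Stand-in for a write to worn-out flash storage.
    raise OSError("flash write failed")


def drive() -> None:
    # Stand-in for the core function that must keep working.
    pass


sup = Supervisor(max_restarts=2)
flash_ok = sup.run("flash_logger", log_to_flash)
drive_ok = sup.run("drive", drive)
```

Here the flash logger ends up in `sup.degraded` with a maintenance alert, while `drive` still runs. Each component still fails fast internally; the supervisor just decides how far the failure propagates.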



