
Mistakes happen at every company. I'm not sure 100% uptime forever is even possible. Since never making a mistake is impossible (or at the very least extremely unlikely), isn't it much better to try to understand each other's mistakes?

Certainly, Cloudflare could be lying to us. That seems very unlikely to me, but even if they were, the scenario they describe is still one that would literally cause the described issue if it did happen, so it can still be learned from in the same way. Again, a lie feels unlikely given the specificity of the writeup and the lack of other viable explanations.



There are an infinite number of ways to break something.

Look, most mistakes are silly, or a combination of silly ones. We think it's good to understand them, but in reality someone on the team probably pointed out that this could happen and was ignored, because it wasn't a priority. And the biggest motivator to make companies prioritize uptime is to tell them that we don't care about their excuses; we care about uptime.

I wonder whether GitHub is still on MySQL, waiting for another outage.

Read the stated reason: human error. That excuse is as old as humans. All I learned is that Cloudflare doesn't have sufficient automation and checks in place.



