I don't understand this. If it's detrimental to national security, shouldn't it be dealt with immediately? I think it's propaganda with a lot of self-contradictory information.
The first time I saw this news I thought it was a joke, but it turns out it's not. A few questions:
1. Is it possible for a balloon to fly over the Pacific to the U.S.?
2. Why did we only notice it after it had reached the middle of the U.S.? I would think we could have identified it as early as the west coast.
3. What makes it a Chinese balloon? I mean, it could be from another country, right?
I tried searching for answers but haven't found any useful information so far.
For a status page to be actually independent, it needs to have all its requirements hosted on other infrastructure. The authoritative DNS for fb.com is the same as for facebook.com, so it goes down when (FB) DNS goes down (and DNS goes down when BGP is broken, apparently).
It looks like the status page is hosted on CloudFront though, so it got part of the way. (Of course, the other question is if it was updatable / updated during the outage)
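A rough way to check for that kind of shared dependency is to compare the nameserver sets of the status domain and the main domain. This is only a sketch: the domains and NS records below are made up for illustration (not real lookup results), and `shared_nameservers` is a hypothetical helper, not part of any library.

```python
def shared_nameservers(ns_by_domain, status_domain, main_domain):
    """Return the nameservers that both domains depend on."""
    def normalize(names):
        # NS names are case-insensitive and may carry a trailing dot.
        return {name.rstrip(".").lower() for name in names}

    return normalize(ns_by_domain[status_domain]) & normalize(ns_by_domain[main_domain])


# Hypothetical NS records, for illustration only.
ns_by_domain = {
    "main.example": ["a.ns.main.example.", "b.ns.main.example."],
    "status-shared.example": ["A.ns.main.example.", "b.ns.main.example."],
    "status-external.example": ["ns1.other-provider.example."],
}

for status in ("status-shared.example", "status-external.example"):
    overlap = shared_nameservers(ns_by_domain, status, "main.example")
    verdict = "shares DNS with main site" if overlap else "DNS looks independent"
    print(f"{status}: {verdict}")
```

An empty overlap only says the DNS layer looks separate; hosting, CDN, and whatever tooling updates the page would each need the same kind of check.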
Pardon me if it's a stupid question, but out of curiosity:
Is there any way to keep DNS up in case BGP goes down for any reason? Like a fallback nameserver hosted elsewhere/not affected by Facebook's ASes?
Is it technically impossible or did Facebook just assume something like yesterday would never happen and kept things simple instead of complicating things?
BGP didn't "go down" - they erroneously removed all routes between the Internet and several Facebook internal networks via BGP. BGP was the instrument of their destruction, but not the source. Someone or something told BGP to do that; whatever that was is the cause of the issue.
At least one of those networks they accidentally removed also happened to contain the DNS servers; DNS being unavailable was a symptom - but not part of the root problem. Any focus on DNS at this point is a red herring.
Think of routes as street directions - they tell routers where to ship packets. If you erase all your addresses, and the directions to them, from the outside world at large, then there literally is no way for network packets to get from the global Internet to Facebook's networks (where I imagine the DNS servers were up and probably twiddling their thumbs wondering where everyone went).
An easier way to think of it - they essentially took a pair of scissors and cut the cable connections to the Internet - which is why it was so catastrophic.
The only way to mitigate that is to have an identical infrastructure managed by different tooling, so a bad configuration setting from one environment wouldn't pollute the second in the same way. That's not exactly an easy thing to do, and it might cause more problems than it's worth. And you would have to do that for all services, not just DNS. Let's say Facebook used Cloudflare for their DNS. Great - DNS can resolve your request for fb.com to the IP address of the Facebook datacenter - but there still is no path for your packets to get to that datacenter, because they accidentally purged the routes to their networks.
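To make the "DNS resolves but there's still no path" point concrete, here is a toy model: a routing table with longest-prefix matching, plus an external DNS map that keeps working after the routes are withdrawn. Everything in it is an assumption for illustration (the prefixes are documentation ranges, `edge-router` and `site.example` are made-up names), and it is nowhere near how real BGP works; it only shows the lookup failing once the prefix is gone.

```python
import ipaddress


class RoutingTable:
    """Toy routing table: prefix -> next hop, with longest-prefix match."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no directions to this address at all
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


def reach(table, dns, name):
    """Resolve a name, then try to route a packet to the answer."""
    ip = dns.get(name)
    if ip is None:
        return "DNS failure"
    hop = table.lookup(ip)
    return f"delivered via {hop}" if hop else "no route to host"


# Hypothetical setup: DNS is hosted externally, the service is not.
external_dns = {"site.example": "192.0.2.10"}
table = RoutingTable()
table.announce("192.0.2.0/24", "edge-router")

print(reach(table, external_dns, "site.example"))  # route exists
table.withdraw("192.0.2.0/24")                     # the bad config push
print(reach(table, external_dns, "site.example"))  # DNS still answers, packets can't get there
```

After the withdraw, the name still resolves, which is exactly why moving only DNS elsewhere would not have helped much.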
It's easier to just not cut your connection to the Internet :) I'm sure there are all kinds of internal discussions picking this incident apart and formulating ways to either prevent it, or more realistically - have improved procedures to speed recovery when it inevitably happens again. BGP is not known for its inherent robustness or security. But since it's at the core of the Internet, any changes to it would have to be done on a massive internet-wide scale in perfect unison or the "cure" would be a lot worse than the current problems with it.
Murphy was indeed an optimist!
(search "Murphy's Law" for those unfamiliar with the idiom)
If it's an FB-managed server running on someone else's network, you still have a lot of the FB software risk (FB's software stack and development mantra make it easy to push changes, some of which break everything, including the ability to push further changes); even if it's not FB, there's a similar risk.
If it's not an FB-managed server, like a third-party DNS provider, it's difficult to keep that synchronized considering all the fun geographic load balancing FB is doing at the DNS level. That's generally hard once you start doing this, and it's why you don't see many dual-provider DNS setups.
Really, the status page should not be on a core domain, so that its DNS can just be external.
FB DNS breaking yesterday almost doesn't matter in the scheme of things, because the BGP breakage broke everything anyway. Would it have been a bit nicer to get HTTP error messages instead of DNS not-found messages? Sure, but mostly nothing was working anyway.
It’s definitely technically possible to have secondaries on a separate network that pull the zone from the primary via AXFR. That’s not to imply it’s trivial / easy at FB’s scale (query volume) or topology complexity (as in GSLB).
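For anyone unfamiliar with how a secondary decides when to pull a fresh copy: it compares the SOA serial it holds with the primary's, using wraparound-aware serial arithmetic (RFC 1982). A minimal sketch of just that comparison follows; `serial_is_newer` is a hypothetical helper name, and the case where the two serials are exactly 2^31 apart (undefined in the RFC) is treated here as "not newer" by assumption.

```python
SERIAL_MODULUS = 2 ** 32
HALF_RANGE = 2 ** 31


def serial_is_newer(candidate, current):
    """True if `candidate` is newer than `current` under 32-bit serial arithmetic."""
    diff = (candidate - current) % SERIAL_MODULUS
    # diff == 0 means equal; diff == HALF_RANGE is undefined in RFC 1982
    # and is treated as "not newer" in this sketch.
    return 0 < diff < HALF_RANGE


# A secondary holding serial 2021100501 sees the primary at 2021100502.
print(serial_is_newer(2021100502, 2021100501))  # True: time to transfer
# Serials wrap around, so a small number can be newer than a huge one.
print(serial_is_newer(5, 4294967290))           # True
```

The transfer itself and the GSLB-style per-location answers are the hard parts at that scale; this only covers the "is my copy stale" check.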
Does PSN use AWS? From the reports on Twitter, it looks like it's more than just the USA that's down. I'm in Canada, but I see some reports of it not working in Europe as well.