Why is GitHub down so often? Why is it not possible to keep it up 100% of the time (not counting physical failures)? I haven't seen any downtime on my own system (which has hundreds of thousands of users online) in the months since I completed the setup.
I think you are nitpicking. My point is that companies (including Microsoft!) are capable of running large-scale infrastructure with much higher uptime than GitHub's. They want to put themselves at the center of our workflows (e.g. GitHub Actions), yet they are not delivering uptime commensurate with that. What is their excuse?
Yeah, I agree with you, a bit nitpicky. I also agree that they have no excuse, other than admitting their engineering standards are not up to the level of their ambition. That's why I never make anything in my infrastructure depend on GitHub: for everything I use GitHub for, I have an alternative set up for the inevitable ill-timed downtime I know will happen (something like the sketch below).
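A minimal sketch of the kind of redundancy I mean, in Python; this is not my actual setup, and the remote names are hypothetical and assumed to already be configured with `git remote add`:

```python
# Sketch: every push goes to GitHub *and* to a self-hosted mirror,
# so an outage on either side never blocks the workflow.
import subprocess

REMOTES = ["github", "mirror"]  # hypothetical remote names

def push_all(branch: str = "main") -> None:
    """Push `branch` to every configured remote, tolerating failures."""
    for remote in REMOTES:
        result = subprocess.run(
            ["git", "push", remote, branch],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            # Don't abort: the surviving remotes are exactly the redundancy we want.
            print(f"push to {remote} failed: {result.stderr.strip()}")

if __name__ == "__main__":
    push_all()
```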
Great question: Google's homepage revenue maps almost 1:1 to its uptime. Its user retention is also loosely tied to uptime, since the value it offers is mostly a replaceable commodity (is Bing worse? Sure, but it has results). That leads the organization to invest huge amounts of time and money in ensuring uptime. I can't recall a single outage in the past several years.
On the other hand, GitHub's revenue is mostly monthly/annual licensing, and they have great stickiness, since it's not trivial to migrate to an equivalent service provider (excluding minor projects that only use a couple of features). They can increase profits through feature development and cost savings far more than through uptime. Is there a limit to this? Of course.
Google loses money when Search is down because they cannot serve ads. Does GitHub actually lose money when it is down? I think that because everyone is on a subscription, they do not lose money by the second; instead they lose reputation, and in the long term they could lose customers. But GitHub's income isn't as sensitive to downtime as Google's, hence less investment in DevOps by comparison.
I think he was referring to Google Search in general. I've never witnessed any Google Search downtime since Google went live in 1998. It has probably happened, but I cannot remember it.
I am certain there are normally writes going on; they do run Analytics on their homepage. However, they get to defer, retry, and play lots of eventually-consistent tricks, or in the worst case just swallow the exceptions. The fact that they can make the service _seem_ fully working to the end user while being unable to write is a major factor in achieving their world-beating reliability.
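A toy sketch of that pattern (emphatically not Google's code): reads stay on the fast path, while writes are queued, retried with backoff, and in the worst case silently dropped rather than surfaced to the user as an error:

```python
import queue
import threading
import time

# Bounded queue of pending analytics writes.
events: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)

def record_event(event: dict) -> None:
    """Request path: enqueue and return immediately, so a broken
    write path never makes the page itself look down."""
    try:
        events.put_nowait(event)
    except queue.Full:
        pass  # swallow: losing an analytics event beats showing an error page

def writer_loop(flush, max_retries: int = 3) -> None:
    """Background thread: retry each write a few times, then give up."""
    while True:
        event = events.get()
        for attempt in range(max_retries):
            try:
                flush(event)  # e.g. ship to the logging backend
                break
            except IOError:
                time.sleep(2 ** attempt)  # exponential backoff before retrying
        # falling out of the loop without a break == exception swallowed

threading.Thread(target=writer_loop, args=(print,), daemon=True).start()
record_event({"page": "/", "ts": time.time()})
time.sleep(0.1)  # give the daemon thread a moment to flush before exit
```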
Sure, but they can keep several copies of the index per datacenter and retry your query multiple times, possibly even in a different datacenter. New code, and even updated indexes, can be tried and then rolled back to yesterday's version.
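Roughly, the serving loop would look like this sketch; the datacenter names, index versions, and RPC stub are all invented for illustration:

```python
import random

DATACENTERS = ["dc-east", "dc-west", "dc-eu"]  # hypothetical
INDEX_VERSIONS = ["today", "yesterday"]        # degrade to the older build

class QueryFailed(Exception):
    pass

def query_replica(dc: str, index: str, q: str) -> list:
    """Stand-in for an RPC to one index replica; randomly flaky here."""
    if random.random() < 0.2:
        raise QueryFailed(f"{dc}/{index} unavailable")
    return [f"result for {q!r} from {dc} ({index} index)"]

def search(q: str) -> list:
    # Try the freshest index in every datacenter before falling back to
    # yesterday's index: slightly stale results still look "up" to the user.
    for index in INDEX_VERSIONS:
        for dc in DATACENTERS:
            try:
                return query_replica(dc, index, q)
            except QueryFailed:
                continue  # retry in another datacenter
    raise QueryFailed("all replicas of all index versions failed")

print(search("site reliability"))
```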
Umm, I guess the scale is similar: me, a single person, vs. an organization; my Googled-up knowledge vs. industry experts with years of experience.
My point was not about similar scale, though. How hard is it to keep a system up? AWS is a whole universe compared to GitHub, yet it doesn't go down as often as GitHub does.
Only GitHub truly knows. But everyone here knows that since Microsoft acquired it, it has degraded to the point where it goes down every month.
It is so frequent and unreliable that you might as well self-host at this point. You would likely have had better uptime than GitHub over the three years since this prediction. [0]