Hacker News

Why is GitHub down so often? Why is it not possible to keep it up 100% of the time (not counting physical failures)? I haven't seen any downtime on my system (it has hundreds of thousands of users online) in the months since I completed the setup.


I truly doubt you are running at the scale of GitHub in terms of users, complexity, and amount of data.


It’s probably more a matter of dev velocity. If you aren’t changing anything, it’s easy to keep your system operational.


I'm sure your scale is similar.


Google search is basically never down.

AWS is basically never down.

WhatsApp is basically never down.

Time for GitHub to grow up?


> Google homepage is basically never down.

The Google search "app" (not counting the vast indexing infrastructure) and GitHub also differ vastly in complexity.

> AWS is basically never down.

Lol what? Have you used AWS?

> WhatsApp is basically never down.

Makes sense. WhatsApp has always had a huge focus on reliable infrastructure, since day 0. Pays off, I guess :)


I think you are nitpicking. My point is that companies (including Microsoft!) are capable of running large-scale infra with much higher uptimes than GitHub's. They want to put themselves at the center of our workflows (e.g. GitHub Actions), yet they are not delivering uptimes commensurate with that. What is their excuse?


Yeah, I agree with you, a bit nitpicky. I also agree that they shouldn't have an excuse, besides confessing that their engineering standards are not up to the level of their ambition. That's why I never make anything in my infrastructure depend on GitHub; for everything I use GitHub for, I have alternatives set up for the inevitable ill-timed downtime I know will happen.


Great question: Google's homepage revenue maps directly 1:1 to its uptime. Its user retention is also loosely tied to its uptime, since the value is mostly a replaceable commodity (is Bing worse? Sure, but it has results). This leads the organization to invest huge amounts of time and money in ensuring uptime. I can recall only a single outage in the past several years.

On the other hand, GitHub's revenue is mostly monthly/annual licensing, and they have great stickiness, as it's not trivial to migrate to an equivalent service provider (excluding minor projects that only use a couple of features). They can increase profits a lot more through feature development and cost savings than through uptime. Is there a limit to this? Of course.


Google loses money when search is down because it cannot serve ads. Does GitHub actually lose money when it is down? I think that because everyone is on a subscription, it doesn't lose money by the second; instead it loses reputation, and long term it could lose customers. But GitHub's income isn't as sensitive to downtime as Google's in general, hence less investment in DevOps by comparison.


An RO (read-only) system is generally easier to keep up than an RW (read-write) system that is constantly innovating.


I think he was referring to Google Search in general. I've never witnessed any Google Search downtime since Google went live in 1998. It has probably happened, but I can't remember it.


You don't notice when their indexers cannot write; performing a search is basically RO.


> performing a search is basically RO

You don't know this. Google results are not the same for all users. How do you know there isn't R/W going on, particularly when signed in to Google?

(Unless you work at Google on search, in which case I stand corrected!)


I am certain there are normally writes going on; they do run Analytics on their homepage. However, they get to defer, retry, and play lots of eventually-consistent tricks, or worst case just swallow the exceptions. The fact that they can make the service _seem_ fully working to the end user while being unable to write is a major factor in achieving their world-beating reliability.
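The defer/retry/swallow pattern described above can be sketched roughly as follows. This is a minimal illustration, not anything Google actually runs; `backend` is a hypothetical storage client, and the key property is that a write failure never propagates to the read path:

```python
import logging
import queue

log = logging.getLogger("analytics")


class BestEffortWriter:
    """Queue writes so a storage failure never breaks the serving path."""

    def __init__(self, backend, max_pending=10_000):
        self.backend = backend          # hypothetical storage client
        self.pending = queue.Queue(maxsize=max_pending)

    def record(self, event):
        # Defer: enqueue for a background flusher instead of writing inline.
        try:
            self.pending.put_nowait(event)
        except queue.Full:
            # Worst case: swallow the event rather than fail the request.
            log.warning("dropping analytics event")

    def flush(self):
        # Retry: a failed write goes back on the queue for the next flush,
        # so the store converges eventually while reads stay unaffected.
        while not self.pending.empty():
            event = self.pending.get_nowait()
            try:
                self.backend.write(event)
            except Exception:
                self.pending.put_nowait(event)
                break
```

The design choice is that `record` is the only call on the user-facing path, and it cannot raise; durability is traded for availability, which is exactly the "seem fully working while unable to write" effect.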


It is still a relatively complex multi-machine RO operation. It isn't like serving a static site.


Sure, but they can keep several copies of the index per datacenter and retry your query multiple times, possibly even in a different datacenter. New code and even updated indexes can be tried and then fall back to yesterday's version.
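That retry-with-fallback idea for a read-only query path can be sketched in a few lines. Everything here is a hypothetical stand-in (`replicas` as callable index copies, `stale_index` as yesterday's snapshot), not any real search API:

```python
def search(query, replicas, stale_index):
    """Try each index replica in turn (e.g. another copy in the same or a
    different datacenter); on total failure, serve results from a stale
    snapshot rather than returning an error to the user."""
    for replica in replicas:
        try:
            return replica(query)          # fresh results
        except Exception:
            continue                       # retry on the next copy
    return stale_index.get(query, [])      # degraded but still available
```

Because the operation is read-only, every retry is safe to repeat, which is what makes this kind of aggressive failover so much simpler than doing the same for writes.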


Google Search is still an RO system - you are mostly just retrieving information from a search index.


It is read-only, but its read systems are relatively complex and work at scale.


How do you know that?


I don't think GitHub's homepage has gone down at any point during this outage either.


Ummm, I guess the scale is similar: I am a single person vs. an organization; my Google-search knowledge vs. industry experts with years of experience.

My point was not about similar scale, though. How hard is it to keep a system up? AWS is a whole universe compared to GitHub, yet it doesn't go down as often as GitHub.


Only GitHub truly knows. But everyone here knows that since Microsoft acquired it, it has degraded to the point where it goes down every month.

It is so frequently down and unreliable that you might as well self-host at this point. You would likely have had better uptime than GitHub over the past three years since this prediction. [0]

[0] https://news.ycombinator.com/item?id=22867803


I find this strange too. GH seems to have more major incidents lately...


ChatGPT overloading it... (scanning repos)



