Hacker News moved off AWS again at 10:17 PM PST
145 points by Amfy on Aug 15, 2022 | 113 comments
previously AWS IP 50.112.136.166, now M5 Hosting IP 209.216.230.240

Glad to see it back on bare metal. Thanks HN team for all the hard work!

(Yes, I set up monitoring for it haha)



Still running FreeBSD? (Switching hosting likely didn't warrant changing OSes, just curious.)


Yes.


Wow nice, I had no idea


Why does HN run such a low resource site on a single bare metal server (plus identical backup) in the first place? 2x E5-2637 v4 is not that much CPU power these days.

An IaaS VM from the same vendor (M5 Hosting) would provide equivalent resources with much higher reliability. What benefit do two bare metal servers provide over a single properly sized VM in a managed resource pool with SAN-backed storage?

I don't think it saves any money, and when those bare metal servers suffered identical hardware failures from a firmware bug HN was down for most of a day and temporarily migrated to an IaaS provider anyway (AWS).


I wonder if HN's single threaded nature has anything to do with it. Eight cores or 16 threads don't do much good when your bespoke Arc Lisp stack is single core limited.


I kind of wonder if it's somehow not thread-safe, or not talking to a database; and if neither is the case, why they don't just run parallel copies.


I can't find an authoritative source, but I've read that HN uses text files stored in a filesystem rather than a database, so that might be why?


>much higher reliability

[citation needed]. Sure, it solves some reliability problems, but introduces others, and I’m not sure if it’s a net win.


Agree. Unless a post-mortem analysis of previous reliability "issues" determines that the benefit justifies the cost, it's not worth it. 99.99% uptime for zero incremental cost vs 99.9999% uptime and 100k USD.


OP does sound like a salesperson.


Don't fix it if it isn't broken


It was and has been broken. A few times.


And lessons have been learned. Still far better uptime than most websites with huge resources thrown at them.


A 10x more complex redundant (or "redundant") system often breaks faster (and definitely stays down longer) than a simple direct system.

Many people just don't consider failure scenarios. Offsite live database backups, for example, are a great idea. Say ... how does your site perform, in percent of normal QPS, when the database is now 150ms away instead of 1ns? 1% ... that's not redundancy, despite the site being up, let's just call that a failure.

And people forget one thing about hosting on AWS. Say ... when AWS is slow/has problems/blocked on firewall/down ... when your competitor is down, would you like your site to be up? How about vice versa?


> Offsite database backups, for example, are a great idea

“Off-site database backups” that mean you now have a 150ms round trip for user facing queries? … what on earth are you talking about.

Is your argument that “simple” things are better because “you can do stupid things and blame it on complexity”?


The database had a fallback that, according to good practice, was hosted with another provider in a different city (a different country actually, but this is Europe, so it wasn't that far in km; it was, however, >100ms away).

Because they had a really fast local database essentially all the time, every pageview started requiring more and more database queries, some 50 for the front page alone, as the developers added features.

Then the database needed to fail over. And the complexity hadn't actually killed it (yet): it actually worked ... But of course 50 * 2 * 150 ms = 15,000 ms, or 15 seconds per page.
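A back-of-envelope sketch of that arithmetic (a toy calculation, not the actual system; treating the factor of 2 as a full round trip per sequential query, and the "local" latency figure, are my assumptions):

    # Toy model: total DB wait per page when queries run one after another.
    # 50 queries and 150 ms come from the story above.

    def page_db_wait_ms(queries: int, one_way_latency_ms: float) -> float:
        return queries * 2 * one_way_latency_ms  # 2 legs = one round trip

    print(page_db_wait_ms(50, 0.05))  # local-ish DB:  5 ms per page
    print(page_db_wait_ms(50, 150))   # failover DB: 15000 ms = 15 s per page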

I'm saying simple things can be better even when they don't provide redundancy because there's a bunch of problems that increase complexity so much that you can literally fix a simple problem faster than redundancy can take care of it.


Your comment reads like a sales pitch for RDS. We have failover replicas in different geographically distributed datacenters. Failovers happen more or less instantly and the added latency (~0.5ms) is fine.

So for us this doesn't increase complexity, it greatly reduces it while increasing availability and general confidence, even though the underlying system (RDS/Aurora) is clearly very complex.

If you're running a tinpot site with a single developer on a couple of pet servers, then fair enough. But it's definitely not correct to say that a simple, direct system is the epitome of reliability. It's not.


I don't understand this. Actually I use SQLite linked into the site code itself. Very tough to beat on a whole host of metrics. Shared data is spread the same way configuration is.


I don't run a tinpot site, but I am the sole developer managing a couple of web servers and services for my company. Our db failover works much like you described: maybe 0.5ms of added latency, if that; regardless of physical location it's more like 0.1ms according to our metrics.

Every quarter we test it, and without fail, it has worked. So I added maybe a couple hours of research and two minutes of additional configuration during setup for reliably fast failover. Seems like a no-brainer to me.

When we moved to a managed instance with our cloud provider it took even less time to set up failover, maybe 30 seconds.

With the numerous options for cloud offerings these days I see no reason to not have a failover set up whether you are a massive corporation or a small business.


Would love to see the general user traffic patterns for HN



Related: The Best Time to Post on Hacker News (2019) [1]

[1] https://blog.rmotr.com/the-best-time-to-post-on-hacker-news-...


Yes, would be cool if they used something like Plausible (or fugu.lol, shameless plug, sorry) and included a public link to their stats in the footer.


Mind sharing how much traffic (TB or Mbit/s) this site does?


I’d bet it’s a lot less than you might think. Text compresses really well even with a simple method like deflate.
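For a rough sense of the ratio, here's a quick check with Python's zlib (which implements deflate); the sample markup is synthetic, not actual HN output:

    import zlib

    # Synthetic, repetitive HTML-ish text standing in for a comment page.
    sample = ("<tr class='athing'><td class='default'>"
              "some comment text goes here</td></tr>\n" * 500).encode()

    packed = zlib.compress(sample, 6)  # plain deflate, default-ish level
    print(f"{len(sample):,} bytes -> {len(packed):,} bytes "
          f"({len(packed) / len(sample):.1%} of original)")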


Without downtime too[0], nice!

[0]: https://hackernews.onlineornot.com/


Don't know if this is tongue in cheek or not, but I get messages saying they can't serve requests all the time. The service might not be entirely down, but it definitely isn't serving all traffic more often than the status page suggests.


It's a best effort approximation - this status page is automated via external uptime monitoring.

It won't catch a 1 in 10000 5xx error, but it'll catch the whole site being unavailable for several minutes.


There have been several occasions recently where it was in read-only mode.


I saw that it was in read only mode for a while. Couldn't have been more than a couple of minutes though.


Looking forward to ipv6!


It's on their list fortunately. Until then you can derive and use the IPv6 address for their old Cloudflare endpoint. I pinged them to update the backend IP for it after the outage, when they switched to a new server. Might need to do that again now.


> It's on their list fortunately.

Do you know why they have failed to implement IPv6 for years? I mean, it's a shame when a tech-oriented website/service can't keep up with new technologies.

> Until then you can derive and use the IPv6 address for their old Cloudflare endpoint.

Would be cool if you could share more information on how to do that.


> Do you know why they have failed to implement IPv6 for years?

Little motivation to make changes and their anti-spam systems only handle IPv4.

> Would be cool if you could share more information on how to do that.

Put the bytes of the IPv4 Cloudflare endpoint at the end of a Cloudflare IPv6 Anycast prefix and voila. Fastly is similar but you take the bytes containing the site ID in the IPv4 address.

Also seems they didn't forget to update Cloudflare this time around. :)

HN Cloudflare IPv4: 104.16.104.110

HN Cloudflare IPv6: 2606:4700::6810:686e
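If I've read the scheme right, the derivation is just packing the four IPv4 octets into the low 32 bits of the IPv6 prefix; a small sketch (the 2606:4700:: prefix is taken from the address above and may not hold for every Cloudflare zone):

    import ipaddress

    def cloudflare_v6_from_v4(v4: str, prefix: str = "2606:4700::") -> str:
        # Embed the IPv4 octets as the low 32 bits of the given IPv6 prefix.
        return str(ipaddress.IPv6Address(
            int(ipaddress.IPv6Address(prefix)) | int(ipaddress.IPv4Address(v4))
        ))

    print(cloudflare_v6_from_v4("104.16.104.110"))  # -> 2606:4700::6810:686e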


I've always assumed it has something to do with HN being written in its own programming language.


So it's written in Lisp, am I right?



Why is switching to ipv6 something to look forward to? What are the benefits?


Faster global routing due to simpler routing tables.


> What are the benefits?

Being able to access the site from my IPV6 only devices?


Until this comment I didn't even know they exist. If you don't mind, what type of device are they and why are they v6 only?


> what type of device are they

Almost all of my computers, including my phone, are behind wireguard on a globally routed /64 IPV6 virtual network.

It's a bit of a pain for some sites that do not offer a V6 addie via DNS, but it's extremely flexible and offers tons of other advantages.

Specifically, NAT is basically a thing of the past and any of my devices can talk to all of my other devices by establishing a simple TCP connection or shooting a UDP packet at them.

I can also access all of my devices from wherever I am connected to the internet, as long as the device has a globally routed V6 addie.


>as long as the device has a globally routed V6 addie

I was waiting for the catch, and I was not disappointed.


One badly behaving device and your whole network is a screaming blip on the radar that is easily tracked.

Any internet device of mine would immediately go into a quarantine subnet. It is a feature, not a bug.


No catch: my phone is on VPN and therefore has full V6 connectivity.


If you don’t mind, how did you set this up? I’d love to play around with this sort of thing but I’m a bit of a networking newbie.


This sounds like the 90s, when people didn't use routers.

Don’t you need a basic router/nat to protect your systems?


> Don’t you need a basic router/nat to protect your systems?

You are under the mistaken impression that your router / NAT protects you.

It doesn't. It may mitigate some of the most basic attacks, the ones that were cutting edge in the 90s.


It does, by not exposing ports that aren't meant to be exposed and only routing intended traffic down to the local network.


> Don’t you need a basic router/nat to protect your systems?

No, not really. It's no longer the 90s, so tcp/ip stacks aren't easily crashed. And it's no longer the 90s, so no services are listening by default, or if one is it's, say, openssh, which isn't easily crashed either (you may want to consider whether you want to accept passwords via ssh, though).

Additionally, decent OSes will rate limit responses to pings and SYNs and what not, so you won't be a good reflector out of the box.


You could have a VNC server on your LAN without a password and forget to limit the source IP.

And the way you put it, you need a "decent OS" to avoid flood attacks without tinkering, whichever OS that is.


> You could have a VNC server on your LAN without a password and forget to limit the source IP.

Sure, but that's not by default. You've got to take affirmative steps to enable that; although it's certainly easier to listen without limiting the source than to do it right.

> And the way you put it, you need a "decent OS" to avoid flood attacks without tinkering, whichever OS that is.

Yeah, I just don't know for sure what's decent. I have no problem putting FreeBSD out on the internet without a firewall, and I think Linux would be ok too; but I wouldn't put macOS out there if it's got any TCP listening ports, because it can be easily SYN flooded, and I'm not sure offhand if it has ICMP limits. If you put Windows on the internet and tell it it's a 'public' network, it'll run a firewall and you should probably be pretty ok (again, as long as you don't misconfigure applications).


How many humans browse the internet with IPv6-only devices?


Most mobile phones are v6 only and then go through increasingly unreliable CGNAT setups for v4


So they are not IPv6-only, if I understand correctly. Why would a CGNAT be unreliable?


Yet another over-subscribed device to rewrite packets between you and your destination. Spontaneous connection failures due to port exhaustion or overly aggressive connection timers/recycling. Lack of public to private connectability due to absence of port mapping, or any way to influence the configuration of your carrier’s device.


Unfortunately, AT&T doesn't give my phone a v6 address.


> How many humans browse the internet with IPv6-only devices?

If you browse the net on your phone, it's likely you already do, or sit behind some kludged-up NAT situation which - among other things - severely curtails your freedom to interact with other devices on the internet.

And things aren't going to improve in that regard given the shortage of V4 addies.


True, but those are not IPv6-only devices.


You might have missed the "or" word in my sentence.



Interesting, so one stubborn guy in the USA decided to sell broken internet access to their customers. I'm sure they are happy, if this story is true.


Easier to bypass bans.


Not really. By banning the whole /64 prefix, you get the same effect that you got from banning a single IPv4 address.
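A minimal sketch of that normalization (my own illustration, not HN's actual anti-abuse code; the /64 default follows the comment above and is configurable):

    import ipaddress

    def ban_key(remote_addr: str, v6_prefix_len: int = 64) -> str:
        # Ban the exact address for IPv4, the containing /64 (by default) for IPv6.
        ip = ipaddress.ip_address(remote_addr)
        if ip.version == 6:
            return str(ipaddress.ip_network(f"{remote_addr}/{v6_prefix_len}", strict=False))
        return str(ip)

    print(ban_key("2606:4700::6810:686e"))  # -> 2606:4700::/64
    print(ban_key("209.216.230.240"))       # -> 209.216.230.240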


You have to be careful about that, not everyone hands out /64. They can be as small as /128 and I've seen some providers give out as large as /48.


/48 is the recommendation for home users now.

My ISP (Aussie Broadband) follows that recommendation and provides me with a /48 that I can break into multiple /56s or /64s.


I'd prefer a much smaller range but being able to request as many as I want via DHCP (or equivalent mechanism). That way it wouldn't be contiguous so I wouldn't feel as much of a need to use a VPN for privacy. As it is, what's the point of handing me an entire /48 if I just end up forcing most of my traffic through a single IPv4 address with a VPN for most of my web browsing anyway?

Although to be fair even with non-contiguous address space I might still want a VPN since ISPs in the US are allowed to sell your browsing history.

Also if I'm hosting a public facing service at home I'm going to proxy it via wireguard through a VPS I rent for obvious security reasons. I don't actually want public facing services directly exposed from my home network and I have to question the sanity of anyone who says they do.

And I've always disabled webrtc for obvious privacy (ie network fingerprinting) reasons. What's so great about getting rid of NAT again?


Unless the person uses T-Mobile, which puts a lot of people on very small IP blocks, making bans a huge logistical nightmare to enforce. https://news.ycombinator.com/item?id=32038215


Why? Running HN can’t be that expensive, no?


Running on AWS is stupid if you have predictable traffic patterns and don’t need massive scale at the click of a button. It’s like renting an office in downtown Manhattan that you only use to receive mail.


This - serverless bills are astronomical for consistent traffic, even EC2 charges a premium over regular hosting.


"Premium" is a gross understatement. Depending on traffic pattern and whether app is traffic-heavy or CPU-heavy, it can be from 5x to over 100x more expensive.


Ironically this is exactly how we use AWS at work.

Some exec decided we must be modern and all cloud. So they closed our datacenters and moved everything to AWS as-is. Including the process to request new "servers" with 10 tabs of complex Excel forms. They just replaced the rack location tab with one about AWS regions lol. It's as static as it can be. We even have the old hardware lead time emulated in the cloud now, because the complex approval process takes up similar time.

When you take the agility and autoscaling out of the cloud, you just end up paying more. But I'm done telling them they're holding it wrong... Nobody wants to know.

I hope being modern was worth it. /s


>They just replaced the rack location tab with one about AWS regions lol

As a first swipe ... this is smart - stepping your way into change is often better than one swell foop

However, if you never revisit it / plan to improve and do things The New Way(tm), it's stupid

Of course, Microsoft, Google, Oracle, Amazon, etc aren't going to do you any favors and show you how to use cloud computing more effectively than dedicated hardware unless you ask them to: they're just providing a service; if you want to pay them $100,000 for something you could do for $1750...who are they to argue?


I agree it would be a good stepping stone.

But it doesn't seem like we are stepping anywhere. For example, if there was a clear vision of becoming a real cloud-based company, I would have at least made facilities to do new things The New Way rather than forcing everything to be the old way. Which is what they're doing. I needed a beefy server to run a big data crunch periodically, using only minimal software which I can easily autodeploy for every crunch, run it for a few hours, and kill it till the next time. Sounds like a great use case for the cloud, right? Kubernetes would be a bit heavy for this, but I could totally ansible it.

However there seems to be no way in our org to get a server I can simply spin up when I need it. It needs to be requested in advance, in triplicate, internal billing agreed, etc., and even if I turn it off we'll still be billed for it because I can't destroy it. Destroying and spinning up a new one would mean going through the entire rigmarole again :P

The funny thing is, we always do this. By the time we actually go full-on cloud, the world will already have moved on and be on the next great thing, and our neat cloud setup will be all deprecated tech.


I've had the [mis|good]fortune of working with some customers like your company - the best advice I can give you is to find a place that isn't so mired in bureaucratic idiocy :P


Why? It's always advisable to work with a customer who has more money than brains.


Work with a customer with more money than brains?

Sure

Work for a company with more money than brains?

Ideally not :)


The company you work for is your customer.


I had a terrible case of a client spending $6000 per month on a visibly innocent thing and refusing to move it to dedicated hosting because he could not believe it could bring any savings, since "clouds are cheap". Until I just did it at my own expense: with my retainer, a server cost me the equivalent of something like 1.5-2 hours of work per month.

By the way, it's only in the US. In Europe, people are a lot less zombified and susceptible to advertisement/brand image.


> Some exec decided we must be modern and all cloud.

And now this exec has this "success story" in his CV and either got a promotion or jumped ship to a better role at another company. It's a tried and true strategy for the ridiculous corporate rat race.


Consider the use case of the enterprise and larger scaled out services. Very few VPS providers will give you guarantees on isolated data centers in each region (availability zones) along with the huge list of compliance programs AWS is certified for. On top of that servers are only a small piece of the AWS universe. Application services (Storage, managed replicated DBs across AZs, managed APIs, streaming, IoT, CDN, identity management...) , and serverless containers and functions are mostly unavailable on small VPS providers.

As a solo developer, you don't encounter problems of scaling your engineering team and hiring devops engineers because you are doing everything yourself. The moment you have to hire expensive developers/devops all this starts making sense.

But even for solo/smaller accounts, many people are using AWS as a jump host with a programmable whitelist to reach their VPS boxes. Why? Because AWS security is likely the best in the world - they spend massive amounts of money on this, and many banks and even government services run on them.

(And even if you did use it only to scale servers, you don't need to click a button :) AWS autoscaling is finely programmable on all kinds of application metrics, even custom ones. Or by time of day: many teams routinely scale down their UAT and DEV clusters to zero at night and on weekends with an autoscaling rule, more than halving server costs.)
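As an illustration of the time-of-day case (a sketch only: the group name and schedule are made up, though PutScheduledUpdateGroupAction is the standard EC2 Auto Scaling API for scheduled actions):

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Hypothetical dev cluster: scale to zero at 19:00 UTC on weekdays.
    # A matching morning action (not shown) would scale it back up.
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="dev-web-asg",            # made-up name
        ScheduledActionName="dev-nightly-scale-down",  # made-up name
        Recurrence="0 19 * * 1-5",                     # cron expression, UTC
        MinSize=0,
        MaxSize=0,
        DesiredCapacity=0,
    )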


Anyone with somewhat predictable load patterns who’s using the cloud and expects to stay around for a few years should switch to “reserved instances” to reduce costs. It’d be a very poorly thought out implementation that doesn’t consider reserved instances.

Reserved instances may or may not be competitive with other providers. But they'd check the CTO's or CIO's "we're on the cloud" box and save some money.


Reserved instances are still incredibly expensive compared to bare metal AND require long term commits which bare metal typically doesn’t.


AWS LightSail is fairly cheap and reserved.


I agree with your point about AWS - but interestingly there's plenty of "prestige" businesses (finance, law) that rent desirable office space solely to get the address, and do most of their back office stuff elsewhere.


I disagree. I use my AWS EC2 instance as a VPS. For the money I get terrific customer support, access to and integration with tons of other useful services (RDS, S3), shared billing with my Glacier backups, and experience with a technology that employers value. I'm very happy, especially with the customer service.


If you use low bandwidth, EC2 is reasonably priced compared to cheaper alternatives such as Linode, DO, etc.

If you use a fair amount of bandwidth, AWS is outright a scam.

Of course bare metal is much cheaper.

In my previous job (gaming), colocation was literally 20x cheaper than using AWS EC2.


It just makes no sense to use AWS. By choosing sane providers, you sleep better knowing there aren't unexpected extra charges when egress traffic suddenly rises.

The only thing it's good for is backups, where traffic is almost entirely one way.


Good point. As a server mostly for myself (but also hosting a few low-bandwidth sites such as my personal site), bandwidth is minuscule. I should have clarified that.


Does a VPS really have that high of a customer service requirement for that to be the primary purchasing criterion?

Besides, there are plenty of VPS providers with excellent customer service.


It does when my credit card was cancelled, and AWS deferred payment for three months. I doubt that many other providers would have been so understanding and so generous.


Other providers solve this by offering payment options other than just cards.


Without a working credit card for three months, I would have been unable to begin the process of another payment option.

And it's not specifically the "I didn't have a card" problem that was solved. It was the far more general "I had an issue that required human intervention and sacrifice on behalf of the provider", which was resolved. I can think of zero other tech giants with that level of service.


Maybe HN doesn't need all of that? Where/how HN is hosted is the least interesting thing about the orange site IMO.


True, but then lots of tech nerds are on here who like to look behind the scenes


Many other providers offer the useful services you'd ever need for a smaller company (Linode for example also offers managed databases and object stores) at a fraction of the cost.


You probably also don't have the amount of egress traffic that HN does.


HN can't have that much egress traffic

Even if you have 100,000 simultaneous users egressing 10k/minute (which seems implausibly high, but usable for a quick approximation), you're only looking at 40 TB/mo

And that has to be off by at least an order of magnitude (if not two or three)

There's no reason, from a bandwidth perspective, you can't host HN on $100 worth of DO droplets and some Object storage/hosted db

If HN cost even $500/mo in hosting, I'd be quite surprised


terramex[0] posted[1] this link - https://news.ycombinator.com/item?id=28479595 wherein dang[2] links to https://news.ycombinator.com/item?id=16076041, itself a comment from sctb[3] saying they're running 4M requests a day from a Xeon E5-2637 running FreeBSD with mirrored SSD for the site, and mirrored spinny for logs

That post doesn't say how much memory they were running in 2018 (I'll guess 32G, which is probably about right for a dual-core CPU of that era[4]), but an as-close-to-similarly-specced physical box from Hetzner[5] (6 core, 64G RAM, mirrored 1TB NVMe, mirrored 2TB spinny) is currently € 64.26 per month (68+change in Germany vs Finland), with 20TB transfer per month if you add on the 10G uplink, or unlimited (within fair use policies) on the 1gbit connection

With 4M requests/day (let's even triple that to 12M/day now), at 10k per request (a ridiculously high guess, I'm sure), that's only 40-120GB/day tops in egressed data (or 1.2-3.6TB per month)
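Spelling that arithmetic out (same assumptions as above: 4M-12M requests/day at a deliberately generous 10 kB per request):

    # Back-of-envelope egress from the request figures above.
    bytes_per_request = 10_000  # deliberately generous

    for requests_per_day in (4_000_000, 12_000_000):
        gb_per_day = requests_per_day * bytes_per_request / 1e9
        tb_per_month = gb_per_day * 30 / 1000
        print(f"{requests_per_day:>12,} req/day -> "
              f"{gb_per_day:.0f} GB/day, ~{tb_per_month:.1f} TB/month")
    # -> roughly 40-120 GB/day, or about 1.2-3.6 TB/month: comfortably
    #    inside a 20 TB monthly allowance.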

HN's a pretty cheap thing to run, as far as hardware and hosting is concerned :)

------------

[0] https://news.ycombinator.com/user?id=terramex

[1] https://news.ycombinator.com/item?id=32466789

[2] https://news.ycombinator.com/user?id=dang

[3] https://news.ycombinator.com/user?id=sctb

[4] https://ark.intel.com/content/www/us/en/ark/products/64598/i...

[5] https://www.hetzner.com/dedicated-rootserver/ax51-nvme/confi...


Yes, I'm sure that for HN bandwidth is one of the most important factors. For myself it's almost an afterthought. And bandwidth is a critical fulcrum for cloud pricing comparisons.


What do you use customer support for? Do you have that much trouble?


This isn't really a debate, is it? You like it for reasons other than value. OK, but it's more expensive than other VPS providers.

Customer service: if you need to use customer service, something has gone wrong, and I literally move providers.


I shudder to think that there are people that would use the word cheap and RDS in the same sentence.


Or running your business at WeWork locations.

Enterprise public cloud like AWS really doesn't make financial sense for most large enterprises or small organisations. If you have enough scale there's no need to pay the premium. If you don't need the flexibility you don't need to pay the premium. Only those who are stuck in the middle or have use cases that need extreme 'elasticity' or 'agility' really benefit from using this model.


You can lease capacity for significant savings.


I think (hope) they take pride in hosting this off of any of the big clouds.


Would love to see a blog post on this topic from HN.


Nothing is wrong with hosting on bare metal. However, this site doesn't host images or videos, tracking for Internet ads, etc. Instead, it is a text-driven old-school site (I believe hosted on FreeBSD). Must be light on resources. It is super fast. I am sure they have some caching in place too. HN's business model is different from typical websites. They are part of YC, which is how it can sustain itself.


Could HN's value be measured in dollars per eighty-column line of text, and the sourceware-to-presentation stack likewise? I've seen numbers floated in the range of $400 to $1000 per line for a mixed-criticality sourceware platform such as the seL4 microkernel.


We should lobby HN admins to disclose 1) traffic patterns 2) hardware information and specs 3) database size or caching arch

Would be cool to know how this thing is run.


@dang posts it from time to time, I think this is most recent one: https://news.ycombinator.com/item?id=28479595

Last month HN had issues with SSDs suddenly dying after 40,000 hours and moved to AWS temporarily: https://news.ycombinator.com/item?id=32031639


they previously did: https://news.ycombinator.com/item?id=16076041

But I would also like to know how much traffic they do in terms of volume (TB and Mbit/s peak).



