
Complexity? I've never set up a highly available Postgres and Redis cluster on dedicated hardware, but I cannot imagine it's easier than doing it in AWS, which is only a few clicks, and I don't have to worry about OS upgrades and patches. Or a highly available load balancer with infinite scale.


This is how the cloud companies keep you hooked. I am not against them, of course, but the notion that no one can self-host in production because "it is too complex" is something we have been fed over the last 10-15 years. Deploying a production DB on a dedicated server is not that hard. The problem is that people now think that unless they use the cloud, they are amateurs. It is sad.


I agree that running servers on-prem does not need to be hard in general, but I disagree when it comes to production databases.

I've done on-prem highly available MySQL for years, and getting the whole master/slave thing to go just right during server upgrades was really challenging. On AWS, upgrading a MySQL server ("Aurora") really is just a few clicks. It can even do a blue/green deployment for you, where you temporarily get the whole setup replicated and in sync so you can verify that everything went OK before switching over. Disaster recovery (regular off-site backups & the ability to restore quickly) is also hard to get right if you have to do it yourself.
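
For what it's worth, the blue/green flow can also be scripted; a minimal sketch with boto3, where the ARN and target version are just placeholders:

    import boto3

    rds = boto3.client("rds")

    # Create a "green" staging copy of the existing cluster on the target
    # engine version (identifiers and version string are illustrative).
    bg = rds.create_blue_green_deployment(
        BlueGreenDeploymentName="aurora-upgrade-test",
        Source="arn:aws:rds:eu-west-1:123456789012:cluster:prod-mysql",
        TargetEngineVersion="8.0.mysql_aurora.3.05.2",
    )

    # Once the green environment is available and verified, switch traffic over.
    rds.switchover_blue_green_deployment(
        BlueGreenDeploymentIdentifier=bg["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"],
    )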


If you are running k8s on-prem, the "easy" way is to use a mature operator that takes care of all of that.

https://github.com/percona/percona-xtradb-cluster-operator, https://github.com/mariadb-operator/mariadb-operator, or CNPG for Postgres needs. They all work reasonably well and cover all the basics (HA, replication, backups, recovery, etc.).


It's really hard to do blue/green on prem with giant expensive database servers. Maybe if you're super big and you can amortize them over multiple teams, but most shops aren't and can't. The cloud is great.


Doing stuff on-prem or in a data centre _is_ hard though.

It's easy to look at a one-off deployment of a single server and remark on how much cheaper it is than RDS, and that's fine if that's all you need. But it completely skips past the reality of a real-life resilient database server deployment: handling upgrades, disk failures, backups, hot standbys, encryption key management, keeping deployment scripts up to date, hardware support contracts and vendor management, the disaster recovery testing for the multi-site SAN fabric with Fibre Channel switches and redundant dedicated fibre, etc. Before the cloud, we actually had a staff member who was entirely dedicated to managing the database servers.

Plus as a bonus, not ever having to get up at 2AM and drive down to a data centre because there was a power failure due to a generator not kicking in, and it turns out the data centre hadn't adequately planned for the amount of remote hands techs they'd need in that scenario...

RDS is expensive on paper, but to get the same level of guarantees either yourself or through another provider always seems to end up costing about the same as RDS.


I have done all of this too; today I outsource the DB server and do everything else myself, including a local read replica and pg_dump backups as a hail mary.
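
That hail-mary part is genuinely small; something roughly like this on a cron job (host, bucket, and paths are made up, and it assumes a .pgpass entry for the backup role):

    import datetime
    import subprocess

    import boto3

    # Nightly logical backup shipped off-site: not a substitute for proper
    # point-in-time recovery, just a last resort if everything else is gone.
    stamp = datetime.date.today().isoformat()
    dump_path = f"/var/backups/app-{stamp}.dump"

    subprocess.run(
        ["pg_dump", "--format=custom", "--file", dump_path,
         "--host", "db.internal", "--username", "backup", "appdb"],
        check=True,
    )

    boto3.client("s3").upload_file(dump_path, "example-db-backups", f"pg/{stamp}.dump")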

Essentially all that pain of yesteryear was storage: it was a F**ing nightmare running HA network storage before the days of SSDs. It was slower than RAID, 5x more expensive than RAID, and generally involved an extreme amount of pain and/or expense (usually both). But these days you only actually need a SAN (or, as we call it today, block storage) when you have data you care about, and likewise you only have to care about backups when you have data you care about.

For practically all of us, the side effect of moving away from monolithic 'pets' is that the app layer no longer requires any long-term state itself. So today all you need as app servers is N x any random thing that might lose data or fail at any moment, plus an external DB service (Neon, PlanetScale, RDS) and perhaps S3 for objects.


The database is one of those places where it's justified, I think. Application containers don't need the same level of care, hence they're easy to run yourself.


I guess that is the kicker right? "same level of guarantees".


I'd much rather deploy Cassandra, admittedly a complex but failure-resistant database, on internal hardware than on AWS. So much less hassle with forced restarts of retired instances, noisy non-performant networking and disk I/O, heavy neighbors, black-box throttling, etc.

But with Postgres, even with HA, you can't do geographic/multi-DC distribution of data nearly as well as something like Cassandra.


> I've never set up a highly available Postgres and Redis cluster on dedicated hardware, but I cannot imagine it's easier than doing it in AWS, which is only a few clicks

It's "only a few clicks" after you have spent a signficant amount of time learning AWS.


As a self-hosting fan, I can't even fathom how hard it would be to even get started running a Postgres or Redis cluster on AWS.

Like, where do I go? Do I search for Postgres? If so, where? Does the IP of my cluster change? If so, how do I make it static? Also, can non-AWS servers connect to it? No? Then how do I open up the firewall and allow it? And what happens if it uses too many resources? Does it shut down by itself? What if I wanna fine-tune a config parameter? Do I SSH into it? Can I edit it in the UI?

Meanwhile, in all that time spent finding out, I could SSH into a server, code and run a simple bash script to download, compile, and run. Then another script to replicate. And I can check the logs, change any config parameter, restart, etc. No black box to debug if shit hits the fan.


Having lived in both worlds, there are services wherein, yeah, host it yourself. But having done DB on-prem/on-metal, dedicated hosting, and cloud, databases are the one thing I'm happy to overpay for.

The things you describe involve a small learning curve, each different for each cloud environment, but then you never have to think about it again. You don't have to worry about downtime (if you set it up right), running a bash script ... literally nothing else has to be done.

Am I overpaying for Postgres compared to the alternatives? Hell yeah. Has it paid off? 100%, would never want to go back.


> Do I search for Postgres?

Yes. In your AWS console, right after logging in. And pretty much all of your other setup and config questions are answered by just filling out the web form right there. No SSHing to change parameters; they are all available right there.
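
For illustration, the same form fields look roughly like this if you script them with boto3 (all identifiers, sizes, and credentials are placeholders):

    import boto3

    rds = boto3.client("rds")

    # Equivalent of the console form: engine, instance size, storage, HA and
    # backups are all chosen up front rather than tuned over SSH later.
    rds.create_db_instance(
        DBInstanceIdentifier="app-postgres",
        Engine="postgres",
        DBInstanceClass="db.m7g.large",
        AllocatedStorage=100,
        MasterUsername="appadmin",
        MasterUserPassword="change-me-now",   # or let AWS manage it in Secrets Manager
        MultiAZ=True,                         # standby in another AZ, automatic failover
        BackupRetentionPeriod=7,              # daily snapshots kept for a week
        PubliclyAccessible=False,
    )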

> And what happens if it uses too many resources?

It can't. You've chosen how many resources (CPU/memory/disk) to give it. Runaway cloud costs come from bill-by-usage stuff like Redshift, S3, Lambda, etc.

I'm a strong advocate for self (for some value of self) hosting over cloud, but you're making cloud out to be far more difficult than it is.


Actually... for Postgres specifically, it takes less than 5 minutes in AWS, and you get replication, disaster recovery, and basic monitoring all included.

I hated having to deal with PostgreSQL on bare metal.

To answer your questions, in case someone else asks these as well and wants answers:

> Does the IP of my cluster change? If so, how do I make it static?

Use the DNS entry that AWS gives you as the "endpoint", done. I think you can pin a stable Elastic IP to RDS as well if you wish to expose your RDS DB to the Internet, although I really have no idea why one would want that, given the potential security issues.
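
If you ever need that hostname programmatically rather than copying it from the console, it's one call (the instance name is a placeholder):

    import boto3

    rds = boto3.client("rds")

    # The endpoint hostname stays stable across failovers; point the app at
    # it rather than at any IP address.
    instance = rds.describe_db_instances(DBInstanceIdentifier="app-postgres")["DBInstances"][0]
    print(instance["Endpoint"]["Address"], instance["Endpoint"]["Port"])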

> Also, can non-AWS servers connect to it? No?

You can expose it to the Internet in the creation web UI. I think the default the assistant uses is to open it to 0.0.0.0/0, but the last time I did that was many years ago, so I hope AWS asks you what you want these days.

> Then how do I open up the firewall and allow it?

If the above does not, create a Security Group, assign the RDS server to that Security Group, and create an ingress rule that either allows only specific CIDRs or a blanket 0.0.0.0/0.
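
Scripted, that ingress rule looks something like this (the group ID and CIDR are placeholders; a narrow range is much saner than 0.0.0.0/0):

    import boto3

    ec2 = boto3.client("ec2")

    # Allow Postgres traffic from a specific office/VPN range into the
    # security group the RDS instance is attached to.
    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 5432,
            "ToPort": 5432,
            "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "office VPN"}],
        }],
    )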

> And what happens if it uses too many resources? Does it shut down by itself?

It just gets dog slow if your I/O quota is exhausted, and it goes into an error state when the disk is full. Expand your disk quota and the RDS database becomes accessible again.
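
Growing the disk when that happens is likewise a single call (identifier and size are placeholders):

    import boto3

    rds = boto3.client("rds")

    # Bump the allocated storage; the instance stays available while the
    # underlying volume is resized.
    rds.modify_db_instance(
        DBInstanceIdentifier="app-postgres",
        AllocatedStorage=200,   # new size in GiB
        ApplyImmediately=True,
    )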

> What if I wanna fine-tune a config parameter? Do I SSH into it? Can I edit it in the UI?

No SSH at all, not even for manually unfucking something; for that you need the assistance of AWS support - but in about six years I never had a database FUBAR itself.

As for config parameters, there's a UI for this called "parameter/option groups"; you can set almost all config parameters there, and you can use these as templates for other servers you need as well.
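
A rough sketch of that via the API, for anyone who prefers it to the console (group names and values are made up):

    import boto3

    rds = boto3.client("rds")

    # A parameter group is a reusable template of Postgres settings.
    rds.create_db_parameter_group(
        DBParameterGroupName="app-pg16",
        DBParameterGroupFamily="postgres16",
        Description="Shared tuning for our Postgres instances",
    )

    # Dynamic parameters can apply immediately; static ones need "pending-reboot".
    rds.modify_db_parameter_group(
        DBParameterGroupName="app-pg16",
        Parameters=[
            {"ParameterName": "log_min_duration_statement",
             "ParameterValue": "500", "ApplyMethod": "immediate"},
            {"ParameterName": "max_connections",
             "ParameterValue": "400", "ApplyMethod": "pending-reboot"},
        ],
    )

    # Attach it to an instance (a reboot is needed before it fully applies).
    rds.modify_db_instance(
        DBInstanceIdentifier="app-postgres",
        DBParameterGroupName="app-pg16",
    )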


This smells like “Dropbox is just rsync”. No skin in the game; I think there are pros and cons to each, but a Postgres cluster can be as easy as a couple of clicks or an entry in a provisioning script. I don’t believe you would be able to architect the same setup with a simple SSH into a single server and a simple bash script. Unless you already wrote a bash script that magically provisions the cluster across various machines.


> As a self-hosting fan, I can't even fathom how hard it would be to even get started running a Postgres or Redis cluster on AWS. Like, where do I go? Do I search for Postgres? If so, where?

Anything you don't know how to do - or haven't even searched for - either sounds incredibly complex, or incredibly simple.


It is not as simple as you describe to set up HA multi-region Postgres.

If you don't care about HA, then sure, everything becomes easy! Until you have a disaster to recover from and realize that maybe you do care about HA. Or until you have an enterprise customer or compliance requirement that needs to understand your DR and continuity plans.

Yugabyte is the closest I've seen to achieving that simplicity with self-hosted multi-region and HA Postgres, and it is still quite a bit more involved than the steps you describe and definitely more work than paying for their AWS service. (I mention it instead of Aurora because there's no self-host process to compare directly there, as it's proprietary.)


Did you try ChatGPT for step-by-step directions for an EC2-deployed database? It would be a great litmus test to see if it does proper security and lockdown in the process, and what options it suggests aside from the AWS-managed stuff.

It would be so useful to have an EC2/S3/etc.-compatible API that maps to a homelab. Again, something that Claude should allegedly be able to vibecode given the breadth of documentation, examples, and discussions on the AWS API.


Your comment seems much more in the vein of "I already learned how to do it this way, and I would have to learn something to do it the other way".

Which is of course true, but it is true for all things. Provisioning a cluster in AWS takes a bit of research and learning, but so did learning how to set it up locally. I think most people who know how to do both will agree it is simpler to learn the AWS version than to learn how to self-host.


A fun one in the cloud is "when I upgrade to a new version of Postgres, how long is the downtime and what happens to my indexes?"


For AWS RDS, no big deal. Bare metal or Docker? Oh now THAT is a world of pain.

Seriously, I despise PostgreSQL in particular for how fucking annoying it is to upgrade.


Yep. I know folks running their own clusters on AWS EC2 instead of RDS. They're still 3 or 4 versions back because upgrading Postgres is a PITA.


If you can self host postgres, you'll find "managing" RDS to be a walk in the park.


If you are talking about RDS and ElastiCache, it’s definitely NOT a few clicks if you want it secure and production-ready, according to AWS itself in their docs and training.

And before someone says Lightsail: it is not meant for high availability/infinite scale.


> I've never set up a highly available Postgres and Redis cluster on dedicated hardware, but I cannot imagine it's easier than doing it in AWS, which is only a few clicks, and I don't have to worry about OS upgrades and patches

Last I checked, Stack Overflow and all of the Stack Exchange sites are hosted on a single server. The people who actually need to handle more traffic than that are in the 0.1% category, so I question your implicit assumption that you actually need a Postgres and Redis cluster, or that this represents any kind of typical need.


SO was hosted on a single rack last I checked, not a single box. At the time they had an MS SQL cluster.

Also, databases can easily see a ton of internal traffic. Think internal logistics/operations/analytics. Even a medium size company can have a huge amount of data, such as tracking every item purchased and sold for a retail chain.


They use multiple servers for redundancy, but they are using only 5-10% capacity per [1], so they say they could run on a single server given these numbers. Seems like they've since moved to the cloud though [2].

[1] https://www.datacenterdynamics.com/en/news/stack-overflow-st...

[2] https://stackoverflow.blog/2025/08/28/moving-the-public-stac...


If you don’t find AWS complicated you really haven’t used AWS.


If you were personally paying the bill, you'd probably choose to self-host on cost alone. Deploying a DB with HA and off-site backups is not hard at all.


I have done many Postgres deploys on bare metal. The IOPS and storage space saved (ZFS compression, because Postgres's own compression is meh) are huge. I have regularly used hosted DBs, but largely for toy DBs in the GBs, not TBs.

Anyway, it is not hard, and controlling upgrades saves so much time. Having a client's DB force-upgraded when there is no budget for it sucks.

Anyway, I encourage you to learn/try it when you have the opportunity.


> I've never set up a highly available Postgres and Redis cluster on dedicated hardware, but I cannot imagine it's easier than doing it in AWS, which is only a few clicks, and I don't have to worry about OS upgrades and patches

I have never set up AWS Postgres and Redis, and I know it's more than a few clicks. There is simply basic information that you need to link between services, and it does not matter if it's cloud or hardware: you still need to do the same steps, be it from the CLI or a web interface.

And frankly, these days with LLMs, there's no excuse anymore. You can literally ask an LLM to do the steps, explain them to you, and you're off to the races.

> I don't have to worry about OS upgrades and patches

Single command and reboot...

> Or a highly available load balancer with infinite scale.

Unless you're Google, overrated ...

You can literally rent a load balancer from places like Hetzner for 10 bucks, and if you're old-fashioned, you can even do DNS balancing.

Or you simply rent a server with 10x the performance of what Amazon gives you (for the same price or less), and you do not need a load balancer. I mean, for 200 bucks you rent a 48-core/96-thread server at Hetzner... Who needs a load balancer again... You will do millions of requests on a single machine.


For anything "serious", you'll want a load balancer for high availability, even if there's no performance need. What happens when your large server needs an OS upgrade or the power supply melts down?


Well, you can have managed resources on-premises.

It costs people and automation.



