Last week an issue came up where we have to upgrade from Postgres 12 to 16 because of EOL concerns. I said to another engineer, “Postgres is pretty boring so I bet it’ll be quite easy/low risk to do the upgrade.”
We got into a meta discussion about how we still have to approach it methodically, but the nice thing about boring, mature software is having a good gut feel that it probably won’t dip into the risk budget much at all.
Speaking of EOL-driven upgrades, the absurdly short "LTS" that most tech stacks have makes them decidedly not boring. Even Microsoft has fallen for this absurdity with 36-month "long" term support intervals for .NET versions. Are there any language stacks with an LTS that is at least five years (ideally much more) that aren't spelled in all caps?
Code is a liability and dependencies/vendors/libraries are liabilities I have even less control over. I hate when an internal release has to be "about" upgrading dependencies. I get it, I do. But I hate the idea of "we're going to focus on a thing that doesn't actually yield any tangible advancement of our goals."
Not that I'm unappreciative of what projects like Postgres, Django, etc. have given me. Like a good physician, I appreciate you and everything you do, but we'd both be happier seeing each other as rarely as possible.
I say that sometimes to people and they look at me weird. When you work in bigger projects with a lot of people it is harder to argue why you shouldn't write the code than to just do it.
Like, for example, I had to code a build step that updated some assets and took about 5 seconds to run. That operation was done maybe once a month by other developers. During review another person asked why I didn't parallelize the process and cache already processed files, and I was just like: it would add 200+ lines of extra code and error handling. It's not that I mind doing it, I just don't think it is worth the overhead of understanding this code and troubleshooting any possible bugs in the extra optimization code.
And it is harder to argue this kind of thing back and forth than it is to just do it. And now there are 200 extra lines of code that would take anyone else besides me at least an hour to grasp before they can make changes.
The same applies to discussing why you shouldn't add a dependency. If anything that is harder, because you need to justify the extra time spent not using the dependency.
Which is one of the most unappealing things about the marketing material for LLM services like Copilot. The issue was never the speed of writing code. More often than not, you're contemplating whether you should write it at all. And if you need to, how much of it you should write and how to make the eventual rewrite easy.
If you're experienced enough, you either know the rough way to code a task or realize that you need to take time to investigate the problem space. I don't think I ever ask myself what I should do to write more code with less effort. All the improvements I've made were to target precisely the thing I wanted to edit.
Java. It isn't all caps, and it has three years of support plus 2 of extended support in the case of Oracle. Other JVM vendors offer even longer times, e.g. Azul: https://www.azul.com/products/azul-support-roadmap/
Also most compilers for ISO languages; some of them have all-caps names, others don't.
Postgres upgrades were actually annoying the last time I did one. I had to explicitly import data from the previous version into the new one, instead of the software just automatically detecting that the data was a version behind and doing whatever was needed to upgrade the format.
This would probably not have been as big of a headache if it hadn't been running in a container and deployed as part of a separate project, meaning CloudNativePG (which probably handles this for you) was not an option.
If you're not using any weird extensions, you'll be fine. If your database is large with indexes and you're not doing it via pg_dump/pg_restore, make sure to run a reindex, as PG13 introduced index deduplication. That saved us terabytes (though cleaning index bloat probably played a role, too).
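For anyone wanting to script that reindex step, here is a minimal sketch rather than a recommended procedure. It assumes the stock `reindexdb` client is on your PATH and uses a made-up database name, `appdb`. It only builds and prints the command, so nothing touches a server until you run the output yourself.

```python
import shlex


def reindex_command(dbname: str) -> list[str]:
    """Build a reindexdb invocation for one database.

    --concurrently asks Postgres to rebuild indexes without blocking
    writes for the whole rebuild. "dbname" is whatever database you
    upgraded; nothing here is specific to the thread above.
    """
    return ["reindexdb", "--concurrently", f"--dbname={dbname}"]


if __name__ == "__main__":
    # "appdb" is a hypothetical database name used for illustration.
    print(shlex.join(reindex_command("appdb")))
```

Rebuilding concurrently is slower and needs extra disk space while the old and new index coexist, so on a big database it is worth trying on a copy first.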
Major version OSS database upgrades? I would probably never consider this an easy/low risk thing. Anytime you have planner changes, it's something where having a baseline is really important to understand the impact on your workload.
It's doubly bad with postgres because the statistics get wiped after running pg_upgrade. They do tell you to run ANALYZE afterwards but that's yet more downtime.
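One way to soften that, sketched under assumptions: `vacuumdb` has an `--analyze-in-stages` option that first produces rough statistics quickly and then refines them in later passes, so the planner has something usable sooner. The helper below just builds that command line. The function name is made up for illustration, and it assumes `vacuumdb` is on your PATH with connection settings coming from the environment.

```python
import shlex


def analyze_in_stages_command() -> list[str]:
    """Build a vacuumdb invocation that re-gathers planner statistics.

    --all covers every database in the cluster; --analyze-in-stages
    runs ANALYZE in several passes, cheapest first, without vacuuming.
    """
    return ["vacuumdb", "--all", "--analyze-in-stages"]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(analyze_in_stages_command()))
```

Whether the first coarse pass is good enough to reopen traffic depends on your workload, so treat it as something to measure, not a guarantee.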
That's not great, do the bucket counts change or something between versions? It seems like statistics would be a thing that ... should not change while you are not looking!