The "runtime" word is very confusing in this case.
It's a web application server written in Rust (Axum/Tokio) that hooks into Lua bindings (mlua) so that application behavior can be implemented in Lua.
Looks neat. I built a programmable server for Server-Sent Events using a similar stack (Rust/mlua/axum) [1]. I think the Rust/Lua interop story is pretty good.
> I am surprised more projects don’t provide a Lua scripting layer.
Completely agree. I've been adding Lua scripting support to pretty much everything I make now. Most recently my programmable SSE server [1]. It's extended the functionality far beyond anything that I would have had the time and patience to do myself. Lua is such a treat.
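For anyone who hasn't tried it, the embedding surface really is tiny. A minimal sketch below, using Python's lupa bindings purely for illustration (the projects above use Rust/mlua, and the `on_request` hook name is made up):

```python
# Minimal sketch of embedding Lua in a host program. The projects above use
# Rust/mlua; this uses Python's lupa bindings purely for illustration, and
# the on_request hook name is made up.
from lupa import LuaRuntime

lua = LuaRuntime(unpack_returned_tuples=True)

# Application behavior lives in a Lua script the host loads at startup.
lua.execute("""
function on_request(path)
    if path == "/hello" then
        return "hello from Lua"
    end
    return "not found"
end
""")

on_request = lua.globals().on_request
print(on_request("/hello"))  # -> hello from Lua
```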
I've been using Lua professionally and personally for some decades now, and I don't think there's any need to be surprised that more projects don't provide a Lua scripting layer, because actually, Lua is everywhere. And the scripting aspect is one thing - but it's not the only reason to use Lua.
Lua is so easily embeddable and so productive as a development tool that it's not always necessary to expose this layer to the user.
Sure, engines and things provide scripting, and that is a need/want of the respective markets. Who doesn't love redis, et al.?
But Lua can be put in places where the scripting aspect is merely a development method, and in the end, in the shipped product, things are tightly bound to bytecode, with Lua serving as an application framework beyond scripting.
The point being, Lua is such a treat that it is, literally, in places you might least expect - or at least, you'll find it under a lot of covers.
And after all, that's the beauty of it: as an embedded language/VM/function interface, Lua can sure scale to different territories.
Github was a "social network" from its very beginning. The whole premise was geared around git hosting and "social coding". I don't think it became enshittified later since that was the entire value proposition from day 1.
There are tons of places you can use for simple git hosting. The only reason to use github over the others is due to the social factors. Because everyone already has an account on it so they can easily file issues, PRs, etc. For simple git hosting, github leaves a lot to be desired.
While it's likely an easy process to drop in valkey, creating the new instances, migrating apps to those new instances, and making sure there's not some hidden regression (even though it's "drop in") all takes time.
At a minimum, 1 or 2 hours per app optimistically.
My company has hundreds of apps (hurray microservices). That's where "hundreds of hours" seems pretty reasonable to me.
We don't have a lot of redis use in the company, but if we did it'd have taken a bit of time to switch over.
Edit: The comment was dead before I could respond, but I figured it was worthwhile to reply anyway.
> It's literally just redis with a different name, what is there to test?
I've seen this happen quite a bit in open source, where an "x, just renamed to y" also happens to include tiny changes that actually conflict with the way we use it. For example, maybe some API doesn't guarantee order but our app (in a silly manner) relied on the order anyway. A bug on us, for sure, but not something that would surface until an update of redis or this switchover.
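A contrived sketch of the kind of accidental order dependence I mean (redis-py, with made-up key and member names):

```python
# Contrived sketch of accidental order dependence (redis-py; key and member
# names are made up). SMEMBERS returns set members in no guaranteed order,
# so indexing into the result is a latent bug that a server swap or upgrade
# can surface.
import redis

r = redis.Redis()
r.sadd("deploy:regions", "us-east", "eu-west", "ap-south")

regions = list(r.smembers("deploy:regions"))
primary = regions[0]  # "works" until the server returns members in a different order
```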
It can also be the case that we were relying on an older version of redis, and the switchover to valkey means we now bring in newer redis changes that we haven't tested.
These things certainly are unlikely (which is why I estimated 1 or 2 hours; it'd take more if these problems were more common). Yet they do happen, and have happened to me with other dependency updates.
At a minimum, simply making sure someone didn't fat finger the new valkey addresses or mess up the terraform for the deployment will take time to test and verify.
There are definitely pros and cons to this approach.
The pro being that every service is an island that can be independently updated and managed when need be. Scaling is also somewhat nicer, as you only need to scale the services under heavy load rather than larger services.
It also makes for better separation of systems. The foo system gets a foo database and when both are under load we only have to discuss increasing hardware for the foo system/database and not the everything database.
The cons are that it's more complex and consistency is nearly impossible (though we are mostly consistent). It also means that if we need a system-wide replacement of a service like redis, we have to visit our 100+ services to see who depends on it. That rarely comes up, as most companies don't do what redis did.
Indeed. Microservice zealot 'architects' love to ignore the work that has to go into each microservice and the overhead of collaboration between services. They'll spend a couple of years pretending to work on that problem without making any meaningful progress, then move on to a different company to cause similar chaos.
that’s a loaded statement without understanding the company.
Sometimes large companies acquire smaller companies and keep the lights on in the old system for a while. They may not even use a similar tech stack!
Sometimes they want to cleanly incorporate the acquired systems into the existing architecture but that might take years of development time.
Sometimes having a distributed system can be beneficial. Pros / cons.
and sometimes it’s just a big company with many people working in parallel, where it’s hard to deploy everything as a single web app in a single pipeline.
My understanding is that Valkey was forked directly from redis. So assuming you migrate at the fork's point in time, it literally is the same code.
Yes, but not the same infrastructure and configuration and documentation. Any reasonable operation will also do validation and assurance. That adds up if you have a sizable operation. "Hundreds of hours" is also not some enormous scale for operations that, say, have lots of employees, lots of data, and lots of instances.
The part you are thinking of is not the time consuming part.
At the very least you have to validate everything that touches redis, which means finding everything that touches redis. Internal tools and docs need to be updated.
And who knows if someone adopted post-fork features?
If this is a production system that supports the core business, hundreds of hours seems pretty reasonable. For a small operation that can afford to YOLO it, sure, it should be pretty easy.
The negative effect is that you have to bring the lawyers back in, and they tend to take an extremely conservative position on what might be litigated as offering “Redis-as-a-service”.
Many legal departments cannot afford nuance when a newsworthy license change occurs. The kneejerk reaction is to switch away to mitigate any business risk.
> I probably would have gone for turning the UaF into an type confusion style attack
I'm aware that Linux is over 30 years old at this point, and C is decades older still. But it is mind-boggling to me that we're still talking about UAFs and jumping from dangling pointers to privileged execution in the 21st century.
That was obvious to C.A.R. Hoare in 1980, and should have been obvious to the industry after the Morris worm in 1988, yet here we are: zero improvements to the ISO C standard with regard to preventing exploits in C code.
Multics (written in PL/I) didn't suffer from buffer overflows. Ada was (and is) memory safe. Pascal had (and still has) range checks and bounded strings.
But we do have -fbounds-safety in clang (at least on macOS).
Nonsense, the C guy told me those happen to people that make mistakes, and that he, being the offspring of the Elemental of Logic, and "Hyperspace cybernetic intelligence and juvenile delinquent John Carmack" is completely immune to such pathetic issues. He works at Linux. Yes, all of Linux.
Yes, we need a language that makes it impossible. But how could this happen in Rust? https://github.com/advisories/GHSA-5gmm-6m36-r7jh
But clearly, this does not count because it used "unsafe", so it remains the fault of the programmer who made a mistake. Oh wait, doesn't this invalidate your argument?
Not really. If you can't be perfect, at least be good.
Unless you want to make the argument that searching all of your code base for UB is better than only having to search a relatively small subset of it (20-30%).
Or that Linux should use tracing GC + value types. Which I would find decent as well. But sadly LKML would probably disagree with inclusion of C# (or Java 35).
Does the world really move on? I think HN could be misleading here. And is moving to ever more complex solutions the right answer to maintenance problems? I doubt it.
If you really want, you could write your code in Agda or similar and prove it correct. See also seL4 https://github.com/seL4/seL4 which is a proven correct kernel.
Wow, the Linux kernel must be full of dunces that rather than writing in proven correct C, added another language that other maintainers hate. What morons! /sarcasm
Disclaimer: the above is sarcasm. While I don't think Linux kernel maintainers are perfect, I don't consider them dunces either. If they can't avoid writing UAF (use-after-free) or DF (double-free) bugs, then what hope does the average C programmer have?
I'm not sure what you're even trying to say. Do you think that formally verified C isn't a thing? Because it's not used in Linux?
Linux isn't designed to be formally verifiable. For all of the manpower and resources dedicated to it, it's still beholden to the technical debt of being a 30-year-old hobbyist experiment at its core. If you want a formally verified, secure kernel that wasn't written by a college student to see if he could, seL4 exists.
> Do you think that formally verified C isn't a thing? Because it's not used in Linux?
No. It exists but is it practical to be used in an OS? From what I gathered from HN, sel4 is very limited in scope and extending that to a semi-usable OS (that uses ""proven"" C) is expensive time and money wise.
> still beholden to the technical debt of being a 30-year-old hobbyist experiment
Citation needed. I admit it has limitations but it's not like the biggest corps in the world aren't working on it.
> sel4 is very limited in scope and extending that to a semi-usable OS (that uses ""proven"" C) is expensive time and money wise.
Everything that has to meet a high bar for safety is like this, yes. It's the exact same reason unsafe exists in Rust and is completely inescapable: automatic correctness checking beyond the most trivial static analysis doesn't actually come for free, and it imposes intense design restrictions that make a lot of things very time-consuming to do.
You will never find a single point in the multiple archives of Linux's git history where a clean break point occurred and the kernel was greenfielded with the express intent of replacing its design abstractions with ones more appropriate and ergonomic for formal analysis. If you'd like to express further skepticism, feel free to try and find that moment which never happened.
Participation of large, well recognized corporations does not imply the kernel isn't a mess of technical debt. It actually implies worse.
While the memory safety issues are a concern, switching to Rust + Unsafe will only reduce but not eliminate those issues and it is unclear whether adding a lot of complexity is actually worth it compared to other efforts to improve memory safety in C.
> switching to Rust + Unsafe will only reduce but not eliminate those issues
Except in practice the code written in Rust experiences no safety issues[1].
I've seen this argument a million times before; it is one part FUD (making actual memory issues out to be bigger than they are) and one part Nirvana fallacy (arguments implying that even Java isn't memory safe because it has to call into C).
> is actually worth it compared to other efforts to improve memory safety in C
As I alluded to before, I am sure the Linux kernel maintainers are aware of seL4. Which raises the question: why didn't they do it? It's in C, it proves everything, and it solves even more than Rust does, so why isn't it in the kernel?
I'm going to hazard a guess: the effort of going full proof for something like Linux would be through the roof. You would need a lifetime of LKML dedication.
Software written in Rust also experiences safety issues. If you believe otherwise, you are delusional. In the end, if you want extremely safe software there is no other way than formal verification. (and then there can still hardware be issues or bugs in the specification)
The world is not black-and-white. One can have more or less issues.
And the same goes for improving the situation. Stopping the world and rewriting everything is never going to fly. The only viable approach is gradual introduction of safer techniques, be it a language or verification options, that step by step improve things and make the bar of adoption minimal. Nobody is going to get convinced if this is cast as a culture war. The folks annoyed by the status quo need to bring the skeptics along, everything else is doomed to fail (as well-illustrated by all the discussions here).
> Software written in Rust also experiences safety issues.
Yes. And?
Seatbelts and air bags all experience safety issues, and the general public was against them for a very long time. It doesn't mean they don't increase safety. Or that because you could fall into a volcano, you should abolish seatbelts.
How about, for a start, software with no memory errors? How about more XSS and fewer UAFs?
Also, not even formal proofs are enough. A smart enough attacker will use the gap between spec and implementation and target those errors.
And while I agree we need more formal proofs and invariant oriented software design, those options are even less popular than Haskell.
> A smart enough attacker will use the gap between spec and implementation and target those errors.
"Formal verification" literally means that the implementation is proven to satisfy the formal specification, so by definition there won't be any gaps. That's the whole point of formal verification.
But you have a point in that there will still be gaps -- between specification and intent. That is, bugs in the specification. (Or maybe you mean "specification" to be some kind of informal description? That's closer to intent than an actual specification, which in the formal verification world is a formal and rigorous description of the required behavior.) Though those bugs would likely be significantly less prevalent... at least that's the expectation, since nobody has really been able to do this in the real world.
> "Formal verification" literally means that the implementation is proven to satisfy the formal specification, so by definition there won't be any gaps.
To quote Donald Knuth:
"Beware of bugs in the above code; I have only proved it correct, not tried it."
Sure, but the proof itself might have gaps. A famous proven implementation of mergesort had a subtle flaw when the array size approached the maximum of the index type. They proved it would sort; they didn't prove it wouldn't experience UB. Or maybe you prove it has no UB if there are no hardware errors (there are hardware errors), etc.
The unbeatable quantum cryptography was beaten in one instance because the attacker abused the properties of the detectors to get the signal without alerting either side of the conversation.
> Or maybe you mean "specification" to be some kind of informal description?
I read somewhere that the secret to the hacker mindset is that they ignore what the model is and look at what is actually there. If they behaved as the spec writers expected, they would never have been able to break in.
Reminds me of a place I worked. The doors were this heavy mass of wood and metal that would unlock with a loud thud (from several huge rods holding it secure). This was done to formally pass some ISO declaration for document security. The thing is any sane robber would just easily bypass the door by breaking the plaster walls on either side.
At least we can stop writing code made of plaster for start.
Then it's an incorrect proof. In the pen-and-paper era, that was a thing, but nowadays proofs are machine-checked and don't have "gaps" anymore.
Or your assumptions have a bug. That's exactly what I mean with specifications being potentially flawed, since all assumptions must be baked into the formal specification. If hardware can fail then that must be specified, otherwise it's assumed to not fail ever. This never ends, and that's the whole point here.
I suppose we are trying to say the same thing, just not with the same terminology.
The latest version of C is ISO C23. I also don't believe that rewriting in Rust is the best way to address memory safety in legacy C code, or even in new projects, nor do I think that memory safety is the most pressing security problem.
I work with developers in SCA/SBOM, and there are countless devs that seem to work by #include 'everything'. You see crap where they include a misspelled package name and then fix it by including the right package without removing the wrong one!
The lack of dependency awareness drives me insane. Someone imports a single method from the wrong package, which snowballs into the blind leading the blind and pinning transitive dependencies in order to deliver quick "fixes" for things we don't even use or need, which ultimately becomes 100 different kinds of nightmare that stifle any hope of agility.
In a code review a couple of years ago, I had to say "no" to a dev casually including pandas (and in turn numpy) for a one-liner convenience function in a Django web app that has no involvement with any number crunching whatsoever.
Coincidentally, Copilot has been incredibly liberal lately with its suggestions of including Pandas or Numpy in a tiny non-AI Flask app, even for simple things. I expect things to get worse.
There's a ton you can do with sqlite, which is in the Python standard library. You just have to think about it and write some SQL instead of having a nice Pythonic interface.
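For example, a quick sketch using nothing but the standard library (table and column names made up):

```python
# Standard library only: sqlite3 handles the light aggregation people often
# pull in pandas for. Table and column names are made up for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, total REAL)")
conn.executemany("INSERT INTO orders (region, total) VALUES (?, ?)",
                 [("north", 10.0), ("south", 24.5), ("north", 7.25)])

for region, revenue in conn.execute(
        "SELECT region, SUM(total) FROM orders GROUP BY region ORDER BY region"):
    print(region, revenue)
```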
To push back on this, I consider pandas/numpy so crucial to Python as a whole they are effectively stdlib to me. I wouldn't blink at this because it would happen sooner or later.
Unless it was absolutely critical that the server have as small a footprint as humanly possible, and it was absolutely guaranteed they would never need to be included in the future, of course. However, that first constraint is the main one.
>> their software also downloaded a 250MB update file every five minutes
> How on earth is a screen recording app 250 megabytes
How on earth is a screen recording app 250 megabytes on an OS where the API to record the screen is built directly into the OS?
It is extremely irresponsible to assume that your customers have infinite cheap bandwidth. In a previous life I worked with customers with remote sites (think mines or oil rigs in the middle of nowhere) where something like this would have cost them thousands of dollars per hour per computer per site.
For a long time iOS did not have features to limit data usage on WiFi. They did introduce an option more recently for iPhone, but it seems such an option is not available on macOS. Windows has supported it for as long as I can remember; I used it with tethering.
Or... why on earth do you need to check for updates 288x per day? It sounds and seems more like 'usage monitoring' than making sure all users have the most recent bug fixes installed. What's wrong with checking for updates once at startup (and caching that for a day)? What critical bugs or fixes could have been issued that warrant 288 update checks?
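The back-of-the-envelope math, for anyone counting:

```python
# Back-of-the-envelope numbers for a 250 MB download every 5 minutes.
checks_per_day = 24 * 60 // 5              # 288 update checks per day
gb_per_day = checks_per_day * 250 / 1000   # ~72 GB per machine per day
print(checks_per_day, gb_per_day)
```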
8 of the 10 Gbit/s, I meant :) Sorry, I see it reads a bit weird, yes. So 8 Gbit/s for a single line is the max currently. But there's huge competition on the horizon, so I expect more soon :)
I read it as 1 x 8 Gbit connection.
But that’s only because I think 8 Gbit is offered in my area.
I’ve limited my current service to 1.5 Gbit / 1 Gbit fibre, because, well, I only run gigabit Ethernet … so more sounds totally unnecessary.
It sounds right, and this is the kind of thing I'd expect if developers are baking configuration into their app distribution. Like, you'd want usage rules or tracking plugins to be timely, and they didn't figure out how to check and distribute configurations in that way without a new app build.
They probably just combined all phoning home information into one. Usage monitoring includes version used, which leads to automatic update when needed (or when bugged...).
Is it normal to include the Electron framework like that? Is it not also compiled with the binary? Might be a stupid question, I'm not a developer. Seems like a very, very heavy program to be doing such a straightforward function. On MacOS, I'm sure it also requires a lot of iffy permissions. I think I'd stick with the built-in screen recorder myself.
I'm not sure why there are both app.asar and app.asar.unpacked. I did just run `npx @electron/asar extract app.asar` and confirmed it's a superset of app.asar.unpacked. The unpacked files are mostly executables, so it may have something to do with code signing requirements.
Tauri is not as platform-agnostic as Electron is because it uses different web views depending on the platform. I ran into a few SVG-related problems myself when trying it out for a bit.
For example, on Linux, it uses WebKitGTK as the browser engine, which doesn't render the same way Chrome does (which is the web view used on Windows), so multi-platform support is not totally seamless.
Using something like Servo as a lightweight, platform-independent web view seems like the way forward, but it's not ready yet.
I suspect the real reason electron got used here is that ChatGPT/Copilot/whatever has almost no Tauri example code in the training set, so for some developers it effectively doesn't exist.
It's about time Linux desktops adopted some form of ${XDG_WEB_ENGINES:-/opt/web_engines} convention to have web-based programs fetch their engines as needed and play nice with each other.
It's relevant in the broader context, cross-platform is a significant reason people choose Electron, and lighter alternatives like Tauri still have some issues there.
seconded -- tried to use tauri for a cross-platform app but the integrated webview on android is horrendous. had to rewrite basic things from scratch to work around those issues, at which point why am I even using a "cross-platform" framework?
I imagine if you stick to desktop the situation is less awful but still
Screen recording straight from a regular browser window, though it creates GIFs instead of video files. Links to a git repo so you can set it up locally.
The app itself is probably much bigger than 250 MB. If it is using Electron and React or another JS library, like a million other UIs, just the dependencies will be almost that big.
For context, the latest iOS update is ~3.2GB, and the changelog highlights are basically 8 new emojis, some security updates, some bug fixes. It makes me want to cry.
For the vast, vast, vast majority of people, if you don't have an obvious primary key, choosing UUIDv7 is going to be an absolute no-brainer choice that causes the least amount of grief.
Which of these is an amateur most likely to hit: crash caused by having too small a primary key and hitting the limit, slowdowns caused by having a primary key that is effectively unsortable (totally random), contention slowdowns caused by having a primary key that needs a lock (incrementing key), or slowdowns caused by having a key that is 16 bytes instead of 8?
Of all those issues, the slowdown from a 16 byte key is by far the least likely to be an issue. If you reach the point where that is an issue in your business, you've moved off of being a startup and you need to cough up real money and do real engineering on your database schemas.
The problem is that companies tend to only hire DB expertise when things are dire, and then, the dev teams inevitably are resistant to change.
You can monitor and predict the growth rate of a table; if you don’t know you’re going to hit the limit of an INT well in advance, you have no one to blame but yourself.
Re: auto-incrementing locks, I have never once observed that to be a source of contention. Most DBs are around 98/2% read/write. If you happen to have an extremely INSERT-heavy workload, then by all means, consider alternatives, like interleaved batches or whatever. It does not matter for most places.
I agree that UUIDv7 is miles better than v4, but you’re still storing far more data than is probably necessary. And re: 16 bytes, MySQL annoyingly doesn’t natively have a UUID type, and most people don’t seem to know about casting it to binary and storing it as BINARY(16), so instead you get a 36-byte PK. The worst.
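The size difference is easy to see with the stdlib uuid module (a random v4 here, but the storage math is identical for v7):

```python
# The 36-vs-16-byte point, using the stdlib uuid module (a random v4 here,
# but the storage math is identical for v7).
import uuid

u = uuid.uuid4()
print(len(str(u)))   # 36 -> what a CHAR(36) column ends up storing
print(len(u.bytes))  # 16 -> what you store if you cast to BINARY(16)
```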
> contention slowdowns caused by having a primary key that needs a lock (incrementing key)
This kind of problem only exists in unsophisticated databases like SQLite. Postgres reserves whole ranges of IDs at once so there is never any contention for the next ID in a serial sequence.
I think you’re thinking of the cache property of a sequence, but it defaults to 1 (not generating ranges at once). However, Postgres only needs a lightweight lock on the sequence object, since it’s separate from the table itself.
MySQL does need a special kind of table-level lock for its auto-incrementing values, but it has fairly sophisticated logic as of 8.0 as to when and how that lock is taken. IME, you’ll probably hit some other bottleneck before you experience auto-inc lock contention.
I was in the never-UUID camp, but have been converted. Of course it depends on how much you rely on your PKs for speed, but UUIDs have a great benefit in that you can create a unique key without a visit to the DB, and that can enormously simplify your app logic.
I’ve never understood this argument. In every RDBMS I’m aware of, you can get the full row you just inserted sent back (RETURNING clause in Postgres, MariaDB, and new-ish versions of SQLite), and even in MySQL, you can access the last auto-incremented id from the cursor used to run the query.
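For example, with the stdlib sqlite3 module (table name made up); Postgres/MariaDB give you the same thing via RETURNING:

```python
# Getting the generated key back from the same cursor that ran the INSERT,
# using the stdlib sqlite3 module (table name made up). Postgres/MariaDB
# can do the equivalent with INSERT ... RETURNING.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

cur = conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
print(cur.lastrowid)  # the auto-incremented id, no second round trip needed
```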
Now imagine that storing the complete model is the last thing you do in a business transaction. The workflow is something like: the user enters some data, then over the course of the next minutes adds more data; the system contacts various remote services that can take a long time to respond; the user can even park the whole transaction for the day and restore it later. But you still want a unique ID identifying this dataset for logging etc. There is nothing you can insert at the start (it won't satisfy the constraints and is also completely useless). So you could create a synthetic ID at the start, but it won't be the real ID when you finally store the dataset. Or you can just generate a UUID anywhere, anytime, and it will be the real ID of the dataset forever.
So have a pending table with id, user_id, created_at, and index the latter two as a composite key. SELECT id FROM pending WHERE user_id = ? ORDER BY created_at DESC LIMIT 1.
Preferably delete the row once it's been permanently stored.
Keeping an actual transaction open for that long is asking for contention, and the idea of having data hanging around ephemerally in memory also seems like a terrible idea – what happens if the server fails, the pod dies, etc.?
> Keeping an actual transaction open for that long is asking for contention
Yes, which is why this is not an actual db transaction, it's a business transaction as mentioned.
> and the idea of having data hanging around ephemerally
The data is not ephemeral of course. But also the mapping between business transaction and model is not 1:1, so while there is some use for the transaction ID, it can't be used to identify a particular model.
> With UUIDs there is absolutely no need for that.
Except now (assuming it’s the PK) you’ve added the additional overhead of a UUID PK, which depending on the version and your RDBMS vendor, can be massive. If it’s a non-prime column, I have far fewer issues with them.
> The data is not ephemeral of course.
I may be misunderstanding, but if it isn’t persisted to disk (which generally means a DB), then it should not be seen as durable.
The whole content of the business transaction is persisted as a blob until it's "done" from the business perspective. After that, entities are saved(,updated,deleted) to their respective tables, and the blob is deleted. This gives the users great flexibility during their workflows.
Yes, the overhead of UUIDs was something I mentioned already. For us it absolutely makes sense to use them; we don't anticipate having hundreds of millions of records in our tables.
I also do that for convenience. It helps a lot in many cases. In other cases I might have tables that may grow into the millions of rows (or hundreds of millions); for those particular tables I'd absolutely not use UUID PKs. And I'd also shard them across schemas or multiple DBs.
If you don't have a natural primary key (the usual use case for UUIDs in distributed systems such that you can have a unique value) how do you handle that with bigints? Do you just use a random value and hope for no collisions?
Wouldn't you just have an autoincrementing bigint as a surrogate key in your dimension table?
Or you could preload a table of autoincremented bigints and then atomically grab the next value from there where you need a surrogate key like in a distributed system with no natural pk.
Yes, if you have one database. For a distributed system though with many databases sharing data, I don't see a way around a UUID unless collisions (the random approach) are not costly.
Yes. This is the way, so long as you can guarantee you won't grow past the bits.
Otherwise you can still use the pregenerated autoincrements. You just need to check out blocks of values for each node in your distributed system from the central source before you would need them:
N1 requests 100k values, N2 requests 100k values, etc. Then when you've allocated some amount, say 66%, request another chunk. That way you have time to recover from the central manager going offline before it's critical.
I have no problem with using uuids but there are ways around it if you want to stick with integers.
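A rough sketch of that block-allocation scheme (the reserve_block callable, the class name, and the 66% threshold are just stand-ins from the example above, not any real library):

```python
# Rough sketch of handing out pre-reserved ID blocks per node, as described
# above. `reserve_block` is a stand-in for whatever central source (a
# sequence, a counter table) actually hands out ranges; all names here are
# illustrative.
class IdBlockAllocator:
    def __init__(self, reserve_block, block_size=100_000, refill_fraction=0.66):
        self._reserve = reserve_block      # callable: size -> first ID of a fresh block
        self._size = block_size
        self._refill_fraction = refill_fraction
        self._next, self._end = self._fresh_block()
        self._spare = None                 # next block, requested early so we never run dry

    def _fresh_block(self):
        start = self._reserve(self._size)
        return start, start + self._size

    def next_id(self):
        if self._next >= self._end:        # current block exhausted: switch to the spare
            self._next, self._end = self._spare if self._spare else self._fresh_block()
            self._spare = None
        value = self._next
        self._next += 1
        used = (self._next - (self._end - self._size)) / self._size
        if self._spare is None and used >= self._refill_fraction:
            self._spare = self._fresh_block()   # ask for the next chunk ahead of time
        return value


# Toy usage against an in-memory "central source":
counter = {"next": 1}
def reserve_block(size):
    start = counter["next"]
    counter["next"] += size
    return start

alloc = IdBlockAllocator(reserve_block, block_size=5)
print([alloc.next_id() for _ in range(12)])  # 1..12, crossing block boundaries
```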
But if you use sequential integers as primary key, you are leaking the cardinality of your table to your users / competitors / public, which can be problematic.
Is it really enormous? bigint vs UUID is similar to talking about self-hosting vs cloud to stakeholders. Which one has bigger risk of collision? Is the size difference material to the operations? Then go with the less risky one.
You shouldn't be using BIGINT for random identifiers so collision isn't a concern - this is just to future proof against hitting the 2^31 limit on a regular INT primary key.
Twice now I've been called in to fix UUIDs making systems slow to a crawl.
People underestimate how important efficient indexes are on relational databases because replacing autoincrement INTs with UUIDs works well enough for small databases, until it doesn't.
My gripe against UUIDs is not even performance. It's debugging.
Much easier to memorize and type user_id = 234111 than user_id = '019686ea-a139-76a5-9074-28de2c8d486d'
I’ll go further; don’t automatically default to a BIGINT. Put some thought into your tables. Is it a table of users, where each has one row? You almost certainly won’t even need an INT, but you definitely won’t need a BIGINT. Is it a table of customer orders? You might need an INT, and you can monitor and even predict the growth rate. Did you hit 1 billion? Great, you have plenty of time for an online conversion to BIGINT, with a tool like gh-ost.
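The "predict the growth rate" part really is just arithmetic (the current max id and insert rate below are made-up numbers you'd pull from your own monitoring):

```python
# How long until a signed INT primary key runs out at a steady insert rate;
# current_max_id and rows_per_day are made-up numbers you'd pull from your
# own monitoring.
INT_MAX = 2**31 - 1             # 2,147,483,647
current_max_id = 250_000_000    # hypothetical: read from the table
rows_per_day = 400_000          # hypothetical: from your metrics

days_left = (INT_MAX - current_max_id) / rows_per_day
print(f"~{days_left / 365:.1f} years of headroom")  # ~13.0 years
```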
I work for a company that deals with very large numbers of users. We recently had a major project because our users table ran out of ints and had to be upgraded to bigint. At this scale that's harder than it sounds.
So your advice that you DEFINITELY won't need a BIGINT, well, that decision can come back to bite you if you're successful enough.
(You're probably thinking there's no way we have over 2 billion users and that's true, but it's also a bad assumption that one user row perfectly corresponds to one registered user. Assumptions like that can and do change.)
I’m not saying it’s not an undertaking if you’ve never done it, but there are plenty of tools for MySQL and Postgres (I assume others as well) to do zero-downtime online schema changes like that. If you’ve backed yourself into a corner by also nearly running out of disk space, then yes, you’ll have a large headache on your hands.
Also, protip for anyone using MySQL, you should take advantage of its UNSIGNED INT types. 2^32-1 is quite a bit; it’s also very handy for smaller lookup tables where you need a bit more than 2^7 (TINYINT).
> but it's also a bad assumption that one user row perfectly corresponds to one registered user. Assumptions like that can and do change.
There can be dupes and the like, yes, but if at some point the Customer table morphed into a Customer+CustomerAttribute table, for example, I’d argue you have a data modeling problem.
Americans in the market for a "premium" shower head are clearly not looking for the cheapest thing on the market. So it's obvious that they would be willing to spend more for the added feel-good of a domestic product.