It’s interesting to read a lot of push back to the points here.
I recently built a product with the backend using SQLite as the data store and ran into all these issues and many more. It is frustrating. I use SQLAlchemy and Alembic. It seemed everywhere I turned, the docs said “it works this way in all databases, except SQLite where X isn’t supported or you have to do Y differently.”
I think with litestream and D1 and other web SQLite tech emerging, you see the sentiment: “if you don’t have Google-scale, you can easily serve using disk-backed SQLite, plus enjoy skipping RTT network latency to DB.” Then, when someone does that and has a bad time, the comments instead go: “SQLite is only for embedded data stores.”
Personally, if I had to do it again, I would stick to the most boring tech for the target stack: Postgres and Django’s ORM.
I'm frustrated by someone reaching the conclusion that there can ever be a "best database". I spent several hours today writing a bash script that jammed CSV files into sqlite3 and ran some queries on them. Using postgres for this would have been insane, and I can imagine writing the exact opposite of this article if I had tried to do so: "It stores everything all over the place! There's no easily configurable in-memory storage option! All the docs and packages expect you to be running it as a service! Running my script takes an entire ansible playbook just to set things up! Why the hell is there a user table already? This thing uses hundreds of megs of memory at rest sometimes!"
Sqlite is not just for embedded data stores. However, it makes trade-offs to achieve ease of use and performance on certain workloads. If it tried to be postgres then it would be an inferior competitor; instead, it is an alternative that suits some use cases better and others worse. If you're trying to build a web app that can be deployed and backed up as two files, an executable and a db, then you'll probably want sqlite. If you're trying not to shoot yourself in the foot with "oh god why can't I do a full outer join and why are all the dates weird" then use postgres.
I imagine programmers would also find cockroachdb frustrating compared to postgres, if they didn't benefit from anything it offered and used it anyway.
I think this is the best advice. The further you are away from your data store, the more you are relying on the used cases others have considered. Just because a library needs features that something like SQLite doesn’t provide doesn’t mean that you need that feature. But every extra abstraction in your stack has this danger.
I also think that the closer you are to your DB, more you appreciate the features you actually need are. A developer should know how their data is ultimately stored. And if an ORM hides this from you, that’s a problem.
Note: and ORM doesn’t need to hide this and can be a rational way to manage DB storage. But it’s then up to the developer to manage that abstraction. I’ve done this before where I used an ORM but I still knew exactly what SQL was going to be generated. But for many, especially new devs, the ORM is a black box.
I'm curious why you used SQLAlchemy and Alembic when (based on your description) Django's support for SQLite sounds a lot nicer? According to the docs, there are a handful of caveats with the Django ORM and SQLite, but nothing too bad it would seem.
Good question! I was (am) using FastAPI, not Django. Another decision that was easy to get started with, but I think is turning out to be more difficult to scale. Maybe.
I recently built a product with the backend using SQLite as the data store and ran into all these issues and many more. It is frustrating. I use SQLAlchemy and Alembic. It seemed everywhere I turned, the docs said “it works this way in all databases, except SQLite where X isn’t supported or you have to do Y differently.”
I think with litestream and D1 and other web SQLite tech emerging, you see the sentiment: “if you don’t have Google-scale, you can easily serve using disk-backed SQLite, plus enjoy skipping RTT network latency to DB.” Then, when someone does that and has a bad time, the comments instead go: “SQLite is only for embedded data stores.”
Personally, if I had to do it again, I would stick to the most boring tech for the target stack: Postgres and Django’s ORM.