"When you try to CRAM everything (mail, webserver, gitlab, pop3, imap, torrent, owncloud, munin, ...) into a single machine on Debian, you ultimately end-up activating unstable repository to get the latest version of packages and end-up with conflicting versions between softwares to the point that doing an apt-get update && apt-get upgrade is now your nemesis."
This is not my experience.
My main house server runs:
mail: postfix, dovecot, clamav (SMTP, IMAP)
web: nginx, certbot, pelican, smokeping, dokuwiki, ubooquity, rainloop, privoxy (personal pages, blog, traffic tracking, wiki, comic-book server, webmail, anti-ad proxy)
git, postgresql, UPS monitoring, NTP, DNS, and DHCPd.
Firewalling, more DNS and the other part of DHCPd failover is on the router.
Package update is a breeze. The only time I bother with the overhead of a virtual machine is when I'm testing out new configurations and don't want to break what I have.
"just having the Kubernetes server components running add a 10% CPU on my Intel(R) Atom(TM) CPU C2338 @ 1.74GHz."
Containerization is not a win here. Where's the second machine to fail over to?
Containerization and container orchestration platforms are only partly about scalability.
The primary appeal for me is ease of deployment and reproducibility. This is why I develop everything in Docker Compose locally.
Maybe the equivalent here would be something like Guix or Nix, declaratively writing out the entire desired state of all system packages, services, and versions, but honestly (without personal experience using them) they seem harder than containers.
I'm not deploying; this is the server. I do backups, and I keep config in git.
Reproducibility? This is the server. I will restore from backups. There is no point in scaling.
If you want to argue that containerization and VMs are portable and deployable and all that, I agree. This is not a reasonable place to do that extra work.
Hey, do whatever floats your boat. Nobody said there is a single solution to every problem.
Don't pick a fight just because you are satisfied with a solution that is different from somebody else's.
I personally like docker-compose and Vagrant for my private services and development environments.
I use Vagrant for when I need a complete VM. Think a VM for embedded development, where I need a large number of tools at very specific versions, and I need them still working in 3 years without maintenance even if I change a lot about my PC setup (I run Linux everywhere).
I create a separate Vagrant environment for every project, and this way I can reinstate the complete environment at a moment's notice, whenever I want.
I use docker-compose for most everything else. Working on an application that needs MongoDB, Kafka, InfluxDB, Grafana and so on and so forth? Docker Compose to rule them all. You type one command and everything's up. You type another and everything's down.
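To make that concrete, a minimal sketch of the kind of compose file I mean (image tags are examples, pin whatever you actually need; Kafka needs a bit more ceremony, so it's omitted):

    services:
      mongo:
        image: mongo:6
        volumes:
          - mongo-data:/data/db
      influxdb:
        image: influxdb:2.7
      grafana:
        image: grafana/grafana:10.2.0
        ports:
          - "3000:3000"
    volumes:
      mongo-data:

docker compose up -d brings the whole stack up; docker compose down tears it down.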
I use the same for my other services, like mail, NAS, personal website, database, block storage, etc. Containers let me preserve the environment and switch between versions easily, and I'm not tied to the binary versions shipped by the Linux distribution on the server.
I hate it when I run a huge number of services and then a single upgrade causes some of them to stop working. I want to stay constantly updated and have my services working with minimal maintenance. Containers let me make that decision for each service separately.
> Reproducibility? This is the server. I will restore from backups.
To me, reproducibility is about more than restoring the old bucket of bits I had. It's about understanding, about being able to reproduce the means by which a system got the way it is.
With Kubernetes, there is a centralized place where the cluster state lives. I can dump these manifests into a file. The file is human readable, well structured, consistently structured, and uniformly describes all the resources I have. Recreating these manifests elsewhere will let me reproduce a similar cluster.
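For instance, a rough sketch (the exact resource kinds are up to you):

    # dump the declared state of the main resource kinds
    kubectl get deployments,statefulsets,daemonsets,services,configmaps,ingresses \
      --all-namespaces -o yaml > cluster-state.yaml

    # on a fresh cluster (after stripping server-set fields like
    # resourceVersion/uid -- or, better, keep hand-written manifests in git):
    kubectl apply -f cluster-state.yaml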
The resources inside a Kubernetes cluster are just so much easier to operate on, so much easier to manage, than anything else I've ever seen. Whether I'm managing SQS or Postgres or containers, having one resource that represents the thing, having a manifest for the thing, is so much more powerful, so much better an operational experience, than either a bucket-of-bits filesystem with a bunch of hopefully decently documented changes accreted over time, or a complex Puppet or Ansible system that can enact said bucket of bits. Kubernetes presents high-level representations for all the things on the system, of all shapes and sizes, and that makes knowing what I have much easier, and it makes managing, manipulating, and replicating those resources much easier & more straightforward.
Wrapping a new abstraction layer around a single server does not help, it is an expense you do not need. "Recreating these manifests elsewhere" will not work, because there is no elsewhere.
You cannot add complexity to a system to make it simpler.
You cannot abstract away the configuration of a system when there is only one system: you must actually do the configuration.
There is no point in having a high-level representation of all the things on the system: you have the actual things on the system, and if you do not know how to configure them, you should not be running them.
> There is no point in having a high-level representation of all the things on the system: you have the actual things on the system, and if you do not know how to configure them, you should not be running them.
> You cannot abstract away the configuration of a system
I've spent weeks setting up postgres clusters, with high availability, read-only replicas, backups, monitoring, alerting.
It takes me 30 minutes to install k3s and the postgres operator, & recreate that setup.
Because there are good, consistent abstractions used up & down the Kubernetes stack. They let us build together, reusing the deployable, scalable architectures of lower levels across all our systems & services & concerns. And other operators will understand them better than what I would have hand-built myself.
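Roughly, with the Zalando postgres-operator as one example (chart repo URL and manifest fields are from its docs, from memory; double-check them):

    # single-node Kubernetes
    curl -sfL https://get.k3s.io | sh -

    # the operator, via its Helm chart
    helm repo add postgres-operator-charts \
      https://opensource.zalando.com/postgres-operator/charts/postgres-operator
    helm install postgres-operator postgres-operator-charts/postgres-operator

    # then a short manifest asks the operator for a small HA cluster
    kubectl apply -f - <<'EOF'
    apiVersion: acid.zalan.do/v1
    kind: postgresql
    metadata:
      name: acid-minimal-cluster
    spec:
      teamId: acid
      numberOfInstances: 2
      volume:
        size: 5Gi
      postgresql:
        version: "15"
    EOF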
> "Recreating these manifests elsewhere" will not work, because there is no elsewhere.
It'll work fine if you have an elsewhere. Sure, backups don't work if you have nothing to restore onto.
Dude, this is such a negative attitude. I want skepticism and critical thinking, but we don't have to hug our servers this close forever. We can try to get good at managing things. Creating a data plane & moving our system configuration into it is a valid way to tackle a lot of management complexity. I am much more relaxed, many other operators are too, and getting such a frosty, negative "nothing you do helps" dismissal does not feel civil.
Not sure about reproducibility.
If the HD fails, sure, restore from backup. But what if the motherboard fails and you buy/build a completely new machine? Does the backup work then, even if all the hardware is different? That's where a container makes restoring easier.
A container does not make restoring easier in the situation you have described.
The host for the containers still needs to be configured. That's where changes to NIC identifiers, etc. need to be handled.
In my situation, the host gets exactly the same configuration. The only things that care about the name of the NIC are a quick grep -r away in /etc/; 95% of everything will be up once I redo the firewall script, and because that's properly parameterized, I only need to change the value of $IF_MAIN at the top.
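Not the actual script, but the shape of it; nothing below the variable changes when the NIC gets renamed (interface name is a placeholder):

    #!/bin/sh
    IF_MAIN="enp3s0"   # the only hardware-specific value

    iptables -F
    iptables -P INPUT DROP
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -i "$IF_MAIN" -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -i "$IF_MAIN" -p tcp --dport 22 -j ACCEPT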
I've not met a Linux system tarball that I can't drop on any other machine with the same CPU architecture and get up and running with only minor tweaks to network device names.
> Does a backup work then, even if all the hardware is different
Full disk backup, Linux? Most likely. We rarely recompile kernels these days to tailor them to specific hardware; most of it is supported via modules. Some adjustments may be necessary (network interface names? non-free drivers), but for the most part it should work.
Windows? YMMV. Windows 10 is much better than it used to be and has more functional disk drivers out of the box. Maybe you'll need to reactivate.
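(On the interface-name point: one way to take that variable out is a systemd .link file that gives the NIC a stable name all your configs reference; after a hardware swap, only the MAC address in this one file changes. The values below are placeholders.)

    # /etc/systemd/network/10-lan0.link
    [Match]
    MACAddress=aa:bb:cc:dd:ee:ff

    [Link]
    Name=lan0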
The problem is mostly reproducibility. A system that has lived long enough will be full of tiny tweaks that you don't remember anymore. Maybe that's fine for personal use, but it has a price.
Even for personal servers (including Raspberry Pis), I try to keep some basic automation in place, so that if they give up the ghost they're cattle, not pets.
Drivers, config you missed/didn't realise was relevant/wasn't needed before, IDs (e.g. disks), etc.
Nix or aconfmgr (for Arch) help.
I still like containers for this though. Scalability doesn't mean I'm fooling myself into thinking hundreds of thousands of people are reading my blog, it means my personal use can outgrow the old old PC 'server' it's on and spill into the new old one, for example. Or that, for simplicity of configuration, each disk will be (the sole disk) mounted by a Pi.
There's more than one way to skin a cat. If you're running something as simple and low-profile as OP suggested, all you need to back up from the system are the packages you installed and the handful of configurations you changed in /etc. That could be in Ansible, but it could just be a .sh file, really. You'll also need a backup of the actual data, not the entire /. Although, even if all you did was back up the entire /, there's a good chance it would work even if you tried to recover it on new hardware.
The services mentioned by OP don't need to talk to each other; they all work out of the box by just running apt-get install or the equivalent. You don't need anything really fancy, and you can set up a new box with part of the services if they ever take too many resources (which, for a small setup, will likely never happen; at least in my experience).
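That .sh file really can be tiny; something in this spirit (paths are placeholders):

    #!/bin/sh
    # packages you installed on purpose
    apt-mark showmanual > /backup/packages.txt
    # the configuration you changed
    tar czf /backup/etc.tar.gz /etc
    # the actual data (point this at whatever your services use)
    rsync -a /srv/ /backup/srv/

Restore is the reverse: reinstall from packages.txt, untar /etc, copy the data back.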
Why do you feel the need to keep config in git if you've got backups? I think the answer to that is the same reason that I'd rather keep a record of how the server is customised than a raw disk backup.
I do think containerisation and VMs are more overhead than they're worth in this case, but there's definitely a lot of value in having a step-by-step logical recipe for the server's current state rather than just a snapshot of what's currently on the disk. (I'd favour puppet or something similar).
> I keep config in git so that when I screw up, I can figure out how.
Right. Which is exactly why I want installing a new service, upgrading a library etc. to be in git rather than just backing up what's on disk. A problem like not being able to connect to MySQL because you've upgraded the zoneinfo database, or the system root certificates, is a nightmare to diagnose otherwise.
Exactly! Other non-scalability concerns it addresses (specifically talking about Kubernetes here) include a primitive level of monitoring/observability; no-downtime (rolling) updates; liveness/readiness probes; primitive service discovery and load balancing; and resiliency to any single host failing (even if the total compute power could easily fit into a single bigger server).
I can agree that the idea of reaching for Kubernetes to set up a bunch of services on a home server sounds a bit absurd.
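To make the probes point concrete, they're just a few lines in a pod spec (an excerpt, not a full manifest; example image):

    containers:
      - name: web
        image: nginx:1.25
        readinessProbe:          # no traffic until this passes
          httpGet:
            path: /
            port: 80
          periodSeconds: 5
        livenessProbe:           # restart the container if this fails
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10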
"How did we get here?"
I'm not an inexperienced codemonkey by any sense of the term, but I am a shitty sysadmin. And despite being a Linux user since my early teens, I'm not a greybeard.
As sorry a state as it may sound, I have more faith in my ability to reliably run and maintain a dozen containers in k8s than a dozen standard, manually installed apps + processes managed by systemd.
Whether this is a good thing or a bad thing, you can likely find solid arguments both ways.
Hm, these days I feel like I only have to learn systemd. Reload config? View logs? Watchdog? Namespaces? It’s all systemd. If you are running on one machine, what does Docker/k8s give you that you do not already have?
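e.g. the same few commands cover all of it (nginx is just a stand-in service here):

    systemctl reload nginx       # reload config
    journalctl -u nginx -f       # view / follow logs
    systemctl edit nginx         # drop-in override: WatchdogSec=30s,
                                 # PrivateTmp=yes, ProtectSystem=strict, ...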
Nothing, but it's pretty common to have the home server plus a desktop/laptop where you do most of the work (even for the home server), which may not be Linux; in which case containers are the easiest way.
"Classic" approach turned out to be maddeningly harder.
Everything, even inside single "application", going slightly off the reservation. Services that would die in stupid ways. Painful configuration that would have been abstracted out were I running containers on k8s (some benefits might be realized with Docker compose, but docker on its own is much more brittle than k8s).
So much SSH-ing to a node to tweak things. apt-get fscking the server. Etc.
To me it sounds like the latter, a classic server, and I agree... after getting comfortable with containerized deployment, "classic" servers are a huge pain.
Fair enough, my point was more about using k8s to deploy applications rather than “house server” stuff, where it’s indeed unneeded more often than not.
Having zero downtime updates is quite nice. For example, I can set FluxCD to pin to a feature release of Nextcloud, and it will automatically apply any patch updates available. Because of the zero downtime updates, this can happen at any time and I won't have any issues, even if I'm actively using Nextcloud as it's happening.
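Roughly what that pinning looks like as a Flux HelmRelease (assuming the Nextcloud Helm chart; the chart version range below is illustrative, and the API version depends on your Flux release):

    apiVersion: helm.toolkit.fluxcd.io/v2beta1
    kind: HelmRelease
    metadata:
      name: nextcloud
    spec:
      interval: 1h
      chart:
        spec:
          chart: nextcloud
          version: "~4.5.0"   # "~" pins the feature release, allows patch bumps only
          sourceRef:
            kind: HelmRepository
            name: nextcloud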
> This is why I develop everything in Docker Compose locally.
For a small setup like this, just having a docker compose file in version control is more than sufficient. You can easily leverage services someone else has set up, and the final config is easy to get going again if you need to rebuild the machine due to hardware failure.
Some stuff is really tricky to set up, too - like postfix with working TLS, DKIM etc. Before Docker I'd eventually get stuff like this working, then a couple of years later something would break and I'd have no clue how to fix it because I hadn't touched it for so long. With Docker (and Compose and Swarm), everything is codified in scripts and config files, all ready to be deployed anywhere.
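As a concrete (and hedged) example, a pre-packaged stack like docker-mailserver turns the postfix/TLS/DKIM dance into one compose service, roughly like this (volume paths and variables are from memory of its docs; check the project's example compose file):

    services:
      mail:
        image: ghcr.io/docker-mailserver/docker-mailserver:latest
        hostname: mail.example.com
        ports:
          - "25:25"
          - "465:465"
          - "993:993"
        volumes:
          - ./mail-data:/var/mail
          - ./mail-config:/tmp/docker-mailserver
        environment:
          - ENABLE_OPENDKIM=1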
It is because at the time I was doing a lot of Python development, and I was (and still am) using my server as a dev workstation.
Isolation with virtualenv was not great, and many projects needed conflicting versions of system packages, or newer versions than what Debian stable had.
A lot of the issue was me messing around \o/
"just having the Kubernetes server components running add a 10% CPU on my Intel(R) Atom(TM) CPU C2338 @ 1.74GHz."
Containerization is not a win here. Where's the second machine to fail over to?
I think it is worth it in order to get a centralized control plane for everything and automatic build and deployment for everything.
But I agree with you, some apps (postfix, dovecot) don't feel great inside a container (sharing data across UIDs is meh, and postfix's multi-process design too...).
I just wanted to have everything managed in containers, and these were the last ones left, so I moved them in too.
> I was (and still) using my server as a dev workstation
This seems like a very bad idea, and I'm not at all surprised you had problems. But it doesn't look like the problems were with the server part; if your machine had only been a server you could have avoided all the stuff about needing to pull from unstable. So I don't think "don't put all the server stuff on one machine" is the real takeaway from your experience; I think the real takeaway is "don't use the same machine as both a server and a dev workstation".
Well, at that point you just move the problem from "how to manage the home server" to "how to manage the dev workstation". You need somewhere where you can install not just random Python packages but also random databases, task queues etc. during development. I guess "accept that your dev box will always be flaky and poorly understood, you'll have to spend time productionising anything before you can deploy it anywhere else, and if you replace it you'll never get things set up quite the same" is one possible answer (and perhaps the most realistic), but it's worth looking for a better way.
> at that point you just move the problem from "how to manage the home server" to "how to manage the dev workstation"
No, you separate it into two problems that are no longer coupled to each other. The requirements for a server are very different from those for a dev workstation, so trying to do both on the same machine is just asking for trouble.
> You need somewhere where you can install not just random Python packages but also random databases, task queues etc. during development.
Yes, that's what a dev workstation is for. But trying to do that on the same machine where you also have a server, which doesn't want all that stuff, is not, IMO, a good idea.
> I guess "accept that your dev box will always be flaky and poorly understood
It will be as flaky and poorly understood as the code you are developing and whatever it depends on, yes. :-)
But again, you don't want any of that on a machine that's a server. That's why it's better to have a server on a different machine.
The biggest objection in this thread is to the 10% overhead of containers, so it seems strange to see the 100% overhead of two separate computers as a better solution.
And at some point the code has to go from dev code to production code. If you're managing dev and production in different ways, then you're going to have to spend significant time "productionising" your dev code (listing dependencies in the right formats etc.). And the bigger the gap between the machine you develop on and the machine you deploy to, the higher the risk of production-only bugs. So keeping your dev workstation as similar as possible to a production server - and installing dependencies etc. in a way that's compatible with production from day 1 - makes a lot of sense to me.
We seem to be talking about different kinds of servers. You say:
> at some point the code has to go from dev code to production code. If you're managing dev and production in different ways, then you're going to have to spend significant time "productionising" your dev code
This is true, but as I understand the article we are talking about, it wasn't talking about a dev workstation and a production server for the same project or application. I can see how it could make sense to have those running on the same machine (but probably in containers).
However, the article was talking about a dev workstation and a home server which had nothing to do with developing code, but was for things like the author's personal email and web server. Trying to run those on the same machine was what caused the problems.
I presume what the author is developing is code that they're eventually going to want to run on their home server, at least if they get far enough along with it. What else would the end goal of a personal project be?
Reading this chain, you seem to want it both ways: a dev machine that runs unstable config and is in an unknown state due to random package installation, but that is at the same time stable and reproducible.
Yes, that's exactly why the OP's approach is appealing! I want it to take minimum effort to install some random new package/config/dependency, but I also want my machine to be stable and reproducible.
> "When you try to CRAM everything (mail, webserver, gitlab, pop3, imap, torrent, owncloud, munin, ...) into a single machine on Debian, you ultimately end-up activating unstable repository to get the latest version of packages and end-up with conflicting versions between softwares to the point that doing an apt-get update && apt-get upgrade is now your nemesis."
I use Proxmox to avoid that. Some things I run in VMs (often with Docker containers); other things I run in LXC containers (persistent containers that behave like VMs).
I can then use automation (mostly Proxmox templates and Ansible) to make deployments repeatable.
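A sketch of what that automation looks like (hypothetical play and group name, real modules), applied to a freshly cloned template:

    # site.yml
    - hosts: homelab
      become: true
      tasks:
        - name: Install the packages this guest needs
          ansible.builtin.apt:
            name: [nginx, postfix, unattended-upgrades]
            state: present
            update_cache: true
        - name: Drop in the nginx site config
          ansible.builtin.copy:
            src: files/site.conf
            dest: /etc/nginx/conf.d/site.conf
          notify: reload nginx
      handlers:
        - name: reload nginx
          ansible.builtin.service:
            name: nginx
            state: reloaded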
I'm interested in k3s, though, I'll give it a better look :)
The next addition will be some form of NAS, either a qnap/synology or a custom build using FreeNAS or Unraid (probably FreeNAS).
"This is one of the things that containerization solves"
No. Containerization does not fix a broken system. Only fixing the broken system does that. Containerization lets you apply your fix to all the hosts that you want fixed, and as we have thoroughly established, the number of those hosts in this scenario is one.
So far I have been told that containerization fixes configuration problems and allows multiple services to be configured the same way. No container will fix a typo in /etc/dovecot/conf.d/20-imap.conf, and no container management system will make nginx.conf look like sendmail.cf.
"running/updating images are all single line commands"
There is some kind of elven glamour being cast over Kubernetes, Docker, and other container/VM/serverless systems that confuses people about the difference between configuring a service to be useful and managing the lifecycle of that service over a scalable number of machines. Docker cannot update an image that you have not already fixed.
This reminds me of the MBA illusion: the claim that all management can be performed most efficiently by a management specialist with no particular knowledge or skill in the actual production process; all of that is irrelevant detail for somebody else to handle.
I assure you that detailed understanding is the sine qua non of getting things done.
Docker solves the problem of breaking what you have because you can't. The isolation is built-in. Everything is contained.
Of course if you're editing files locally or fixing a Dockerfile it's all the same, but that's not the common case under discussion here.
And what's all this about understanding? Who said we don't understand the software? That's your assumption, and it's completely untrue. Managing everything directly isn't a sign of deep understanding; it's simply your preference, one that I and many others don't share.
Abstractions exist for a reason. While this particular article might be over-leveraged for their scenario, it says absolutely nothing about the quality and need of those abstractions elsewhere.
If you're concerned about resource usage, at least with Swarm or Docker Compose, you get things like health checks, restart policies and replication for free with minimal overhead. Scaling horizontally is easy, too.
It's really nice to have your infrastructure described as code versus in some configuration management tool or shell scripts, and for that reason alone I'll use it even on a single machine.
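For example, in a compose file (a sketch; the deploy: section applies under Swarm, and the healthcheck assumes curl exists in the image):

    services:
      web:
        image: nginx:1.25          # stand-in service
        restart: unless-stopped
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost/"]
          interval: 30s
          retries: 3
        deploy:
          replicas: 2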
Containers have a pretty poor security record. Frankly I'd feel more safe with a non-containerised service running under a non-root user than with a service running as root in a container, not that I'd feel particularly safe with either.
Linux has had support for multiple users for quite some time now. Popping a standard service process doesn’t get you any more than the privileges of the running user, which is usually scoped to that service alone.
Install Ubuntu and Postgres+apache2, then su to www-data and try to read some data from the Postgres data directory to see what I mean.
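i.e. something like this (the version directory varies):

    sudo -u www-data ls -la /var/lib/postgresql/14/main/
    # expected: "Permission denied" -- the data directory is mode 0700,
    # owned by the postgres user, so www-data can't read it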
You'd need to do that with a container too: if the volume mount for /var/lib/postgresql/data still has the older version's Postgres data, then updating the container means that migration still needs to be done as well. Alternatively, a dump and reimport.
> Containerization is not a win here. Where's the second machine to fail over to?
It would probably take an hour or less to add another node to this setup. The author made some choices that block scaling, & changing that (the choices around service-lb and the local provisioner) would take a chunk of that time. Finding a resilient replacement for the local provisioner is something I wish we were better at, but the folks at Rook.io have a pretty good start on this.
To me, the real hope is that we move more and more of our configuration into Kubernetes state. Building a container with a bunch of baked in configuration is one thing, but I hope we are headed towards a more "cloud native" system, where the email server is run not as containers, but as an operator, where configuration is kept in Kubernetes, and the operator goes out & configures the containers to run based on that.
I agree that running a bunch of services on a Debian box with a couple of different releases (testing/unstable) pinned into apt is not really that hard. But I am very excited to stop managing these pets. And I am very hopeful that we can start moving more and more of our configuration from /etc/whateverd/foo.conf files into something centrally & consistently managed. The services themselves all require special, unique management today, & the hope, the dream, is that we get something more like big-cloud-style dashboards, where each of these services can be managed via common Kubernetes tools that apply across all our services.
"But I am very excited to stop managing these pets."
When you have a herd of cattle which is of size 1, it's a pet. You don't get any efficiencies from branding them all with laser-scannable barcodes, an 8-place milking machine, or an automatic silage manager. You still need to call the vet, and the vet needs to know what they are doing.
Having a consistent, managed experience with good top down controls is, in my world, far more efficient than tackling each service like a brand new problem to manage & operate independently.
You listed 21 different pieces of software, 21 different needs in your post.
For some reason almost everyone commenting here seems to think it's totally unreasonable to try to use a consistent, stable tool to operate these services. Everyone here seems totally convinced that, like you, it's better to just go off and manage those 21 or so services independently, piece by piece, on a box.
If it were just postfix, fine, sure, manage it the old fashioned way. Just set up some config files, run it.
But that's not a scalable practice. None of the other 20 pieces of software are going to be managed quite like that. Tools like systemd start to align the system's services into a semi-repeatable practice, but managing configuration is still going to be every-service-for-itself, and understanding things like observability & metrics is going to be wildly different between systems. It seems long past due that we start converging on some consistent ways to manage our systems: consistent ways of storing configuration (in Custom Resources, ideally), of providing other resources (Volumes), of exposing endpoints (Endpoints). We can make real the things that we have, so far, implicitly managed & operated on, define them, such that we can better operate on them.
It's not about containers. It's about coherent systems, which drive themselves to fulfill Desired State. Containers are just one example of a type of desired state you might ask for from your cluster. That you can talk about, manipulate, and manage any kind of resource (volumes, containers, endpoints, databases, queues, whatever) via the same consistent system is enormously liberating. It takes longer to go from zero to one, but your jump from one to one hundred is much much smoother.
> but managing configuration is still going to be every-service-for-itself. Trying to understand things like observability & metrics are going to be highly different between systems
Literally none of this matters for a home server. I have a mail/web server whose configuration I haven't had to change since I last set up letsencrypt about 4 years ago. I don't check metrics or have observability other than "does it work", and that does fine.
You’re caught up sucking in a bunch of technical debt preparing for something that simply doesn’t matter.
it takes less time to set up k3s & Let's Encrypt than it does to DIY: under 30 minutes.
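for the Let's Encrypt half, that's basically cert-manager plus one ClusterIssuer; a sketch (manifest URL, version and fields per cert-manager's docs, from memory, so double-check):

    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml

    kubectl apply -f - <<'EOF'
    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: you@example.com
        privateKeySecretRef:
          name: letsencrypt-account-key
        solvers:
          - http01:
              ingress:
                class: traefik   # k3s ships traefik by default
    EOF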
for some people perhaps diy everything is a win, makes them feel better, but I intend to keep building, keep expanding what I do. having tech that has an actual management paradigm versus being cobbled together makes me feel much better about that future, about investing myself & my time, be it a little bit of time, or more.
i've done enough personal server moves to know that the old school automation i had, first puppet, then ansible, is still a lot of work to go run & coax back into action. but mostly, it just runs, leaves me with a bucket of bits, doesn't help manage at all.
> simply doesn’t matter
lots of ways to think about our computing environments, and I am not in the "simply doesn't matter" camp.
maybe that applies to lots of people. they should take a spin at Kubernetes, i think it'll do an amazing amount of lifting for them & you can be up & running way faster.