Currently, I run Unraid and have all of my services set up there as Docker containers. While this is nice and easy to set up initially, it has some major downsides:
- It’s fragile. Unraid is prone to bugs/crashes with docker that take down my containers. It’s also not resilient so when things break I have to log in and fiddle.
- It’s mutable. I can’t use any infrastructure-as-code tools like Terraform, and configuration just lives in the UI. I can’t really roll back or recover easily.
- It’s single-node. Everything is tied to my one big server that runs the NAS, but I’d rather have the NAS as a separate fairly low-power appliance and then have a separate machine to handle things like VMs and containers.
So I’m looking ahead and thinking about what the next iteration of my homelab will look like. While I like Unraid for the storage stuff, I’m a little tired of wrangling it into a container orchestrator and hypervisor, and I think this year I’ll split that job out to a dedicated machine. I’m comfortable with, and in fact prefer, IaC over fancy UIs, so I would love to be able to use Terraform or Pulumi or something like that. I would prefer something multi-node, as I want to be able to tie multiple machines together. And I want something that is fault-tolerant, as I host services for friends and family that currently require a lot of manual intervention to fix when they go down.
So the question is: how do you all do this? Kubernetes, docker-compose, HashiCorp Nomad? Do you run k3s, Harvester, or what? I’d love to hear what people are doing and why, so I can get some ideas as to what I might do.
I would stay away from Kubernetes/k3s/k8s. Unless you want to learn it for work purposes, it’s so overkill that you can spend a month before you get things running. I know from experience. My current setup gives you options and has been reliable for me.
NAS Box: TrueNAS SCALE - You can have Unraid fill this role.
Services Hosting: Proxmox - I can spin up any VMs I need, and there is lots of info online on things like hardware passthrough to VMs.
Containers: Debian VM - Debian makes a great server environment as it’s stable and well supported. I just make this VM a Docker Swarm host. I manage things with Portainer for a web interface.
I keep data on the NAS and have containers access it over the network, usually via an NFS share.
How do you manage your services on that? Docker Compose files? I’m really trying to get away from the workflow of clicking around in some UI to configure everything, only for it to glitch out and disappear, leaving me trying to remember what to click to get it back. That was my main problem with Portainer and what caused me to move away from it. (I have separate issues with docker-compose, but that’s another thing.)
I have a similar setup to the above. Personally I use Docker Compose and back up my compose files to the NAS.
Same
I personally stepped away from compose. You mentioned that you want a more declarative setup. Give Ansible a try. It is primarily for config management, but you can easily deploy containerized apps and correlate configs, hosts etc.
I usually write roles for some more specialized setups like my HTTP reverse proxy, the arrs etc. Then I keep everything in my inventory and var files. I’m really happy and I really can tear things down and rebuild quickly. One thing to point out is that the compose module for Ansible is basically unusable. I use the docker container module instead. Works well so far and it keeps my containers running without restarting them unnecessarily.
Seconded. Running Kubernetes at home is great - to learn it for work.
If you don’t need it for work, then you’re going to spend weeks if not months setting it up for very little payoff at home.
Podman pods + systemd units to manage the pod lifecycle. Ansible to deploy the base OS requirements, the ancillary services (SSH, backups, monitoring…), and the pods/containers/services themselves.
Proxmox. Currently considering upgrading from a single node to a 3-node cluster for Ceph.
A plug for the pro Kubernetes crowd:
I run microk8s on a 3-node cluster, using FluxCD to deploy and manage my services. I also work with Kubernetes at work, so I’m very familiar with the concepts, and I will never use anything else.
If you want maximum control and flexibility, learn Kubernetes. For a lot of people (myself included) it’s overkill, but IMO it’s the best.
My main gripe with docker-compose, which is what I used to use, is that service changes require access to the machine. I have to run commands on the host to alter services. With Kubernetes, and more precisely a GitOps model, you can just make a commit to a git repo and it will roll out.
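As a rough illustration of the GitOps idea described above: a controller repeatedly compares the desired state (what is committed to the repo) with what is actually running, and works out the difference. The Python sketch below shows only that comparison step. The `Service` type and the `plan` function are hypothetical names made up for illustration; this is not how FluxCD or Kubernetes is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical description of one service: a name and an image tag."""
    name: str
    image: str


def plan(desired: dict[str, Service], running: dict[str, Service]) -> list[tuple[str, str]]:
    """Return the actions needed to make `running` match `desired`."""
    actions = []
    # Anything declared but missing gets created; anything drifted gets updated.
    for name, svc in sorted(desired.items()):
        current = running.get(name)
        if current is None:
            actions.append(("create", name))
        elif current.image != svc.image:
            actions.append(("update", name))
    # Anything running that is no longer declared gets removed.
    for name in sorted(running.keys() - desired.keys()):
        actions.append(("remove", name))
    return actions
```

Calling `plan` against a state that already matches returns an empty list, which is the idempotent behaviour that makes "just commit and let it roll out" workable.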
Yes very true, I really would much prefer GitOps as I feel… uneasy about how handwired and ephemeral my current setup is and would love it to be more declarative and idempotent. It does seem like Kubernetes is the way to do that.
FWIW I manage Docker Compose files with Ansible. It allows me to manage them centrally without needing to log into multiple VMs. I also create a systemd service file to start/stop the containers (also managed with Ansible).
That said I’m starting to switch over to k8s as well (also with microk8s which has been the easiest to work with). Definitely overkill but I want to learn it.
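For anyone curious what a systemd service file for a compose project might look like, here is a minimal Python sketch that renders one as text. The commenter's actual unit file is not shown, so this follows a common oneshot pattern rather than their setup; the `render_unit` helper, the `/usr/bin/docker` path, and the example project directory are all assumptions. A template task in Ansible could install text like this under `/etc/systemd/system/`.

```python
from pathlib import PurePosixPath

# Common pattern: a oneshot unit that brings the project up and stays "active".
UNIT_TEMPLATE = """\
[Unit]
Description=Docker Compose project {name}
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory={workdir}
ExecStart={docker} compose up -d
ExecStop={docker} compose down

[Install]
WantedBy=multi-user.target
"""


def render_unit(name: str, workdir: PurePosixPath, docker: str = "/usr/bin/docker") -> str:
    """Render the unit text for one compose project (docker path is an assumption)."""
    return UNIT_TEMPLATE.format(name=name, workdir=workdir, docker=docker)


if __name__ == "__main__":
    # Hypothetical project living in /opt/stacks/traefik.
    print(render_unit("traefik", PurePosixPath("/opt/stacks/traefik")))
```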
I am happy with my simple docker-compose setup - one root folder with one subfolder per project containing the compose file and any configuration mounted into the container. Traefik automatically exposes all services I want under a well-known URL using a single line in each compose file. Watchtower updates the containers.
This has been running stable for over two years with probably 2-3 reboots in between. If my current NUC ever breaks I’ll set it up again using Podman instead of Docker, but aside from that I couldn’t be happier!
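The "one root folder, one subfolder per project" layout is easy to script against. Below is a small Python sketch that finds each project's compose file and builds the matching `docker compose ... up -d` command. The function names and the idea of driving it from a script are assumptions for illustration, not something the commenter described; the file names checked are the ones Compose looks for by default.

```python
from pathlib import Path

# Default file names Docker Compose recognises, in order of preference.
COMPOSE_NAMES = ("compose.yaml", "compose.yml", "docker-compose.yaml", "docker-compose.yml")


def find_projects(root: Path) -> dict[str, Path]:
    """Map each project subfolder name to its compose file, skipping folders without one."""
    projects: dict[str, Path] = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        for name in COMPOSE_NAMES:
            candidate = sub / name
            if candidate.is_file():
                projects[sub.name] = candidate
                break
    return projects


def up_command(compose_file: Path) -> list[str]:
    """Build (but do not run) the command that starts one project in the background."""
    return ["docker", "compose", "-f", str(compose_file), "up", "-d"]
```

The resulting argument list can be passed to `subprocess.run` on the host, or simply printed as a dry run.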
This seems like a sensible choice, but it would be a bit messy for multi-node, which is the direction I’m heading in.
I can’t remember what I was watching, but the argument was that Kubernetes is designed for something so large in scale that the only reason most people have heard of it is that some product manager asked what Google uses and then demanded the same, hoping to replicate Google’s success. Hobbyists subsequently followed, and now a bunch of people are using something that’s poorly optimized for such small-scale systems.
I was used to just organising my docker-compose containers without any frontend. But then I discovered CasaOS, which makes things pretty simple. An app store and an SMB-shared file manager gave me a really good workflow. Things that aren’t in the app store can be handled outside of Casa, too.
PS. But never make the mistake of importing the containers you handle outside into Casa; this messes things up.
Thanks, yeah I’ve heard good things about CasaOS. I think that I’m trying to move in the other direction though: fewer UIs and more CLIs + configuration files.
In my opinion, trying to set up a highly available, fault-tolerant homelab adds a large amount of unnecessary complexity without an equivalent benefit. It’s good to have redundancy for essential services like DNS, but otherwise I think it’s better to focus on a robust backup and restore process so that if anything goes wrong you can just restore from a backup or start containers on another node.
I configure and deploy all my applications with Ansible roles. It can programmatically create config files, pass secrets, build or start containers, cycle containers automatically after config changes, basically everything you could need.
Sure it would be neat if services could fail over automatically but things only ever tend to break when I’m making changes anyway.
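The "cycle containers automatically after config changes" part mentioned above comes down to change detection: only write a config file when its content actually differs, and report whether anything changed so that a restart can be triggered. Here is a minimal Python sketch of that idea. `write_if_changed` is a hypothetical helper for illustration, not Ansible's implementation.

```python
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write `content` to `path` only if it differs; return True when a write happened."""
    data = content.encode("utf-8")
    if path.exists() and path.read_bytes() == data:
        # Nothing to do: the caller should not restart the service.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return True
```

A deploy script would restart the container only when this returns `True`, which keeps repeated runs from cycling services unnecessarily.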
This. I used to have a Kubernetes setup, but how much redundancy can you really have at home? Do you have a generator? Multiple Internet lines?
The fact is most hardware is highly reliable. Having good backups to restore from is all you need and you gain a huge improvement in simplicity which adds reliability in and of itself.
I would say that if you are going to host it at home, then Kubernetes is more complex. Bare-metal Kubernetes control plane management has some pitfalls. But if you were to use a cloud provider like Linode or DigitalOcean and use their Kubernetes service, then the only real extra complexity is learning how to manage Kubernetes, which is minimal.
There is a decent hardware investment needed to run Kubernetes if you want it to be fully HA (which I would argue means it needs to be a minimum of 2 clusters of 3 nodes each on different continents), but you could run a single-node cluster with autoscaling at a cloud provider if you don’t need HA. I will say it’s nice not to have to worry about a service failing periodically, as it will just transfer to another node in a few seconds automatically.