I’m rebuilding my home server in NixOS.
Rather than configuring the various services natively in NixOS, I decided to run containers via virtualisation.oci-containers whenever possible, mostly to be able to update the system and the various services independently.
Everything is going smoothly, but whenever I add a container and then (for whatever reason) run nixos-rebuild boot and reboot instead of nixos-rebuild switch, podman fails to resolve the registry host (the log below shows the Docker Hub host, but it also happened with ghcr.io):
podman-apprise-start[1352]: Trying to pull docker.io/caronc/apprise:1.1.8...
podman-apprise-start[1352]: Pulling image //caronc/apprise:1.1.8 inside systemd: setting pull timeout to 5m0s
podman-apprise-start[1352]: Error: initializing source docker://caronc/apprise:1.1.8: pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io: no such host
I thought that my podman-* services were missing a dependency on network-online.target and were being started before the network was available, but that isn’t the case:
# systemctl list-dependencies podman-apprise.service 
podman-apprise.service
● ├─system.slice
● ├─network-online.target
● │ └─systemd-networkd-wait-online.service
● └─sysinit.target
●   ├─dev-hugepages.mount
[...snip...]
Do you happen to know what the issue is?
PS: Manually running systemctl start podman-whatever once fixes the issue, of course, but I wonder if there’s a more robust solution?
Update:
After investigating based on balsoft’s input below, the issue seems to be that systemd-networkd-wait-online doesn’t behave as expected (by me).
Basically, by default systemd-networkd-wait-online only waits for network interfaces to have a carrier (a working Ethernet cable) and an address that is valid on the local link. This is what the systemd-networkd docs call the “degraded” state (no, it doesn’t mean that something got worse than before… don’t read too much into what “degraded” implies in English).
In my case, I have an interface that is set up via DHCP and that also has static IPs assigned:
$ cat /etc/systemd/network/00-lan1.network 
[Match]
Name=lan1
[Network]
DHCP=ipv4
IPv6AcceptRA=no
LinkLocalAddressing=no
[Address]
Address=192.168.10.10/24
[Address]
Address=192.168.10.99/24
If you are wondering, the reason I do this is that I want static IPs for my DNS server and reverse proxy, but I also want my home server to use DHCP to fetch some network-wide configuration which, critically, includes the default route.
Back to the issue: IIUC, since the static addresses are non-link-local (which systemd-networkd confusingly calls “routable” addresses), the interface is immediately considered “routable” (a state that ranks above “degraded”), before DHCP has delivered the default route and DNS servers. So not only does it satisfy the default systemd-networkd-wait-online check right away, but even adding
[Link]
RequiredForOnline=routable
to /etc/systemd/network/00-lan1.network doesn’t make a difference whatsoever.
For now, my stopgap solution is to explicitly set the default route for the “lan1” network:
[Network]
Gateway=192.168.10.1
This seems to solve the issue with podman and, while the system still considers itself “online” before being fully configured, it will suffice until I find a more elegant/robust way (ping me in a while if you are interested).
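Another possible stopgap, which I have not wired into my config, is to poll until name resolution actually works before the image pull is attempted. Below is a minimal sketch of such a retry helper in POSIX sh. The function name wait_until and its arguments are made up for illustration, and whether and where to hook it into the podman-* units (for example via a systemd ExecStartPre=) is an assumption on my side, not something virtualisation.oci-containers does for you.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD [ARGS...]
# Hypothetical helper: runs CMD up to TRIES times, sleeping DELAY seconds
# after each failed attempt. Returns 0 as soon as CMD succeeds,
# 1 if every attempt failed.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        # Discard the command's output; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Example (commented out, needs a working resolver):
#   wait_until 30 1 getent hosts registry-1.docker.io
```

With the example line enabled, the script would give DNS up to roughly 30 seconds to come up and exit non-zero otherwise. This only papers over the ordering problem; it does not make network-online.target any more truthful.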
refs:
systemd-networkd-wait-online man page
systemd-networkd docs on “RequiredForOnline”
networkctl man page

