A Year of CoreOS Home Server

04 Jun 2022 ∼ 16 minutes / 3400 words

Fedora CoreOS has saved me. Believe me, it’s true – I was but a poor, lost soul, flitting through the hallways of self-hosting redemption that have seen so many a young folk driven to madness – all of this changed approximately a year ago, when I moved my home-server over to CoreOS, never (yet) to look back.

The journey that led me here was long and perilous, and filled with false twists and turns. The years from 2012 to 2017 were a simpler time, a mix of Ubuntu and services running directly on bare metal; so recent this was, that it might be atavism. Alas, this simplicity belied operational complexities and led to an unrelenting accumulation of cruft, so the years from 2017 to 2021 had me see the True Light of Kubernetes, albeit under a more minimal single-node setup with Minikube.

My early days with Kubernetes were carefree and filled with starry-eyed promises of a truly declarative future, so much so that I in turn declared my commitment to the world. It wasn’t long after until the rot set in, spurred by a number of issues, for example: Minikube will apparently set up local TLS certificates with a year’s expiration, after which kubectl will refuse to manage resources on the cluster, and which might cause the cluster to go belly up in the case of a reboot. And even with Kubernetes managing workloads, one still needs to have a way of setting up the host and cluster, for which there’s a myriad of self-proclaimed panaceas out there.

Clearly, the answer to complexity is even more complexity: simply sprinkle some Ansible on top and you’ve got yourself a stew. And to think there was a time where I entertained such harebrained notions.

At first, it was a twinkle, a passing glance. Fedora CoreOS doesn’t feature as large in the minds of those of us practicing the Dark Arts of Self-Hosting (though I’m hoping this changes as of this post), and is relegated to being marketed as experimental, nascent. Nothing could be further from the truth.

The pillars on which the CoreOS temple is built are three, each playing a complementary role in what makes the present gushing appropriate reading material:

Butane/Ignition, in which our host can be set up in a declarative manner and in a way which allows mere mortals such as myself to comprehend. The spec is short, read it.
Podman, in which containerized workloads are run. For many, Podman is simply a new, drop-in replacement for Docker, but it can be much more than that.
systemd, which needs little introduction, and in which all of our disparate orchestration needs are covered; service dependencies, container builds, one-time tasks, recurring tasks, all handled in all of their glorious complexities.

Tying the proverbial knot on top of these aspects is how much the system endeavours to stay out of your way, handling automatic updates and shipping with a rather hefty set of SELinux policies, ensuring that focus remains on the containers themselves.

Why Fedora CoreOS?

Before we head into the weeds, let’s try to address why you might even care about working with CoreOS; if anything, a bare-metal host will do well for most simple workloads, and Kubernetes isn’t all that unapproachable for a more complex single-node setup. How does CoreOS differentiate itself from other systems?

I can only really answer this from my own experience, by the main points that make CoreOS a worthwhile investment are:

The system is stable and robust, and is intended to be as hands-off as possible. This generally means you won’t have to worry about the base system itself across its entire life-cycle. One might argue that this is no different to any bare-metal system set up with auto-updates, though I’d personally never have these extend to system upgrades (and perhaps nothing beyond security updates).
The system is reasonably secure, and tries to make user interactions and workloads reasonably secure as well. This sometimes leads to inflexibility, as is the case with SELinux (which, if you’re not familiar with, is hell to try to understand), but the system has its way of keeping the user honest, which is a boon long-term.
The system has a good end-to-end deployment story, and is accompanied by excellent documentation. This generally means that you can rely on well-integrated workflows in testing, deploying, and updating your CoreOS-based system, and not have to resort to strange contortions or third-party/custom solutions in doing so.

Contrast these points with your typical bare-metal or Kubernetes-based setup (which is really just a layer above a bare-metal setup that you need to maintain separately):

Ubuntu and other similar efforts can be stable long-term (I probably stayed on the same LTS version of Ubuntu for 3 years), but this can lead to update stagnation and issues when time comes to move to the next major version of the OS. Most bare-metal OSs are designed to be managed as deeply as is necessary, which can lead to issues if there’s no discipline in change control.

In addition to this, keeping a Kubernetes cluster updated can be a full-time job for many, and even a single-node Minikube/K3s setup is not zero-maintenance by any means, and comes with its own set of perils.
As mentioned above, typical bare-metal setups tend to approach security in a less-than-holistic way, and give users the brunt of choice in deciding how to secure user workloads from the system, and vice-versa. Given how complicated security is, and how one shouldn’t connect a toaster, let alone an email server, to the internet for fear of having their house burn down, leaving these choices to the user may not work out for the long run.

Having user workloads run under Kubernetes improves the situation somewhat, as one is given a multitude of controls designed to separate and secure these from one another (e.g. network policies, CPU and memory limits); however, Kubernetes is also supremely complex, and is itself subject to esoteric security concerns.
Deployment, documentation, and upgrade concerns are typically rather disparate in other systems, and the quality of documentation varies wildly between communities. Kubernetes itself is well-documented, but remains complex and occupies a large surface area not typically needed for a simple home-server setup.

People tend to pick and choose solutions based on what their goals are, and the extent in which they’re comfortable learning about and maintaining these solutions long-term. If you’re looking for a system that is minimal, uses common components, and largely stays out of your way after deployment, CoreOS is a perfect middle-of-the-road solution.

Fedora CoreOS Basics

There’s a few things to keep in mind, going into a CoreOS-based setup:

The system is immutable, and you’re expected to use the system as-is, out-of-the-box, and without needing to rely on anything not installed by default. Don’t even think about reaching for rpm-ostree. In fact, don’t even think of storing anything outside of /var, and maybe /etc.
SELinux policies are pre-configured to be fairly restrictive, which means there’s quite a lot of functionality unavailable outside of interactive use – this includes things like using gpg in systemd services.
Although not in any way unstable, the Podman ecosystem is still moving fast and may not be as feature-complete as one might expect coming from Kubernetes, or even Docker.
CoreOS will auto-update even between major versions, and unless configured otherwise, will reboot as needed when new versions become available. Allowing for system reboots is good hygiene; embrace the chaos.

CoreOS comes with a sizeable amount of documentation of excellent quality which will be useful once you’re ready to get your hands dirty, but the rest of this spiel will instead focus on setting up a system based on the CoreOS Home-Server setup I depend on myself. Clone this locally and play along if you wish, though I’ll cut through the abstractions where possible, and explain the base concepts themselves.

With all that disclaimed and out the way, let’s kickstart this hunk of awesome.

Provisioning with Butane

Butane is a specification and related tooling for describing the final state of a new CoreOS-based system, using YAML as the base format; this is then compiled into a JSON-based format and used by a related system, called Ignition. Both systems follow similar semantics, but Butane is what you’ll use as you develop for your host.

Let’s imagine we’re looking to provision a bare-metal server with a with a unique hostname and set up for SSH access via public keys. Our Butane file might look like this:

variant: fcos
version: 1.4.0
passwd:
   users:
   - name: core
      ssh_authorized_keys:
      - ecdsa-sha2-nistp521 AAAAE2VjZHNhL...
storage:
  files:
  - path: /etc/hostname
     mode: 0644
     contents:
       inline: awesome-host

The default non-root user for CoreOS is aptly named core, so we add our SSH key there for convenience; Butane allows for creating an arbitrary amount of additional users, each with pre-set SSH keys, passwords, group memberships, etc.

In addition, we set our hostname not by any specific mechanism, but simply by creating the appropriate file with specific content – we could, alternatively, provide the path to a local or even a remote file (over HTTP or HTTPS). Simplicity one of Butane’s strengths, and you might find using the same basic set of directives for the vast amount of our requirements.

Place this under host/example/spec.bu if you’re using the coreos-home-server setup linked to above, or simply example.bu if not. Either way, these definitions are sufficient in having Butane produce an Ignition file, which we can then use in provisioning our imaginary CoreOS-based system. First, we need to run the butane compiler:

$ butane --strict -o example.ign example.bu

Then, we need to boot CoreOS and find a way of getting the example.ign file there. For bare-metal hosts, booting from physical media might be your first choice – either way, you’ll be dropped into a shell, waiting to install CoreOS based on a given Ignition file.

If you’re developing your Butane configuration on a machine that’s on the same local network as your home-server, you can use good ol’ GNU nc to serve the file:

# Assuming the local IP address is 192.168.1.5.
$ printf 'HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n%s\n' "$(wc -c < example.ign)" "$(cat example.ign)" | nc -vv -r -l -s 192.168.1.5

This mess of a shell command should print out a message confirming the listening address and random port assignment for our cobbled-together HTTP server. If you’re using coreos-home-server, all of this is handled by the deploy target, i.e.:

$ make deploy HOST=example VERBOSE=true

You’re then ready to refer to the HTTP URL for the host Ignition file based on the local IP address and random port assignment over at your live system:

$ sudo coreos-installer install --insecure-ignition --ignition-url http://192.168.1.5:31453 /dev/sda

Assuming all information is correct, you should now be well on your way towards installing CoreOS on the /dev/sda disk.

So, what do you do once you’re here? In short, nothing – the system, as shown in the above example, has configuration enough to give us SSH access into an otherwise bare system. CoreOS doesn’t come with much functionality other than the minimum needed to support its operations, and when I said the system is immutable, I meant it: you’re not supposed to re-apply Ignition configuration beyond first boot¹.

Instead, the blessed way of expanding the functionality of a CoreOS-based server is re-deploying it from scratch; we’ll bend this rule slightly, but it’s important to understand that we’re not intended to tinker too much with the installed system itself, as this would contradict the notion of repeatability built into CoreOS as a whole.

A simpler way of testing our changes is available to us by using virt-install, as described in this tutorial, or by using the deploy-virtual target:

$ make deploy-virtual HOST=example VERBOSE=true

This, again, is a major strength of CoreOS – alternative systems require the arrangement of a more complex and disparate set of components, in this case (most likely) something like Vagrant (in addition to, say, Ansible). Virtual hosts don’t only help in developing new integrations, but also allow us to experiment and test against the same versions of the OS that will end up running on the server itself.

Building and Running Container Services

Since the base system (deliberately) allows for little flexibility and customization, we have to explore alternative ways of extending functionality; in CoreOS, the blessed way of doing so is via Podman, a container orchestration system similar to (but not based on) Docker.

Typically, containers are presented as either methods of isolating sensitive services from the broader system alongside more traditional methods of software deployment, or as forming their own ecosystem of orchestration “above the metal”, as it were. Indeed, most distributions expect most software to be deployed via their own packaging system, and, at the other side of the spectrum, most Kubernetes cluster deployments don’t care what the underlying distribution is, assuming it fulfils some base requirements.

Fedora CoreOS stands somewhere in the middle, where Podman containers are indeed the sole reasonable method of software deployment, while not entirely divorcing this from the base system.

I had little knowledge of Podman coming into CoreOS; what I knew was that it’s essentially a drop-in replacement for Docker in many respects (including the container definition/Dockerfile format, technically part of Buildah), but integrates more tightly with Linux-specific features, and does not require a running daemon. This all remains true, and though the Podman ecosystem is still playing catch-up with Docker in a few ways (e.g. container build secrets), it has surpassed Docker in other ways (e.g. the podman generate and podman play suite of commands).

Podman and CoreOS will happily work with container images built and pushed to public registries, such as the Docker Hub, but we can also build these images ourselves with podman build; let’s start from the end here and set up a systemd service for Redis, running in its own container, under a file named redis.service:

[Unit]
Description=Redis Key-Value Store

[Service]
ExecStart=/bin/podman run --pull=never --replace --name redis localhost/redis:latest
ExecStop=/bin/podman stop --ignore --time 10 redis
ExecStopPost=/bin/podman rm --ignore --force redis

Though far from being a full example conforming to best practices, the above will suffice in showing how systemd and Podman mesh together; a production-ready service would have us use podman generate systemd or Quadlet (whenever this is integrated into CoreOS).

The service above refers to a localhost/redis image at the latest version, and also specifies --pull=never – this means that the service has no chance of running successfully, as the image referred to does not pre-exist; what we need is a way of having the required container image built before the service runs. What better way to do so than with systemd itself?

With the help of some naming and file-path conventions, we can have a templated systemd service handle container builds for us, and thus ensure container dependencies are maintained via systemd requirement and ordering directives. We’ll need an additional file called container-build@.service:

[Unit]
Description=Container Build for %I
Wants=network-online.target
After=network-online.target
ConditionPathExists=/etc/coreos-home-server/%i/Containerfile

[Service]
Type=oneshot
ExecStart=/bin/podman build --file /etc/coreos-home-server/%i/Containerfile --tag localhost/%i:latest /etc/coreos-home-server/%i

That final @ in the service file name denotes a templated systemd service, and allows us to use the service against a user-defined suffix, e.g. container-build@redis.service. We can then use this suffix via the %i and %I placeholders, as above (one has special characters escaped, the other is verbatim).

The Containerfile used by podman build is, for the most part, your well-trodden Docker container format, though the two systems might not always be at feature parity. In this case, we can cheat and just base on the official Docker image:

FROM docker.io/redis:6.2

The only thing left to do is extend our original redis.service file we with dependencies for the container-build@redis service:

[Unit]
Description=Redis Key-Value Store
Wants=container-build@redis.service
After=container-build@redis.service

Getting the files deployed to the host is simply a matter of extending our Butane configuration, i.e.:

variant: fcos
version: 1.4.0
systemd:
  units:
  - name: container-build@.service
    contents: |
      [Unit]
      ...      
  - name: redis.service
    enabled: true
    contents: |
      [Unit]
      ...      
storage:
  files:
  - path: /etc/coreos-home-server/redis/Containerfile
     mode: 0644
     contents:
       inline: "FROM docker.io/redis:6.2"

All of this is, of course, also provided in the coreos-home-server setup itself, albeit in a rather more modular way: service files are themselves placed in dedicated service directories and copied over alongside the container definitions, and can be conditionally enabled by merging the service-specific Butane configuration (e.g. in service/redis/spec.bu).

You might be wondering why you’d want to go through all this trouble of building images locally, when they’re all nicely available on some third-party platform; typically, the appeal here is the additional control and visibility this confers, but having our container definitions local to the server allows for some additional cool tricks, such as rebuilding container images automatically when the definition changes, via systemd path units:

[Unit]
Description=Container Build Watch for %I

[Path]
PathModified=/etc/coreos-home-server/%i/Containerfile
Unit=container-build@%i.service

Install this as container-build@.path and you’ll have the corresponding container image rebuilt whenever the container definition changes. You can even tie this up by having the service file depend on the path file, which will ensure the path file is active whenever the service file is used (even transitively via another service):

[Unit]
Description=Container Build for %I
Wants=network-online.target container-build@%i.path
After=network-online.target container-build@%i.path

The new container image will be used next time the systemd service is restarted – Podman has built-in mechanisms for restarting systemd units when new container image versions appear, though this requires that the services are annotated with a corresponding environment variable. See the documentation for podman auto-update for more information.

Extending Service Units

If there is one final trick up our sleeve, it’s drop-in units, which allow for partial modification of a concrete systemd unit.

Let’s imagine we wanted our Containerfile to access some protected resource that uses SSH key authentication for access (e.g. a private Github repository); to do so, we have to provide additional parameters to our podman build command, as used in our container-build@.service file.

Since there’s no real way of providing options beyond the templated unit suffix, and using an EnvironmentFile (or similar) directive means applying the same options to every instance of the templated unit, it might seem that we’d have to do away with our generic container-build service and use an ExecStartPre directive in the service unit itself.

Enter drop-in units: if, for the examples above, we create a partial systemd unit file under the container-build@redis.service.d directory with only the changes we want applied, we’ll get just that, and just for the specific instance of the templated unit (though it’s also possible to apply a drop-in for all instances). Given the following drop-in:

[Unit]
After=sshd-keygen@rsa.service

[Service]
PrivateTmp=true
ExecStartPre=/bin/install -m 0700 -d /tmp/.ssh
ExecStartPre=/bin/install -m 0600 /etc/ssh/ssh_host_rsa_key /tmp/.ssh/id_rsa
ExecStart=
ExecStart=/bin/podman build --volume /tmp/.ssh:/root/.ssh:z --file /etc/coreos-home-server/%i/Containerfile --tag localhost/%i:latest /etc/coreos-home-server/%i

The end-result here would be the same as if we had copied in the PrivateTmp and ExecStartPre directives into the original container-build@.service file, and extended the existing ExecStart directive with the --volume option. What’s more, we can further simplify our drop-in by implementing the original ExecStart directive with expansion in mind:

[Service]
Type=oneshot
ExecStart=/bin/podman build $PODMAN_BUILD_OPTIONS --file /etc/coreos-home-server/%i/Containerfile --tag localhost/%i:latest /etc/coreos-home-server/%i

Which would then have us remove ExecStart directives from the drop-in, and add an Environment directive:

[Service]
Environment=PODMAN_BUILD_OPTIONS="--volume /tmp/.ssh:/root/.ssh:z"

If not specified, the $PODMAN_BUILD_OPTIONS variable will simply expand to an empty string in the original service file, but will expand to our given options for the specific instance covered by the drop-in.

Since CoreOS hosts have stable identities, generated once at boot, we can add the public key in /etc/ssh/ssh_host_rsa_key.pub to our list of allowed keys in the remote system and have container image builds work reliably throughout the host’s lifetime.

Where to Go From Here?

There’s a lot more to cover, and clearly not all unicorns and rainbows – though the base system itself is simple enough, integration between containers makes for interesting challenges.

Future instalments will describe some of these complexities, as well as some of the more esoteric issues encountered. And, assuming it takes me another year to follow up on my promises, you’re welcome to follow along with progress on the repository itself.

This is different to systems such as Ansible, Chef, or Salt, where deployed hosts/minions are generally kept up-to-date in relation to upstream configuration ↩︎