Set up ZFS for a single-node Talos cluster
The simplest option for persisting data across pod restarts in a simple cluster is persistent volumes backed by the local path provisioner. When I wanted to do more with these volumes I quickly ran into limitations; specifically, it was not possible to take volume snapshots.
ZFS is a great filesystem and offers snapshot functionality out of the box; with the right installation this can be used in Kubernetes as well. There are countless better options for a production cluster, but this has been working well in my simple Homelab! I did the initial setup by following this blog post by Mart Roosmaa, but wanted to expand on it.
Prerequisites
The following should be available in your Homelab or installed on your local machine:
- A single machine/server with 1 disk.
- talosctl CLI (can be installed by following the official instructions)
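If talosctl is missing, the official installer script is the quickest way to get it; this is a sketch based on the documented installer, so double-check the instructions for your platform:
curl -sL https://talos.dev/install | sh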
Note
This setup gives you snapshots on one node; if you need replication across nodes, look at Longhorn or Mayastor.
Support for Raw Volumes (used in this guide) was added in Talos v1.11.0, so any version after that should be compatible. This guide is based on Talos v1.13.0!
Setup
What steps are necessary to configure the Talos Linux node to support ZFS?
Before you start
If your machine is not yet running Talos Linux, here’s a quick guide on installing Talos on a new machine.
- Using the Talos Linux Image Factory, find the correct install image
- Flash the ISO to a bootable stick with something like balenaEtcher
- Wait for the machine to boot and be ready in maintenance mode. At this point it’s easiest to connect a display directly to the machine to view logs and additional information like the IP.
- Store the IP of the machine in an environment variable so it can be reused later
export ENDPOINT="<IP>"
1. Create Config
Warning
The server should already be running Talos, but the Kubernetes cluster must not yet be bootstrapped. Volumes can only be configured at bootstrap!
Create a patch for the Talos configuration, which will later be merged with the full Talos configuration when it’s generated. This enables overriding existing fields and adding new ones. The image hash should be 4dd8e3a8b6203d3c14f049da8db4d3bb0d6d3e70c5e89dfcc1e709e81914f63c to include the ZFS kernel module; it is independent of the architecture and Talos version.
Tip
You can verify it via the Talos Linux Image Factory yourself; just make sure to select the siderolabs/zfs System Extension.
machine:
  install:
    image: factory.talos.dev/metal-installer/4dd8e3a8b6203d3c14f049da8db4d3bb0d6d3e70c5e89dfcc1e709e81914f63c:v1.13.0
  kernel:
    modules:
      - name: zfs
cluster:
  allowSchedulingOnControlPlanes: true
allowSchedulingOnControlPlanes is required for single-node clusters (where the only node is a controlplane node), since by default no pods are scheduled on controlplane nodes.
Talos Linux has a set of System Volumes that can be configured and new User and Raw Volumes can be added for use later:
- System Volume EPHEMERAL is used to store container data, logs and similar, and by default uses all the space available on the disk.
- Raw Volume openebs-zfs is used to store the ZFS pool and consequently persistent volume data.
Other System Volumes, like STATE, are created by Talos as well but are not relevant for this configuration.
Tip
If your machine has multiple disks, you shouldn’t reuse the system disk! Both of the following volume configs use the same system disk since the assumption is that only a single disk is available.
The following examples are based on a single 1TB drive, so the sizing may be different for you. The EPHEMERAL volume needs to be restricted using the maxSize field so it doesn’t fill up the whole disk.
apiVersion: v1alpha1
kind: VolumeConfig
name: EPHEMERAL
provisioning:
  diskSelector:
    match: system_disk
  maxSize: 300GB
  grow: false
Reserve space on the same disk for the ZFS filesystem.
apiVersion: v1alpha1
kind: RawVolumeConfig
name: openebs-zfs
provisioning:
  diskSelector:
    match: system_disk
  minSize: 500GB
Raw Volumes in Talos create an unformatted partition for use by CSI Drivers, exactly what’s needed so it can be used by the ZFS volume provisioner. Since the ZFS pool is located on a partition that Talos does not manage, it’s also safe across restarts.
Store all these changes in a single file as controlplane-patch.yaml.
Complete patch file
machine:
  install:
    image: factory.talos.dev/metal-installer/4dd8e3a8b6203d3c14f049da8db4d3bb0d6d3e70c5e89dfcc1e709e81914f63c:v1.13.0
  kernel:
    modules:
      - name: zfs
cluster:
  allowSchedulingOnControlPlanes: true
---
apiVersion: v1alpha1
kind: VolumeConfig
name: EPHEMERAL
provisioning:
  diskSelector:
    match: system_disk
  maxSize: 300GB
  grow: false
---
apiVersion: v1alpha1
kind: RawVolumeConfig
name: openebs-zfs
provisioning:
  diskSelector:
    match: system_disk
  minSize: 500GB
Generate secrets to get a reproducible configuration that allows regenerating the whole config later, and store the file securely (do not share it publicly).
talosctl gen secrets -o secrets.yaml
Create the config for a new cluster named k8s by using talosctl with the secrets.
talosctl gen config k8s "https://${ENDPOINT}:6443" \
--with-secrets "secrets.yaml" \
--config-patch-control-plane @"controlplane-patch.yaml" \
--output-types controlplane,talosconfig \
--output "talos" \
--with-examples=false \
--with-docs=false
This will store the generated config (controlplane.yaml as machine config to bootstrap the cluster and talosconfig to allow connecting and interacting with it from the local machine) in the talos directory relative to the current directory.
Tip
In the following section --nodes is passed as an argument to talosctl to tell Talos what address to talk to. Since this is a single-node cluster with only a single IP, the talosconfig config file can be updated to remove the need to pass it every time.
context: k8s
contexts:
  k8s:
    endpoints:
      - <ENDPOINT>
    nodes:
      - <ENDPOINT>
    ca: ...
    crt: ...
    key: ...
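Alternatively, you can point talosctl at the generated file so it’s picked up automatically, or merge it into the default config; the paths below assume you run talosctl from the directory containing the talos folder:
export TALOSCONFIG="$(pwd)/talos/talosconfig"
# or merge it into the default config at ~/.talos/config
talosctl config merge "talos/talosconfig"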
2. Bootstrap Cluster
Apply the config to the machine. --insecure is needed here since the API is still in maintenance mode and doesn’t have a certificate yet.
talosctl apply-config --insecure --nodes "$ENDPOINT" \
--file "talos/controlplane.yaml"
Bootstrap the cluster from the updated config.
talosctl bootstrap --nodes "$ENDPOINT"
Fetch kubeconfig so you can access the Kubernetes API.
talosctl kubeconfig --nodes "$ENDPOINT"
Wait for the node to be ready to continue.
kubectl get nodes -w
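One way to double-check that Talos created the raw volume partition is to list the discovered volumes and look for the r-openebs-zfs partition label in the output:
talosctl get discoveredvolumes --nodes "$ENDPOINT"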
3. Configure ZFS
Create a privileged pod to modify the host’s filesystem. Since Talos doesn’t allow you to SSH onto the host, this is the only option to configure it. The namespace for the pod is not important, as long as it allows privileged pods.
kubectl run zfs-shell \
  --image=debian \
  --restart=Never \
  --overrides='{
    "spec": {
      "hostIPC": true,
      "hostNetwork": true,
      "hostPID": true,
      "containers": [{
        "name": "shell",
        "image": "debian",
        "command": ["sleep", "infinity"],
        "securityContext": {"privileged": true}
      }]
    }
  }'
Exec into the pod to run the following commands in a privileged context.
kubectl exec zfs-shell -it -- sh
Create the ZFS pool by running the following command in the privileged container. Passing -m legacy tells ZFS that we’ll take care of mount points ourselves. The pool is created on the partition named after the RawVolumeConfig from the Talos config (openebs-zfs), prefixed with r-.
nsenter --mount=/proc/1/ns/mnt -- zpool create \
-m legacy \
-f zfspv-pool \
/dev/disk/by-partlabel/r-openebs-zfs
Tip
The pool name used here is zfspv-pool and can be changed; it just needs to be the same name later.
Verify that the ZFS pool is online by running the following command in the privileged container.
nsenter --mount=/proc/1/ns/mnt -- zpool status
Expected status output
  pool: zfspv-pool
 state: ONLINE
  scan: none requested
config:
        NAME              STATE     READ WRITE CKSUM
        zfspv-pool        ONLINE       0     0     0
          r-openebs-zfs   ONLINE       0     0     0
errors: No known data errors
Now ZFS is ready on the host and can be used in Kubernetes!
Clean up the privileged pod after you are done.
kubectl delete pod zfs-shell
4. Use ZFS in Kubernetes
Create the following values file for the OpenEBS umbrella chart with additional configuration for the OpenEBS zfs-localpv chart and store it as values.yaml. Most components can be disabled since they are not needed for ZFS to function.
preUpgradeHook:
  enabled: false
localpv-provisioner:
  analytics:
    enabled: false
  localpv:
    enabled: false
  hostpathClass:
    enabled: false
openebs-crds:
  csi:
    volumeSnapshots:
      enabled: true
      keep: true
zfs-localpv:
  zfsNode:
    encrKeysDir: /var/openebs/keys
lvm-localpv:
  enabled: false
mayastor:
  enabled: false
engines:
  local:
    lvm:
      enabled: false
    zfs:
      enabled: true
  replicated:
    mayastor:
      enabled: false
loki:
  enabled: false
alloy:
  enabled: false
minio:
  enabled: false
Because Talos does not allow the default directory to be writable, setting encrKeysDir is required; otherwise the CSI driver cannot register on the node.
Install the Helm chart. The command will wait for all pods to be in a ready state and roll back if it takes too long (--rollback-on-failure since Helm v4, was --atomic in Helm v3).
helm install openebs oci://ghcr.io/openebs/dev/helm/openebs \
--rollback-on-failure \
-f values.yaml
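Once the release is installed, a quick sanity check is to confirm that the ZFS CSI driver registered and its pods are running; the exact pod names depend on the chart version, so the grep pattern below is just a rough filter:
kubectl get csidrivers
kubectl get pods | grep zfs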
Create a StorageClass that uses the ZFS CSI driver. poolname has to match the name of the pool that was created in Configure ZFS.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: host-zfs
provisioner: zfs.csi.openebs.io
allowVolumeExpansion: true
parameters:
  poolname: "zfspv-pool"
  recordsize: "128k"
  compression: "lz4"
  dedup: "off"
  fstype: "zfs"
  shared: "yes"
All these values can be left as is, but keep in mind that they may not be ideal depending on what’s running in the cluster. For example, recordsize: "128k" is appropriate for large sequential workloads but poor for databases. shared: "yes" allows the same volume to be used by more than one pod and can therefore be used to share configuration across pods.
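If databases run in the cluster, one option is a second StorageClass with a smaller recordsize; the name host-zfs-db and the 16k value below are illustrative and should be tuned for the workload:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: host-zfs-db
provisioner: zfs.csi.openebs.io
allowVolumeExpansion: true
parameters:
  poolname: "zfspv-pool"
  recordsize: "16k"
  compression: "lz4"
  dedup: "off"
  fstype: "zfs"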
On startup or reboot the Node registrar will check for the ZFS pool and import it for use in Kubernetes.
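If you want to verify that the pool was picked up, zfs-localpv exposes the detected pools through its ZFSNode custom resource; a quick way to check across all namespaces:
kubectl get zfsnodes.zfs.openebs.io -A -o yaml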
Usage
The name of the previously created StorageClass has to be used as storageClassName for any PersistentVolumeClaims that should use it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: config
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: host-zfs
You can check that the setup works as expected by creating the example PVC, then checking the status of the resource and making sure it’s Bound.
kubectl get pvc config
The data on the volume owned by this persistent volume claim will now be stored in the previously created ZFS pool on the host.
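To actually exercise the volume, a minimal pod can mount the claim; the pod name, image and mount path here are arbitrary:
apiVersion: v1
kind: Pod
metadata:
  name: zfs-test
spec:
  containers:
    - name: app
      image: debian
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: config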
You can then create a VolumeSnapshotClass since the previously installed Helm chart already has everything required to use volume snapshots.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: host-zfs-snapshot
driver: zfs.csi.openebs.io
deletionPolicy: Delete
With this VolumeSnapshotClass the persistent volumes now support volume snapshots and can be used for backups, for example by using VolSync and Kopia.
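A snapshot of the example claim then looks like this; the snapshot name is arbitrary, while the class and claim names match the resources created above:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: config-snapshot
spec:
  volumeSnapshotClassName: host-zfs-snapshot
  source:
    persistentVolumeClaimName: config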
Summary
We’ve configured a new Talos Linux Cluster to use ZFS:
- Configured Talos to provide enough space for a ZFS Pool
- Created ZFS Pool on the Host
- Used ZFS Pool with the OpenEBS Helm chart to back volumes
Tip
All resources can be found in chrismuellner/home-ops which manages my personal Kubernetes cluster!
Bonus
Maintenance
It’s recommended to regularly scrub all data in a ZFS pool to verify its correctness. This can be achieved with zpool scrub.
zpool scrub zfspv-pool
The easiest way to run this is via a privileged CronJob! This should be run regularly, with the output of zpool status checked and interpreted as well, but that’s beyond the scope of this post.
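A rough sketch of such a CronJob, reusing the privileged nsenter approach from the setup; the schedule, names and image are assumptions:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: zpool-scrub
spec:
  schedule: "0 3 1 * *" # 03:00 on the first day of each month
  jobTemplate:
    spec:
      template:
        spec:
          hostPID: true
          restartPolicy: Never
          containers:
            - name: scrub
              image: debian
              command:
                - nsenter
                - --mount=/proc/1/ns/mnt
                - --
                - zpool
                - scrub
                - zfspv-pool
              securityContext:
                privileged: true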
Tip
There’s an example of what this job could look like in Flux resources.