# Home Assistant
Originally, I tried to keep the Home Assistant ecosystem completely self-contained. Every component ran on one Raspberry Pi. The thought was that this would make it more resilient, so that network or infrastructure problems would be less likely to affect smart home operations. Ultimately, it turns out this actually made it noticeably less resilient, as the Raspberry Pi became a single point of failure for the whole system.
When we moved to the new house, Home Assistant was unavailable for several days, as I did not have a way to power and run the Raspberry Pi. Since none of the smart home devices were installed yet, we initially did not think this was an issue. We had forgotten to think about the shopping list and the chore tracker, though, and how much we have come to rely on them.
Given how quickly and seamlessly the applications deployed in Kubernetes came back online after the move, it suddenly made sense to move Home Assistant there as well.
## Ecosystem
The Home Assistant ecosystem consists of these components:
- Home Assistant Core (API and Front-end)
- PostgreSQL (State history database)
- Mosquitto (MQTT server)
- Zigbee2MQTT (Zigbee integration)
- ZWaveJS2MQTT (ZWave integration)
- Piper (Text-to-speech)
- Whisper (Speech-to-text)
Each of these components runs in a container in its own pod within the `home-assistant` namespace.
## Home Assistant Core
The core component of the Home Assistant ecosystem is the Home Assistant server itself. Only a single instance of the server can run within a given ecosystem, as Home Assistant is not cluster-aware. Home Assistant state is stored on the filesystem, so the server runs in a pod managed by a StatefulSet with a PersistentVolumeClaim.
The Home Assistant HTTP server, which hosts the UI, WebSocket, and REST API, is exposed by a Service resource, which in turn is proxied by an Ingress resource.
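A minimal sketch of this wiring (resource names, the hostname, and the ingress class are illustrative; Home Assistant listens on port 8123 by default):

```yaml
# Service exposing the Home Assistant HTTP server (default port 8123).
apiVersion: v1
kind: Service
metadata:
  name: home-assistant
  namespace: home-assistant
spec:
  selector:
    app: home-assistant
  ports:
    - name: http
      port: 8123
      targetPort: 8123
---
# Ingress proxying the Service; the hostname is a placeholder.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: home-assistant
  namespace: home-assistant
spec:
  rules:
    - host: hass.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: home-assistant
                port:
                  number: 8123
```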
### ConfigMaps

Although most Home Assistant configuration is managed by its web UI, some settings and integrations are read from manually-managed YAML files; notable examples include the Shell Command and Group integrations. To make these files easier to edit, they are stored in a ConfigMap that is mounted into the Home Assistant container. Since the kubelet will not automatically update mounted ConfigMaps when files are mounted individually, the entire ConfigMap has to be mounted as a directory. Files that must exist within the configuration directory (i.e. `/config`) need symbolic links pointing to the respective files in the ConfigMap mount point.
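The pattern looks roughly like this (mount path, volume name, and file name are illustrative, not the actual manifest):

```yaml
# Pod template fragment: mount the whole ConfigMap as a directory,
# then symlink individual files into /config. An init container is
# one way to create the links on the persistent volume.
spec:
  initContainers:
    - name: link-config
      image: busybox
      command:
        - sh
        - -c
        # Link a ConfigMap-managed file into the configuration directory.
        - ln -sf /hass-configmap/shell-command.yaml /config/shell-command.yaml
      volumeMounts:
        - name: config
          mountPath: /config
  containers:
    - name: home-assistant
      volumeMounts:
        - name: config
          mountPath: /config          # PersistentVolumeClaim
        - name: extra-config
          mountPath: /hass-configmap  # entire ConfigMap as a directory
  volumes:
    - name: extra-config
      configMap:
        name: home-assistant-extra-config
```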
## PostgreSQL
Although Home Assistant stores all of its internal state in JSON files on the filesystem, it uses a relational SQL database for state history. This gives it the ability to chart historical values for e.g. sensors, as well as provide the Logbook view. By default, Home Assistant uses a SQLite database file, stored on the filesystem alongside the other state files, but it also supports other RDBMS engines, including PostgreSQL. Using PostgreSQL instead of SQLite has a few advantages:
- More historical values can be retained without introducing performance issues
- Events can be recorded immediately instead of batched
- Backups and recovery are managed externally
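Pointing the recorder at PostgreSQL is done in `configuration.yaml`; a sketch (credentials and retention values here are placeholders, not the real settings):

```yaml
# configuration.yaml fragment: use PostgreSQL for state history.
recorder:
  db_url: postgresql://hass:CHANGEME@postgres.example.svc:5432/hass
  purge_keep_days: 90   # keep more history than the SQLite default
  commit_interval: 0    # record events immediately instead of batching
```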
PostgreSQL is not managed directly in this deployment; rather, the Kustomization file patches the Home Assistant StatefulSet to provide environment variables pointing at an externally-managed PostgreSQL database. My Kubernetes cluster has a single PostgreSQL cluster, managed by the postgres operator, that hosts databases for several applications.
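Such a patch might look like this (target name, variable name, and host are illustrative):

```yaml
# kustomization.yaml fragment: inject database connection details
# into the Home Assistant StatefulSet via a JSON patch.
patches:
  - target:
      kind: StatefulSet
      name: home-assistant
    patch: |-
      - op: add
        path: /spec/template/spec/containers/0/env/-
        value:
          name: POSTGRES_HOST
          value: postgres.database.svc.cluster.local
```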
## Mosquitto
Most of my custom integrations, including remote control of the heads-up displays, the chore list, and the Board Board™, are implemented using MQTT, as is Frigate. Thus, the Home Assistant ecosystem needs an MQTT message broker. Mosquitto is a lightweight but complete implementation that works well with Home Assistant. It is extremely configurable, supporting various authentication, authorization, and access control mechanisms.
Home Assistant MQTT discovery relies heavily on retained MQTT messages, so enabling persistence for Mosquitto is very important. Without it, retained messages would be lost when the broker restarts, and with them every Home Assistant entity configured via MQTT discovery.
Since Mosquitto is not clustered and persists data to the filesystem, it is deployed as a StatefulSet with a PersistentVolumeClaim.
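Persistence is enabled in `mosquitto.conf`; a sketch, assuming the persistence directory is where the PersistentVolumeClaim is mounted:

```
# mosquitto.conf fragment: persist the message store (including
# retained messages) so it survives broker restarts.
persistence true
persistence_location /mosquitto/data/
autosave_interval 1800
```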
## Zigbee2MQTT
Zigbee2MQTT provides a bridge between a Zigbee network and Home Assistant via MQTT. Zigbee devices communicate with the controller, which is attached to a server via USB. Messages received from devices are published to the message queue, and vice versa. Zigbee2MQTT stores its state on the filesystem, so the StatefulSet needs a PersistentVolumeClaim.
Zigbee2MQTT also exposes a web UI for configuration and administration of the Zigbee network. This UI is exposed by a Service and an Ingress, and protected by Authelia.
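Because the Zigbee controller is a USB character device, the pod needs access to it on the host. One way to grant this (the device path and the use of a privileged container are assumptions; the path depends on the adapter):

```yaml
# Pod spec fragment: expose the host's Zigbee radio to the container.
containers:
  - name: zigbee2mqtt
    securityContext:
      privileged: true   # simplest way to access the character device
    volumeMounts:
      - name: zigbee-radio
        mountPath: /dev/ttyACM0
volumes:
  - name: zigbee-radio
    hostPath:
      path: /dev/ttyACM0
      type: CharDevice
```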
## ZWaveJS2MQTT
Similar to Zigbee2MQTT, ZWaveJS2MQTT provides a bridge between a Z-Wave network and Home Assistant. While its name suggests it uses MQTT, that layer can actually be bypassed: Home Assistant can communicate directly with the ZWaveJS2MQTT server via a WebSocket connection.
ZWaveJS2MQTT has a web UI, which is exposed by a Service and an Ingress, protected by Authelia. It stores state on the filesystem, and thus requires a StatefulSet with a PersistentVolumeClaim.
## Piper/Whisper
Piper and Whisper provide the text-to-speech and speech-to-text capabilities, respectively, for Home Assistant Voice Control. These processes are designed to run as Add-Ons for Home Assistant OS, but work just fine as Kubernetes containers as well.
Piper and Whisper need mutable storage in order to download their machine learning models. Since the model data are downloaded automatically when the container starts, using ephemeral volumes is sufficient.
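An `emptyDir` volume covers this case (container name and mount path are illustrative):

```yaml
# Pod spec fragment: ephemeral storage for downloaded models, which
# Piper/Whisper re-fetch automatically when the container starts.
volumes:
  - name: models
    emptyDir: {}
containers:
  - name: whisper
    volumeMounts:
      - name: models
        mountPath: /data
```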
## Raspberry Pi Node
While Home Assistant Core and Mosquitto can run on any node in the Kubernetes cluster, Zigbee2MQTT and ZWaveJS2MQTT obviously have to run on the node where their respective devices are attached. Originally, I had intended to run them as containers on a Raspberry Pi, managed by Podman. While I was setting this up, though, it occurred to me that that was not even necessary; Kubernetes has all the necessary functionality to run containers on a specific node and enable them to communicate with local hardware.
To that end, I have added a Raspberry Pi running Fedora CoreOS to the k8s cluster and attached the Zigbee and Z-Wave radios to it. This node has two special labels: `node-role.kubernetes.io/zigbee-ctrl` and `node-role.kubernetes.io/zwave-ctrl`, indicating that it has the Zigbee and Z-Wave controllers, respectively, attached to it. The Zigbee2MQTT and ZWaveJS2MQTT pods have node selectors that match these labels, ensuring that they are only scheduled on the correct node.
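For example (the empty label value is an assumption; only the key has to match):

```yaml
# Zigbee2MQTT pod spec fragment: pin the pod to the node that has
# the Zigbee controller attached.
nodeSelector:
  node-role.kubernetes.io/zigbee-ctrl: ""
```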
Since my Kubernetes cluster uses Longhorn for storage management, which exposes volumes to pods via iSCSI, no state is actually stored on the Raspberry Pi.
To prevent pods besides Zigbee2MQTT and ZWaveJS2MQTT from being scheduled on the Raspberry Pi, it has a `du5t1n.me/machine=raspberrypi:NoExecute` taint.
The Zigbee2MQTT and ZWaveJS2MQTT pods, as well as critical services that are
deployed on every node in the cluster via DaemonSet resources, such as Calico
and Longhorn, are configured with a toleration for this taint. All other
pods, which do not have such a toleration, will never be scheduled on this
node.
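The taint can be applied with `kubectl taint nodes <node> du5t1n.me/machine=raspberrypi:NoExecute`, and the matching toleration in the pod specs looks like this:

```yaml
# Toleration matching the node's taint, added to the Zigbee2MQTT and
# ZWaveJS2MQTT pod specs (and carried by cluster-wide DaemonSets).
tolerations:
  - key: du5t1n.me/machine
    operator: Equal
    value: raspberrypi
    effect: NoExecute
```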