Let's run `updatebot` on Saturday morning, so I can apply the changes
over the weekend if I have time. If I don't, there's no harm in having
the PRs open for a few days until I can get to it during the week.
Restic backups are now stored in MinIO on _chromie.pyrocufflink.blue_.
All data have been migrated from _burp1.p.b_, which is being
decommissioned.
The instance of MinIO on _chromie_ uses a certificate signed by DCH CA,
rather than the _pyrocufflink.blue_ wildcard certificate signed by
ZeroSSL. As such, we need to configure `restic` to trust the DCH Root
CA certificate in order to use the MinIO S3 API.
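For example, the CA can be passed to `restic` directly on the command
line; the repository URL and CA path here are illustrative, not the
actual deployment values:

```sh
# Sketch: trust the DCH Root CA when talking to the MinIO S3 endpoint.
# Repository URL and certificate path are assumptions; adjust as needed.
export RESTIC_REPOSITORY='s3:https://chromie.pyrocufflink.blue/restic'
restic --cacert /etc/pki/tls/certs/dch-root-ca.pem snapshots
```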
The latest version of `updatebot` has two major changes:
1. Projects can encompass multiple images, eliminating the need for
multiple configuration files and CronJobs. Projects are now defined
in a YAML document, since the data structure is deeply nested and is
cumbersome to express in TOML.
2. Pull requests can now include a diff of the resources that will
change if the PR is merged. This requires the `kubectl` and `diff`
programs (which are not currently included in the _updatebot_
container image, so we bind-mount them from the host) and permission
to compare the local manifests using the Kubernetes API. Oddly,
computing the diff requires permission to use the PATCH method, even
though the client is not requesting any changes. This is apparently
a long-standing bug ([issue #981][0]) that may or may not ever be
fixed.
[0]: https://github.com/kubernetes/kubectl/issues/981
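A multi-image project definition might look something like the
following sketch; the field names are illustrative, since `updatebot`
is a bespoke tool and its actual schema may differ:

```yaml
# Illustrative schema only; the real updatebot configuration may differ.
name: home-assistant
images:
  - image: ghcr.io/home-assistant/home-assistant
    source: github
    repo: home-assistant/core
    manifests:
      - home-assistant/kustomization.yaml
  - image: docker.io/eclipse-mosquitto
    source: dockerhub
    manifests:
      - home-assistant/mosquitto/deployment.yaml
```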
`updatebot` is a script I wrote that automatically opens Gitea Pull
Requests to update container image references in Kubernetes resource
manifests. It checks GitHub or Docker Hub for the latest release and
updates manifests or Kustomization configuration files to point to the
current version. It then commits the changes and opens a pull request
in Gitea. When combined with ArgoCD automatic synchronization, this
makes updating Kubernetes-deployed applications as simple as clicking
the merge button in the Gitea PR.
To start with, we'll automate Home Assistant upgrades this way.
This template sensor will be migrated to a helper, since Home Assistant
removed the `forecast` attribute of weather sensors and now requires
calling an action (service) to get those data.
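For reference, fetching forecast data now looks roughly like this in
an automation or script sequence (the entity ID is a placeholder):

```yaml
# weather.home is a placeholder entity ID
- action: weather.get_forecasts
  target:
    entity_id: weather.home
  data:
    type: daily
  response_variable: forecast
```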
Now that the reverse proxy that handles requests from the Internet uses
TLS pass-through, the Ingress for _ntfy_ needs to recognize both the
internal and external name.
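Concretely, that means listing both hostnames in the Ingress rules and
TLS section, roughly like this sketch (the external hostname is an
assumption):

```yaml
# Sketch only; the external hostname is a placeholder.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ntfy
spec:
  tls:
    - hosts:
        - ntfy.pyrocufflink.blue
        - ntfy.example.net   # external name (placeholder)
  rules:
    - host: ntfy.pyrocufflink.blue
      http: &backend
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ntfy
                port:
                  number: 80
    - host: ntfy.example.net  # external name (placeholder)
      http: *backend
```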
Now that the reverse proxy for Internet-facing sites uses TLS
passthrough, the certificate for the _darkchestofwonders.us_ Ingress
needs to match the site's name. Since Ingress resources can only use the
default certificate (_*.pyrocufflink.blue_) or a certificate from their
same namespace, we have to move the Certificate and its corresponding
Secret into the _websites_ namespace. Fortunately, this is easy enough
to do by setting the appropriate annotations on the Ingress.
To keep the existing certificate (until it expires), I moved the Secret
manually:
```sh
kubectl get secret dcow-cert -o yaml | grep -v namespace | kubectl create -n websites -f -
```
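Going forward, cert-manager can manage the certificate in the
_websites_ namespace via annotations on the Ingress itself; a sketch
(the issuer name is an assumption):

```yaml
metadata:
  annotations:
    # Issuer name is a placeholder
    cert-manager.io/cluster-issuer: zerossl
spec:
  tls:
    - hosts:
        - darkchestofwonders.us
      secretName: dcow-cert
```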
The VM hosts are now managed by the "main" Ansible inventory and thus
appear in the host list ConfigMap. As such, they do not need to be
listed explicitly in the static targets list.
There's evidently a bug somewhere in `mqttmarionette`: it
occasionally gets "stuck" in a state where it is running but does
not reconnect to the MQTT broker. In such situations, it has to be
restarted (and even then it doesn't shut down correctly but has to
be killed with SIGKILL, usually). I have been doing this manually, but
with this shell script and a corresponding "shell command" integration
in Home Assistant, it can be done automatically. This is similar to
how Home Assistant restarts Mopidy on the living room stereo when it
gets into the same kind of state.
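The script amounts to something like this sketch (the unit name and
timeout are assumptions):

```sh
#!/bin/sh
# Stop mqttmarionette, escalating to SIGKILL if it hangs on shutdown,
# then start it again. Unit name and timeout are assumptions.
if ! timeout 30 systemctl stop mqttmarionette.service; then
    systemctl kill --signal=SIGKILL mqttmarionette.service
fi
systemctl start mqttmarionette.service
```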
Some machines have the same volume mounted multiple times (e.g.
container hosts, BURP). Alerts will fire for all of these
simultaneously when the filesystem usage passes the threshold. To avoid
getting spammed with a bunch of messages about the same filesystem,
we'll group alerts from the same machine.
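In Alertmanager terms, that's a `group_by` on the instance label,
something like:

```yaml
# Group filesystem alerts from the same machine into one notification
route:
  group_by: ['alertname', 'instance']
```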
I'm not using Matrix for anything anymore, and it seems to have gone
offline. I haven't fully decommissioned it yet, but the Blackbox scrape
is failing, so I'll just disable that bit for now.
This machine never worked correctly; the USB-RS232 adapters would stop
working randomly (and of course it would be whenever I needed to
actually use them). I thought it was something wrong with the server
itself (a Raspberry Pi 3), but the same thing happened when I tried
using a Pi 4.
The new backup server has a plethora of on-board RS-232 ports, so I'm
going to use it as the serial console server, too.
I've rebuilt the Unifi Network controller machine (again);
*unifi3.pyrocufflink.blue* has replaced *unifi2.p.b*. The
`unifi_exporter` no longer works with the latest version of Unifi
Network, so it's not deployed on the new machine.
Zigbee2MQTT commits the cardinal sin of storing state in its
configuration file. This means the file has to be writable and thus
stored in persistent storage rather than in a ConfigMap. As a
consequence, making changes to the configuration when the application is
not running is rather difficult. Case in point: when I added the
internal alias for _mqtt.pyrocufflink.blue_ pointing to the in-cluster
service, Zigbee2MQTT became unable to connect to the broker because it
was using the node port instead of the internal port. Since it could
not connect to the broker, it refused to start, and thus the container
would not stay running long enough to fix the configuration to point
to the correct port.
Fortunately, Zigbee2MQTT also allows configuring settings via
environment variables, which can be managed with a ConfigMap. Since
values read from environment variables override those from the
configuration file, pointing to the correct broker port with an
environment variable was sufficient to allow the application to start.
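Zigbee2MQTT maps any configuration setting onto a
`ZIGBEE2MQTT_CONFIG_*` environment variable, so the override can live
in a ConfigMap; the hostname and port here are placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: zigbee2mqtt-env
data:
  # Overrides mqtt.server in configuration.yaml (host/port placeholders)
  ZIGBEE2MQTT_CONFIG_MQTT_SERVER: mqtt://mqtt.pyrocufflink.blue:1883
```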
Having name overrides for in-cluster services breaks ACME challenges,
because the server tries to connect to the Service instead of the
Ingress. To fix this, we need to configure both _cert-manager_ and
_step-ca_ to *only* resolve names using the network-wide DNS server.
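One way to do that is to pin the Pods' DNS configuration instead of
using the cluster resolver; a sketch (the nameserver IP is a
placeholder):

```yaml
# Applies to the cert-manager/step-ca Pod spec; the IP is a placeholder
spec:
  dnsPolicy: None
  dnsConfig:
    nameservers:
      - 172.30.0.1
```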
It turns out, `step ca renew` _can_ renew certificates without mTLS; it
has a `--mtls=false` command-line argument that configures it to use
a JWT signed by the certificate, instead of using the certificate at
the transport layer. This allows clients to renew their certificates
without needing another authentication mechanism, even with the
TLS-terminating proxy.
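The renewal command is then something like this (certificate and key
paths are placeholders):

```sh
# Renew over plain TLS, authenticating with a JWT signed by the
# existing key instead of client certificate authentication
step ca renew --mtls=false --force /etc/step/site.crt /etc/step/site.key
```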
Invoice Ninja allows attaching documents to invoices, payments,
expenses, etc. Tabitha wants to use this feature to attach receipts for
her expenses, but the photos her phone takes of them are too large for
the default nginx client body limit. We can raise this limit on the
ingress, but we also need to raise it on the "inner" nginx.
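On the ingress-nginx side this is an annotation, and the inner nginx
needs a matching `client_max_body_size` directive; the size here is an
assumption:

```yaml
metadata:
  annotations:
    # Raise the upload limit for receipt photos (size is an assumption)
    nginx.ingress.kubernetes.io/proxy-body-size: 32m
```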
The Invoice Ninja container is not designed to be immutable at all; it
makes a bunch of changes to its own contents when it starts up.
Notably, it copies the contents of the `public` and `storage`
directories from the container image to the persistent volume _and then
deletes the source_. Additionally, being a Laravel application, it
needs write access to its own code for caching, etc. Previously, the
`init.sh` script copied the entire `app` directory to a temporary
directory, and then the runtime container mounted that volume over the
top of the original location. This allowed the root filesystem of the
container to be read-only, while the `app` directory was still mutable.
Unfortunately, this makes the startup process incredibly slow, as it
takes a couple of minutes to copy the whole application. It's also
pretty pointless, because the application runs as an unprivileged
process, so it wouldn't have write access to the rest of the filesystem
anyway. As such, I've decided to remove the `readOnlyRootFilesystem`
restriction, and allow the container to run as upstream intends, albeit
begrudgingly.
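In the Pod spec, that just means dropping (or explicitly disabling)
the restriction in the container's `securityContext`; the other fields
here are illustrative:

```yaml
securityContext:
  # readOnlyRootFilesystem removed; the image rewrites its own files
  readOnlyRootFilesystem: false
  runAsNonRoot: true
```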