At some point, Firefly III added an `ALLOW_WEBHOOKS` option. It's set
to `false` by default, but it didn't seem to have any effect on
_running_ webhooks, only on visiting the webhooks configuration page. Now,
that seems to have changed, and the setting needs to be enabled in order
for the webhooks to run.
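For reference, this is roughly what the setting looks like in the
environment ConfigMap (the resource name is an assumption; in practice
the value comes from the generated ConfigMap described below):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: firefly-iii-env   # assumed name
data:
  # Must now be enabled for webhooks to actually fire, not just to
  # view the webhooks configuration page
  ALLOW_WEBHOOKS: "true"
```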
I'm not sure why `disableNameSuffixHash` was set on the ConfigMap
generator. It shouldn't be; with the name suffix hash enabled, Kustomize
ensures the Pod is restarted whenever the contents of the ConfigMap
change.
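A sketch of the generator without the offending option (file and
resource names are assumptions):

```yaml
# kustomization.yaml (fragment)
configMapGenerator:
  - name: firefly-iii-env
    envs:
      - firefly-iii.env
    # No disableNameSuffixHash: the generated name includes a content
    # hash, so the Deployment's pod template changes (and the Pod is
    # recreated) whenever the data changes
```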
This network policy blocks all outbound communication except to the
designated internal services. This will help prevent any data
exfiltration in the unlikely event that Firefly III were compromised.
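A sketch of the shape of the policy; the labels and allowed
destinations here are placeholders for the actual internal services:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: firefly-iii-egress
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: firefly-iii   # assumed label
  policyTypes:
    - Egress
  egress:
    # DNS
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
    # The database, and nothing else
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: postgresql   # assumed label
      ports:
        - protocol: TCP
          port: 5432
```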
RustDesk is a remote assistance software solution. The open source
edition is sufficient for what I want to do with it, namely: help Mom
and Dad troubleshoot issues on their PCs. Mom is currently having
trouble with the Nextcloud sync client, so I need to be able to help her
with that.
Sometimes, Grafana gets pretty slow, especially when it's running on one
of the Raspberry Pi nodes. When this happens, the health check may take
longer than the default timeout of 1 second to respond. This then marks
the pod as unhealthy, even though it's still working.
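Raising the probe timeout is a one-line fix; a sketch of the relevant
probe (the new value is an arbitrary choice):

```yaml
livenessProbe:
  httpGet:
    path: /api/health
    port: 3000
  # Grafana on a Raspberry Pi can take longer than the 1 second default
  # to answer, which was getting the pod killed while it was still fine
  timeoutSeconds: 10
```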
The `k8s-reboot-coordinator` coordinates node reboots throughout the
cluster. It runs as a DaemonSet, watching for the presence of a
sentinel file, `/run/reboot-needed`, on the node. When the file appears,
it acquires a lease, to ensure that only one node reboots at a time,
cordons and drains the node, and then triggers the reboot by running
a command on the host. After the node has rebooted, the daemon will
release the lease and uncordon the node.
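Roughly, the DaemonSet looks like this (image name, labels, and RBAC
details are placeholders; the mechanism that actually runs the reboot
command on the host is omitted):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: k8s-reboot-coordinator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: k8s-reboot-coordinator
  template:
    metadata:
      labels:
        app.kubernetes.io/name: k8s-reboot-coordinator
    spec:
      # The ServiceAccount needs RBAC for nodes (cordon/uncordon),
      # pod eviction (drain), and coordination.k8s.io Leases
      serviceAccountName: k8s-reboot-coordinator
      containers:
        - name: coordinator
          image: k8s-reboot-coordinator   # placeholder
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            # Only needs to see the sentinel file, not write anything
            - name: run
              mountPath: /run
              readOnly: true
      volumes:
        - name: run
          hostPath:
            path: /run
```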
The `policy` Kustomize project defines various cluster-wide security
policies. Initially, this includes a Validating Admission Policy that
prevents pods from using the host's network namespace.
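A sketch of such a policy (this assumes a cluster where the v1 API is
available; the match constraints and binding details are assumptions):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: disallow-host-network
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    - expression: "!has(object.spec.hostNetwork) || object.spec.hostNetwork == false"
      message: Pods may not use the host's network namespace.
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: disallow-host-network
spec:
  policyName: disallow-host-network
  validationActions:
    - Deny
```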
The _updatebot_ has been running with an old configuration for a while:
it was correctly identifying updates to ZWaveJS UI and Zigbee2MQTT, but
generating overrides for the wrong OCI image names.
Buildroot jobs really benefit from having a persistent workspace volume
instead of an ephemeral one. This way, only the packages, etc. that
have changed since the last build need to be built, instead of the whole
toolchain and operating system.
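A minimal sketch of the claim that could back the workspace (name,
size, and how the CI system mounts it are all assumptions):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: buildroot-workspace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi   # Buildroot output trees get large
```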
As with AlertManager, the point of having multiple replicas of `vmagent`
is so that one is always running, even if the other fails. Thus, we
want to start the pods in parallel so that if the first one does not
come up, the second one at least has a chance.
If something prevents the first AlertManager instance from starting, we
don't want to wait forever for it before starting the second. That
pretty much defeats the purpose of having two instances. Fortunately,
we can configure Kubernetes to bring up both instances simultaneously by
setting the pod management policy to `Parallel`.
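Assuming AlertManager runs as a plain StatefulSet (if an operator
manages it, the equivalent knob lives on that resource instead), the
change is a single field:

```yaml
spec:
  # Create both replicas at once instead of waiting for pod 0 to become
  # Ready before starting pod 1 (the default OrderedReady behavior)
  podManagementPolicy: Parallel
  replicas: 2
```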
We also don't need a 4 GB volume for AlertManager; even 500 MB is
way too big for the tiny amount of data it stores, but that's about the
smallest size a filesystem can be.
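The claim template can be shrunk accordingly; a sketch with assumed
names:

```yaml
volumeClaimTemplates:
  - metadata:
      name: alertmanager-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 500Mi   # silences and notification state need far less
```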
The `cert-exporter` is no longer needed. All websites manage their own
certificates with _mod_md_ now, and all internal applications that use
the wildcard certificate fetch it directly from the Kubernetes Secret.
_bw0.pyrocufflink.blue_ was decommissioned some time ago, so it
doesn't get backed up any more. We want to keep its previous backups
around, though, in case we ever need to restore something. This
triggers the "no recent backups" alert, since the last snapshot is over
a week old. Let's ignore that hostname when generating this alert.
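A sketch of how the exclusion might look in the rule; the metric name
here is a placeholder, not necessarily what the real rule uses:

```yaml
- alert: NoRecentBackups
  # hostname label and metric name are assumptions
  expr: >-
    time() - backup_last_snapshot_timestamp_seconds{hostname!="bw0.pyrocufflink.blue"}
    > 7 * 86400
  for: 1h
```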
The `vmagent` needs a place to spool data it has not yet sent to
Victoria Metrics, but it doesn't really need to be persistent. As long
as all of the `vmagent` nodes _and_ all of the `vminsert` nodes do not
go down simultaneously, there shouldn't be any data loss. If they are
all down at the same time, there's probably something else going on and
lost metrics are the least concerning problem.
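A sketch of backing the spool with ephemeral node-local storage; the
mount path just needs to match vmagent's `-remoteWrite.tmpDataPath`
flag, and the size limit is an arbitrary choice:

```yaml
containers:
  - name: vmagent
    args:
      - -remoteWrite.tmpDataPath=/var/lib/vmagent
    volumeMounts:
      - name: spool
        mountPath: /var/lib/vmagent
volumes:
  - name: spool
    emptyDir:
      sizeLimit: 2Gi   # bounds how much unsent data can accumulate
```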
The _dynk8s-provisioner_ only needs writable storage to store copies of
the AWS SNS notifications it receives for debugging purposes. We don't
need to keep these around indefinitely, so using ephemeral node-local
storage is sufficient. I actually want to get rid of that "feature"
anyway...
Although Firefly III works on a Raspberry Pi, a few things are pretty
slow. Notably, the search feature takes a really long time to return
any results, which is particularly annoying when trying to add a receipt
via the Receipts app. Adding a node affinity rule to prefer running on
an x86_64 machine will ensure that it runs fast whenever possible, but
can fall back to running on a Raspberry Pi if necessary.
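The preference looks like this in the pod spec (`kubernetes.io/arch:
amd64` is the standard label for x86_64 nodes):

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
                - amd64
```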
The "cron" container has not been working correctly for some time. No
background tasks are getting run, and this error is printed in the log
every minute:
> `Target class [db.schema] does not exist`
It turns out, this is because of the way the PHP `artisan` tool works.
It MUST be able to write to the code directory, apparently to build some
kind of cache. There may be a way to cache the data ahead of time, but
I haven't found it yet. For now, it seems the only way to make
Laravel-based applications run in a container is to make the container
filesystem mutable.
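Concretely, that means the container can't use a read-only root
filesystem; a sketch (container name assumed):

```yaml
containers:
  - name: cron
    securityContext:
      # artisan writes its compiled caches into the code directory at
      # runtime, so the root filesystem has to stay writable
      readOnlyRootFilesystem: false
```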
Music Assistant doesn't expose any metrics natively. Since we really
only care about whether or not it's accessible, scraping it with the
blackbox exporter is fine.
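A sketch of the corresponding blackbox-exporter probe job; the target
URL, module, and exporter address are assumptions:

```yaml
- job_name: music-assistant
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
    - targets:
        - http://music-assistant:8095
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox-exporter:9115
```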