[ARA Records Ansible][0] is a results storage system for Ansible. It
provides a convenient UI for tracking Ansible playbooks and tasks. The
data are populated by an Ansible callback plugin.
ARA is a fairly simple Python+Django application. It needs a database
to store Ansible results, so we've connected it to the main PostgreSQL
database and configured it to connect and authenticate using mTLS.
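ARA is configured through environment variables in practice, but the
resulting Django database settings look roughly like this sketch (the
host name and certificate paths are assumptions, not the actual
values):

```python
# Hypothetical Django DATABASES settings for ARA connecting to
# PostgreSQL over mTLS.  Host name and file paths are illustrative only.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "ara",
        "HOST": "postgres.example.net",  # assumed host name
        "PORT": 5432,
        "OPTIONS": {
            "sslmode": "verify-full",       # verify the server certificate
            "sslcert": "/etc/ara/tls.crt",  # client certificate (mTLS)
            "sslkey": "/etc/ara/tls.key",
            "sslrootcert": "/etc/ara/ca.crt",
        },
    },
}
```

With `sslmode=verify-full`, libpq validates the server's certificate
against the CA, and the client certificate takes the place of a
password for authentication.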
Rather than mess with managing and distributing a static password for
ARA clients, I've configured Authelia to allow anonymous POST requests
to the ARA API from within the private network or the Kubernetes
cluster. Access to the web UI still requires authentication.
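A sketch of what that might look like in Authelia's access control
rules (the domain and network ranges here are assumptions):

```yaml
access_control:
  rules:
    # Anonymous POSTs to the ARA API from inside the network/cluster
    - domain: ara.pyrocufflink.blue      # assumed host name
      resources:
        - "^/api/.*$"
      methods: ["POST"]
      networks:
        - 172.30.0.0/24                  # assumed private network
        - 10.149.0.0/16                  # assumed pod network
      policy: bypass
    # Everything else, including the web UI, requires login
    - domain: ara.pyrocufflink.blue
      policy: one_factor
```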
[0]: https://ara.recordsansible.org/
At some point this week, the front porch camera stopped sending video.
I'm not sure exactly what happened to it, but Frigate kept logging
"Unable to read frames from ffmpeg process." I power-cycled the camera,
which resolved the issue.
Unfortunately, no alerts were generated about this situation. Home
Assistant did not consider the camera entity unavailable, presumably
because Frigate was still reporting stats about it. Thus, I missed
several important notifications. To avoid this in the future, I have
enabled the "Camera FPS" sensors for all of the cameras in Home
Assistant and added an alert that triggers when the reported framerate
is 0.
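As a sketch, the alert could be a Home Assistant automation along these
lines (the entity id, grace period, and notify service are
assumptions):

```yaml
automation:
  - alias: "Camera stopped sending video"
    trigger:
      - platform: numeric_state
        entity_id: sensor.front_porch_camera_fps  # assumed entity id
        below: 1          # catches a reported framerate of 0
        for: "00:05:00"   # ignore momentary dips
    action:
      - service: notify.notify
        data:
          message: "Front porch camera framerate is 0; it may need a power cycle."
```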
I also really need to configure alerts for log events, as those would
have indicated there was an issue as well.
Zigbee2MQTT needs to be able to read and write to the serial device for
the ConBee II USB controller. I'm not exactly sure what changed, or how
it was able to access it before the recent update.
The _dialout_ group has GID 18 on Fedora.
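If Zigbee2MQTT runs in a container, one way to grant that access is to
add the group as a supplemental group in the pod spec; a sketch,
assuming the controller is passed through at its usual device path:

```yaml
spec:
  securityContext:
    supplementalGroups:
      - 18                  # dialout on Fedora, owns the serial device
  containers:
    - name: zigbee2mqtt
      volumeMounts:
        - name: conbee
          mountPath: /dev/ttyACM0   # assumed device path for the ConBee II
  volumes:
    - name: conbee
      hostPath:
        path: /dev/ttyACM0
        type: CharDevice
```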
Vaultwarden requires basically no configuration anymore. Older versions
needed some environment variables for configuring the WebSocket server,
but as of 1.31, WebSockets are handled by the same server as HTTP, so
even that is not necessary now. The only other potentially useful
option is `ADMIN_TOKEN`. For added security, we can leave it unset,
which disables the administration console; we can set it later if/when
we actually need that feature.
Migrating data from the old server was pretty simple. The database is
pretty small, and even the attachments and site icons don't take up much
space. All in all, there was only about 20 MB to move, so the copy only
took a few seconds.
Aside from moving the Vaultwarden server itself, we will also need to
adjust the HAProxy configuration to proxy requests to the Kubernetes
ingress controller.
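The HAProxy change might look roughly like this (the backend address
and host name are assumptions):

```
# Hypothetical: send Vaultwarden traffic to the Kubernetes ingress controller
frontend https
    use_backend k8s-ingress if { req.ssl_sni -i vaultwarden.pyrocufflink.blue }

backend k8s-ingress
    mode tcp
    server ingress0 172.30.0.169:443 check
```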
Jenkins jobs that build Gentoo-based systems, like Aimee OS, need a
persistent storage volume for the Gentoo ebuild repository. The Job
initially populates the repository using `emerge-webrsync`, and then the
CronJob keeps it up-to-date by running `emaint sync` daily.
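A sketch of such a CronJob, with the image, schedule, and volume names
as assumptions:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sync-gentoo-repo
spec:
  schedule: "0 3 * * *"               # daily; time assumed
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: sync
              image: gentoo/stage3    # assumed image
              command: ["emaint", "sync", "--auto"]
              volumeMounts:
                - name: gentoo-repo
                  mountPath: /var/db/repos/gentoo
          volumes:
            - name: gentoo-repo
              persistentVolumeClaim:
                claimName: gentoo-repo   # assumed PVC name
```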
In addition to the Portage repository, we also need a volume to store
built binary packages. Jenkins job pods can mount this volume to make
binary packages they build available for subsequent runs.
Both of these volumes are also exposed outside the cluster via `rsync`
in daemon mode, which is useful for e.g. local builds.
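A sketch of the corresponding `rsyncd.conf`, with paths and module
names as assumptions:

```
# Hypothetical rsyncd.conf exposing the two volumes read-only
[gentoo-repo]
    path = /var/db/repos/gentoo
    read only = yes

[binpkgs]
    path = /var/cache/binpkgs
    read only = yes
```

A local build could then fetch packages with e.g.
`rsync -a rsync://builder.pyrocufflink.blue/binpkgs/ /var/cache/binpkgs/`
(host name assumed).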
The Raspberry Pi in the kitchen now has Firefox installed so we can use
it to control Home Assistant. By listing its IP address as a trusted
network, and assigning it a trusted user, it can access the Home
Assistant UI without anyone having to type a password. This is
particularly important since there's no keyboard (not even an on-screen
virtual one).
Moving the `trusted_networks` auth provider _before_ the `homeassistant`
provider changes the login screen to show a "log in as ..." dialog by
default on trusted devices. It does not affect other devices at all,
but it does make the initial login a bit easier on kiosks.
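In `configuration.yaml`, the ordering looks like this sketch (the
network and user mapping are placeholders):

```yaml
homeassistant:
  auth_providers:
    # Listed first, so trusted devices get the "log in as ..." prompt
    - type: trusted_networks
      trusted_networks:
        - 172.30.0.43/32          # kitchen kiosk address (assumed)
      trusted_users:
        172.30.0.43: USER_ID      # placeholder for the kiosk user's id
    # Password login remains the fallback for everything else
    - type: homeassistant
```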
We don't need a notification about paperless not scheduling email tasks
every time there is a gap in the metric. This can happen in some
innocuous situations like when the pod restarts or if there is a brief
disruption of service. Using the `absent_over_time` function with a
range vector, we can have the alert fire only if there have been no
email tasks scheduled within the last 12 hours.
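A sketch of the Prometheus alert rule; the metric name here is an
assumption:

```yaml
- alert: PaperlessEmailTasksNotScheduled
  # absent_over_time returns 1 only if the metric had no samples at all
  # in the 12-hour window, so brief gaps no longer fire the alert
  expr: absent_over_time(paperless_email_tasks_scheduled_total[12h])
  labels:
    severity: warning
```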
It turns out this alert is not very useful, and indeed quite annoying.
Many servers can go for days or even weeks with no changes, which is
completely normal.
Since transitioning to externalIPs for TCP services, it is no longer
possible to use the HTTP-01 ACME challenge to issue certificates for
services hosted in the cluster, because the ingress controller does not
listen on those addresses. Thus, we have to switch to the DNS-01
challenge. I had avoided it before because of the complexity of
managing dynamic DNS records with the Samba AD server, but this turned
out to be pretty easy to work around. I created a new DNS zone on the
firewall specifically for ACME challenges. Names in the AD-managed zone
have CNAME records for their corresponding *_acme-challenge* labels
pointing to this new zone. The new zone has dynamic updates enabled,
which _cert-manager_ supports using the RFC2136 plugin.
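Concretely, each name gets a CNAME like
`_acme-challenge.rabbitmq.pyrocufflink.blue` →
`rabbitmq.acme.pyrocufflink.blue` (the ACME zone name is an
assumption), and the cert-manager side is an RFC2136 solver, sketched
here with assumed server address and key names:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-dns01-key
    solvers:
      - dns01:
          rfc2136:
            nameserver: 172.30.0.1:53     # firewall address assumed
            tsigKeyName: acme-update      # assumed TSIG key name
            tsigAlgorithm: HMACSHA256
            tsigSecretSecretRef:
              name: tsig-acme-update
              key: key
```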
For now, this is only enabled for _rabbitmq.pyrocufflink.blue_. I will
transition the other names soon.
Since the IP address assigned to the ingress controller is now managed
by keepalived and known to Kubernetes, the network policy needs to allow
access to it by pod namespace rather than IP address. It seems that
namespace-based rules take precedence over IP-based ones: even though
the IP address was explicitly allowed, traffic was not permitted
because it was destined for a Kubernetes Service that was not allowed.
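A sketch of the namespace-based rule (the namespace name and selector
labels are assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-ingress
spec:
  podSelector: {}          # applies to all pods in this namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx  # assumed namespace
```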