I was doing this to monitor Jenkins's certificate, but since that's
managed by _cert-manager_, there's practically no risk of it expiring
without warning anymore. Since Jenkins is already being scraped
directly, the extra check just generates additional notifications when
there is an issue, without adding any real value.
Using domain names in the "blackbox" probe makes it difficult to tell
the difference between a complete Internet outage and DNS issues. I
switched to using these names when I changed how the firewall routed
traffic to the public DNS servers, since those were the IP addresses
I was using to determine if the Internet was "up." I think it makes
sense, though, to just ping the upstream gateway for that check. If
EverFast changes their routing or numbering, we'll just have to update
our checks to match.
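A minimal sketch of what that check might look like with the Blackbox
exporter's ICMP prober; the gateway IP and exporter address below are
placeholders, not the real values:

```yaml
# blackbox.yml: ICMP probe module
modules:
  icmp:
    prober: icmp
    timeout: 5s

# prometheus.yml scrape config
scrape_configs:
  - job_name: blackbox-icmp
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['203.0.113.1']  # placeholder for the upstream gateway
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115  # exporter address is an assumption
```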
The alerts for Z-Wave device batteries in particular are pretty
annoying, as they tend to "flap" for some reason. I like having the
alerts show up on Alertmanager/Grafana dashboards, but I don't
necessarily need notifications about them. Fortunately, we can create a
special "none" receiver and route notifications there, which does
exactly what we want here.
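A sketch of the Alertmanager routing this describes; the alert name and
default receiver are assumptions:

```yaml
route:
  receiver: ntfy  # default receiver (assumed)
  routes:
    # Send battery alerts nowhere; they still show up in the
    # Alertmanager/Grafana UIs, just without notifications.
    - matchers:
        - alertname = "ZWaveBatteryLow"  # hypothetical alert name
      receiver: none
receivers:
  - name: ntfy
  - name: none  # no notification integrations attached
```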
Using Kustomize, we can define the configuration file separately from
the Kubernetes resources, and use a `configMapGenerator` to generate the
ConfigMap for it. Additionally, this will make it possible to update
_ntfy_ using `updatebot`.
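For example, a minimal `kustomization.yaml` along these lines (the file
names are assumptions):

```yaml
resources:
  - deployment.yaml
configMapGenerator:
  - name: ntfy-config
    files:
      - server.yml  # ntfy configuration, kept alongside the manifests
```

Since generated ConfigMaps get a content-hash suffix, editing
`server.yml` also triggers a rollout of anything that mounts it.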
Tabitha wants to be able to accept Apple Pay payments via Stripe, but
this requires an additional "domain verification" step. Apple needs to
make an HTTP request to the domain owned by the vendor, which in the
case of Invoice Ninja, must be the "app URL." Unfortunately, there
does not appear to be a way to tell Apple/Stripe/IN to use the client
portal domain or any other domain besides the app URL. Therefore, we
need to expose Invoice Ninja to the Internet under the public
_pyrocufflink.net_ domain, rather than the internal _pyrocufflink.blue_.
Let's run `updatebot` on Saturday morning, so I can apply the changes
over the weekend if I have time. If I don't, there's no harm in having
the PRs open for a few days until I can get to it during the week.
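Concretely, the CronJob schedule might look like this sketch; the exact
hour and image reference are arbitrary choices:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: updatebot
spec:
  schedule: "0 6 * * 6"  # 06:00 every Saturday
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: updatebot
              image: updatebot  # placeholder image reference
```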
Restic backups are now stored in MinIO on _chromie.pyrocufflink.blue_.
All data have been migrated from _burp1.p.b_, which is being
decommissioned.
The instance of MinIO on _chromie_ uses a certificate signed by DCH CA,
rather than the _pyrocufflink.blue_ wildcard certificate signed by
ZeroSSL. As such, we need to configure `restic` to trust the DCH Root
CA certificate in order to use the MinIO S3 API.
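Restic can be pointed at an alternate CA bundle with its `--cacert`
flag or the `RESTIC_CACERT` environment variable. A hedged sketch,
assuming Ansible-style variables; the names and paths are assumptions:

```yaml
# Hypothetical Ansible vars for the restic backup role
restic_repository: s3:https://chromie.pyrocufflink.blue/restic
restic_environment:
  RESTIC_CACERT: /etc/pki/tls/certs/dch-root-ca.pem  # DCH Root CA
```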
The latest version of `updatebot` has two major changes:
1. Projects can encompass multiple images, eliminating the need for
multiple configuration files and CronJobs. Projects are now defined
in a YAML document, since the data structure is deeply nested and
cumbersome to express in TOML.
2. Pull requests can now include a diff of the resources that will
change if the PR is merged. This requires the `kubectl` and `diff`
programs (which are not currently included in the _updatebot_
container image, so we bind-mount them from the host) and permission
to compare the local manifests using the Kubernetes API. Oddly,
computing the diff requires permission to use the PATCH method, even
though the client is not requesting any changes. This is apparently
a long-standing bug ([issue #981][0]) that may or may not ever be
fixed; the extra permission is sketched below.
[0]: https://github.com/kubernetes/kubectl/issues/981
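The RBAC workaround mentioned in item 2 might look like this sketch;
the role name and broad scope are assumptions:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: updatebot-diff
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    # `kubectl diff` performs a server-side dry-run, which requires the
    # `patch` verb even though nothing is actually persisted.
    verbs: [get, list, patch]
```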
`updatebot` is a script I wrote that automatically opens Gitea Pull
Requests to update container image references in Kubernetes resource
manifests. It checks GitHub or Docker Hub for the latest release and
updates manifests or Kustomization configuration files to point to the
current version. It then commits the changes and opens a pull request
in Gitea. When combined with ArgoCD automatic synchronization, this
makes updating Kubernetes-deployed applications as simple as clicking
the merge button in the Gitea PR.
To start with, we'll automate Home Assistant upgrades this way.
This template sensor will be migrated to a helper, since Home Assistant
removed the `forecast` attribute of weather sensors and now requires
calling an action (service) to get those data.
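The replacement looks roughly like this trigger-based template entity
calling the `weather.get_forecasts` action; the entity IDs and refresh
interval are assumptions:

```yaml
template:
  - trigger:
      - platform: time_pattern
        hours: /1  # refresh the forecast hourly
    action:
      - service: weather.get_forecasts
        data:
          type: daily
        target:
          entity_id: weather.home  # hypothetical weather entity
        response_variable: forecast
    sensor:
      - name: Forecast High
        unit_of_measurement: °F
        state: >-
          {{ forecast['weather.home'].forecast[0].temperature }}
```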
Now that the reverse proxy that handles requests from the Internet uses
TLS pass-through, the Ingress for _ntfy_ needs to recognize both the
internal and external name.
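That amounts to listing both hostnames in the Ingress rules; the
external name and backend details in this sketch are assumptions:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ntfy
spec:
  rules:
    - host: ntfy.pyrocufflink.blue  # internal name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ntfy
                port:
                  number: 80
    - host: ntfy.pyrocufflink.net  # external name (assumed)
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ntfy
                port:
                  number: 80
```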
Now that the reverse proxy for Internet-facing sites uses TLS
passthrough, the certificate for the _darkchestofwonders.us_ Ingress
needs to be correct. Since Ingress resources can only use either the
default certificate (_*.pyrocufflink.blue_) or a certificate from their
same namespace, we have to move the Certificate and its corresponding
Secret into the _websites_ namespace. Fortunately, this is easy enough
to do, by setting the appropriate annotations on the Ingress.
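The relevant excerpt of the Ingress might look something like this; the
issuer name is an assumption:

```yaml
metadata:
  annotations:
    # Ask cert-manager's ingress-shim to manage a Certificate in this
    # (websites) namespace; the issuer name is hypothetical.
    cert-manager.io/cluster-issuer: zerossl
spec:
  tls:
    - hosts:
        - darkchestofwonders.us
      secretName: dcow-cert
```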
To keep the existing certificate (until it expires), I moved the Secret
manually:
```sh
kubectl get secret dcow-cert -o yaml | grep -v namespace | kubectl create -n websites -f -
```
The VM hosts are now managed by the "main" Ansible inventory and thus
appear in the host list ConfigMap. As such, they do not need to be
listed explicitly in the static targets list.
There's obviously a bug or something in `mqttmarionette` because it
occasionally gets "stuck" in a state where it is running but does
not reconnect to the MQTT broker. In such situations, it has to be
restarted (and even then it doesn't shut down correctly but has to
be killed with SIGKILL, usually). I have been doing this manually, but
with this shell script and a corresponding "shell command" integration
in Home Assistant, it can be done automatically. This is similar to
how Home Assistant restarts Mopidy on the living room stereo when it
gets into the same kind of state.
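On the Home Assistant side, the `shell_command` integration is just a
mapping of names to commands; a sketch, with the user, host, and script
path all hypothetical:

```yaml
shell_command:
  # SSH to the host running mqttmarionette and force-restart it
  restart_mqttmarionette: >-
    ssh -o BatchMode=yes marionette@host.pyrocufflink.blue
    sudo /usr/local/bin/restart-mqttmarionette
```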
Some machines have the same volume mounted multiple times (e.g.
container hosts, BURP). Alerts will fire for all of these
simultaneously when the filesystem usage passes the threshold. To avoid
getting spammed with a bunch of messages about the same filesystem,
we'll group alerts from the same machine.
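In Alertmanager terms, this is a `group_by` on the route; a minimal
sketch:

```yaml
route:
  # One notification per alert name per machine, no matter how many
  # mount points the underlying filesystem appears at.
  group_by: [alertname, instance]
```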
I'm not using Matrix for anything anymore, and it seems to have gone
offline. I haven't fully decommissioned it yet, but the Blackbox scrape
is failing, so I'll just disable that bit for now.
This machine never worked correctly; the USB-RS232 adapters would stop
working randomly (and of course it would be whenever I needed to
actually use them). I thought it was something wrong with the server
itself (a Raspberry Pi 3), but the same thing happened when I tried
using a Pi 4.
The new backup server has a plethora of on-board RS-232 ports, so I'm
going to use it as the serial console server, too.
I've rebuilt the Unifi Network controller machine (again);
*unifi3.pyrocufflink.blue* has replaced *unifi2.p.b*. The
`unifi_exporter` no longer works with the latest version of Unifi
Network, so it's not deployed on the new machine.
Zigbee2MQTT commits the cardinal sin of storing state in its
configuration file. This means the file has to be writable and thus
stored in persistent storage rather than in a ConfigMap. As a
consequence, making changes to the configuration when the application is
not running is rather difficult. Case in point: when I added the
internal alias for _mqtt.pyrocufflink.blue_ pointing to the in-cluster
service, Zigbee2MQTT became unable to connect to the broker because it
was using the node port instead of the internal port. Since it could
not connect to the broker, it refused to start, and thus the container
would not stay running long enough to fix the configuration to point
to the correct port.
Fortunately, Zigbee2MQTT also allows configuring settings via
environment variables, which can be managed with a ConfigMap. Values
read from the environment override those from the configuration file,
so pointing at the correct broker port with an environment variable was
sufficient to allow the application to start.
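A sketch of that ConfigMap, using Zigbee2MQTT's documented
`ZIGBEE2MQTT_CONFIG_*` override convention; the ConfigMap name and
broker port are assumptions:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: zigbee2mqtt-env  # name is an assumption
data:
  # Variables named ZIGBEE2MQTT_CONFIG_<SECTION>_<KEY> override the
  # corresponding keys in configuration.yaml.
  ZIGBEE2MQTT_CONFIG_MQTT_SERVER: mqtt://mqtt.pyrocufflink.blue:1883
```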