The *netboot/jenkins-agent* Ansible role configures three NBD exports:
* A single, shared, read-only export containing the Jenkins agent root
filesystem as a SquashFS image
* For each defined agent host, a writable data volume for Jenkins
workspaces
* For each defined agent host, a writable data volume for Docker
Each agent host needs a unique identifier to associate it with its
persistent data volumes. Raspberry Pi devices, for example, can use the
SoC serial number.
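A hypothetical `host_vars` entry for a Raspberry Pi agent might look
like the following; the variable name is an assumption, but something
along these lines would let the role derive per-host export names for
the workspace and Docker volumes:

```yaml
# Hypothetical variable; the value is the SoC serial number of the
# Raspberry Pi agent host
jenkins_agent_id: '10000000a1b2c3d4'
```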
The `-external.url` and `-external.alert.source` command line arguments
and their corresponding environment variables can be used to configure
the "Source" links associated with alerts created by `vmalert`.
The firewall hardware is too slow to run the *prometheus_speedtest*
program; it consistently reported *way* lower speeds than were actually
available. I've moved the service to the Kubernetes cluster, where it
works much better.
*mtrcs0.pyrocufflink.red* is a Raspberry Pi CM4 on a Waveshare
CM4-IO-BASE-B carrier board with an NVMe SSD. It runs a custom OS built
using Buildroot, and is not a member of the *pyrocufflink.blue* AD
domain.
*mtrcs0.p.r* hosts VictoriaMetrics, `vmagent`, `vmalert`, AlertManager,
and Grafana. I've created a dedicated group and playbook for it,
*metricspi*, to manage all of these applications together.
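A rough sketch of what the *metricspi* playbook might look like; the
role names here are assumptions, not taken from the actual repository:

```yaml
# metricspi.yml (hypothetical role names)
- hosts: metricspi
  roles:
    - victoria-metrics
    - vmagent
    - vmalert
    - alertmanager
    - grafana
```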
In addition to ignoring particular types of filesystems, e.g. OverlayFS,
we can also ignore filesystems by their mount point. This could be
useful, for example, for bind-mounted directories, such as those used on
Kubernetes nodes.
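As a sketch, assuming the filesystem metrics come from collectd's *df*
plugin and the role exposes list variables like these (both names are
hypothetical):

```yaml
# Hypothetical role variables for excluding filesystems from metrics
collectd_df_ignore_fstypes:
  - overlay
collectd_df_ignore_mountpoints:
  - /var/lib/kubelet/pods
```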
There is no specific playbook or role for Kubernetes. All OS
configuration is done at install time via kickstart scripts, and
deploying Kubernetes itself is done (manually) using `kubeadm init` and
`kubeadm join`.
Adding a second camera to the back yard, on the north side of the porch,
to try to figure out how the possums keep getting under the porch even
with the chicken wire around it!
We're trying to discover how the possums are getting into and out of the
house. Let's enable continuous video recording from the back yard
camera so we can observe them and come up with a plan to get rid of
them.
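A minimal sketch of the Frigate settings that would enable continuous
recording for this camera; the retention values are assumptions, and the
exact option names depend on the Frigate version:

```yaml
cameras:
  back_yard:
    record:
      enabled: True
      retain:
        days: 3       # assumed retention period
        mode: all     # keep all segments, not just motion/events
```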
To handle the RSVP form on *dustinandtabitha.com*, we are going to use
*formsubmit*. It runs on the same machine that hosts the website, so
there is no need to deal with CORS. The form posts to the */submit/rsvp*
path, which is proxied to the backend.
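A sketch of the proxy configuration, assuming the site is served by
Apache httpd and the formsubmit backend listens on localhost (the port
is an assumption):

```yaml
- name: proxy RSVP submissions to formsubmit
  blockinfile:
    path: /etc/httpd/conf.d/dustinandtabitha.com.conf
    marker: '# {mark} formsubmit proxy'
    block: |
      ProxyPass /submit/rsvp http://127.0.0.1:8080/submit/rsvp
      ProxyPassReverse /submit/rsvp http://127.0.0.1:8080/submit/rsvp
```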
The state history database is entirely too big. It takes over an hour
to create a backup of it, which usually causes BURP to time out. The
data it stores isn't particularly interesting anyway. Instead of trying
to back it up and ultimately not getting any backup at all, we'll just
skip it altogether to ensure we have a consistent backup of everything
else that is actually important.
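A sketch of the exclusion, assuming the state history database is Home
Assistant's recorder database and the BURP client role takes a list of
excluded paths (the variable name and path are assumptions):

```yaml
# Hypothetical variable and path
burp_excludes:
  - /var/lib/hass/home-assistant_v2.db
```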
If Nextcloud does not have the Internet-facing reverse proxy listed in
its "trusted proxies" setting, it treats all requests as originating
from the proxy itself. This breaks brute force detection and similar
features.
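A sketch of how the proxy could be added with the `occ` CLI; the
installation path, web server user, and proxy address are assumptions:

```yaml
- name: add the Internet-facing reverse proxy to trusted_proxies
  command: php occ config:system:set trusted_proxies 0 --value=172.30.0.1
  args:
    chdir: /var/www/nextcloud   # assumed installation path
  become: true
  become_user: apache           # assumed web server user
```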
Transitioning from push-based to pull-based monitoring with
Prometheus/collectd. The *write_prometheus* plugin will be installed on
all hosts, and Prometheus will be configured to scrape them directly.
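A minimal sketch of the corresponding Prometheus scrape configuration;
the *write_prometheus* plugin listens on port 9103 by default, and the
target names here are placeholders:

```yaml
scrape_configs:
  - job_name: collectd
    static_configs:
      - targets:
          - host0.pyrocufflink.blue:9103
          - host1.pyrocufflink.blue:9103
```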
Having this option enabled dramatically improves the reliability of
collectd multicast traffic from physical machines and from VMs running
on a different VM host than the receiving machine.
1. Set a password for *root* on all machines (useful for logging in via
the serial console if the network is down)
2. Set an authorized SSH key for root on all machines:
* For Fedora 34, use my FIDO2 security token key
* For all other hosts, use my ED25519 key
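A rough sketch of tasks implementing the steps above; the variable names
are assumptions, and the password must be a pre-hashed (crypted) value,
since that is what the `user` module expects:

```yaml
- name: set root password
  user:
    name: root
    password: '{{ root_password_hash }}'

- name: set authorized SSH key for root
  authorized_key:
    user: root
    key: >-
      {{ root_fido2_ssh_key
         if ansible_distribution == 'Fedora'
            and ansible_distribution_major_version == '34'
         else root_ed25519_ssh_key }}
```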
Filesystems like NFS and CIFS require "helper" utilities (i.e.
`mount.nfs` and `mount.cifs`, respectively), which must be installed for
a system to be able to mount those filesystems.
The current shared storage system uses NFSv4, and as such, the
*nfs-utils* package needs to be installed on the VM hosts.
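A minimal sketch of the package installation for the VM hosts (where
this task lives in the role structure is an assumption):

```yaml
- name: install NFS client utilities
  package:
    name: nfs-utils
    state: present
```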
Originally, the network configuration for the VM networks and the
storage network was managed by the *netifaces* role. This has
effectively stopped working in recent versions of Fedora, as it relied
on `dhcpcd`, which has not been maintained in Fedora for some time and
no longer behaves correctly. After evaluating *NetworkManager* as a
replacement, I decided that *systemd-networkd* is a more appropriate
solution.
There are effectively two "layers" of network configuration needed for
the VM hosts: the host-specific settings, and the common settings. The
host-specific settings include such properties as the IP address of the
management interface and the names of the physical ports that make up
the bonded interfaces. The common settings are the bonded interfaces,
the VLAN interfaces created on top of the bond, and the bridges that
provide access to VMs.
To configure the host-specific settings, each host simply needs the
appropriate `networkd_*` variables in its `host_vars` file. For the
common settings, we apply the *systemd-networkd* role again in the
`vmhost.yml` playbook with different values for these variables. Thus,
effectively, `systemd-networkd.yml` manages the host-specific settings,
while `vmhost.yml` manages the common settings.
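As an illustration of the host-specific layer, a `host_vars` file might
contain entries like the following; both the variable names and values
are hypothetical, since the real structure is defined by the role:

```yaml
# host_vars/vmhost0.pyrocufflink.blue.yml (hypothetical)
networkd_mgmt_address: 172.30.0.10/26   # management interface address
networkd_bond_ports:                    # physical ports in the bond
  - enp1s0f0
  - enp1s0f1
```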
I couldn't get RTMP to work on the Back Yard camera because the `ffmpeg`
process kept crashing:
```
ffmpeg.back_yard.clips_rtmp ERROR : av_interleaved_write_frame(): Connection reset by peer
ffmpeg.back_yard.clips_rtmp ERROR : [flv @ 0x5562090c8ec0] Failed to update header with correct duration.
ffmpeg.back_yard.clips_rtmp ERROR : [flv @ 0x5562090c8ec0] Failed to update header with correct filesize.
ffmpeg.back_yard.clips_rtmp ERROR : Error writing trailer of rtmp://127.0.0.1/live/back_yard: Connection reset by peer
watchdog.back_yard INFO : Terminating the existing ffmpeg process...
watchdog.back_yard INFO : Waiting for ffmpeg to exit gracefully...
```
I thought increasing the value of the `--shm-size` argument for
`podman` would help, but even going as high as 1024 mebibytes did not
resolve the problem.
Ultimately, I decided that it is not really necessary to view the full
4k stream in real time. The back yard camera supports three streams, so
I set them all up for different roles. I briefly considered using a
single 1080p stream for both object detection and RTMP streaming, but
this consumed considerable CPU time, so I decided against it for now. I
may re-evaluate that option if I decide to purchase a TPU.
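A sketch of the resulting Frigate camera configuration; the stream URLs
are placeholders, and the exact role names vary between Frigate
versions:

```yaml
cameras:
  back_yard:
    ffmpeg:
      inputs:
        - path: rtsp://back-yard-camera/main   # full-resolution 4K stream
          roles:
            - record
        - path: rtsp://back-yard-camera/sub1   # 1080p stream for RTMP
          roles:
            - rtmp
        - path: rtsp://back-yard-camera/sub2   # low-resolution stream
          roles:
            - detect
```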
*nvr0.pyrocufflink.blue* hosts Frigate. It is deployed on a separate
subnet for two reasons:
* To avoid streaming video from the cameras through the firewall
* To prevent any hosts on the LAN except Home Assistant from
communicating with Frigate, since it does not have any kind of
authentication or access control
For hosts that cannot send metrics via multicast (e.g. because they are
on a different subnet), *collectd* needs to listen on the all-hosts
unicast address.
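A sketch, assuming the collectd role exposes a listen-address variable
(the name is hypothetical); 25826 is the default port for collectd's
*network* plugin, and listening on 0.0.0.0 accepts unicast traffic on
any local address:

```yaml
collectd_network_listen:
  - address: 0.0.0.0
    port: 25826
```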
ZwaveJS2Mqtt includes a very powerful web-based UI for configuring and
controlling the Z-Wave network. This functionality is no longer
available within Home Assistant itself, so being able to access the
ZwaveJS2Mqtt UI is crucial to operating the network.
I wanted to make the UI available at */zwave/*, which requires using
*mod_rewrite* to conditionally proxy requests based on the `Connection`
HTTP header, since the UI passes both HTTP and WebSocket requests to the
same paths. *mod_rewrite* configuration is not inherited from the main
server configuration to virtual hosts, so the
`RewriteRule`/`RewriteCond` directives have to be specified within the
`<VirtualHost>` block. This means that the Home Assistant proxy
configuration has to be within its own virtual host, and the
ZwaveJS2Mqtt configuration has to be there as well.
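A sketch of the relevant directives inside the `<VirtualHost>` block,
expressed here as an Ansible *blockinfile* task; the configuration file
path and the ZwaveJS2Mqtt backend address are assumptions (8091 is its
default port):

```yaml
- name: proxy the ZwaveJS2Mqtt UI at /zwave/
  blockinfile:
    path: /etc/httpd/conf.d/homeassistant.conf   # assumed vhost config
    marker: '# {mark} zwavejs2mqtt proxy'
    block: |
      RewriteEngine On
      # Proxy WebSocket upgrade requests to the backend as ws://
      RewriteCond %{HTTP:Connection} Upgrade [NC]
      RewriteCond %{HTTP:Upgrade} websocket [NC]
      RewriteRule ^/zwave/(.*)$ ws://127.0.0.1:8091/$1 [P,L]
      # Plain HTTP requests are proxied normally
      ProxyPass /zwave/ http://127.0.0.1:8091/
      ProxyPassReverse /zwave/ http://127.0.0.1:8091/
```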