Transitioning from push-based to pull-based monitoring with
Prometheus/collectd. The *write_prometheus* plugin will be installed on
all hosts, and Prometheus will be configured to scrape them directly.
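A task along these lines is enough to expose metrics for scraping; the drop-in path, port, and handler name here are assumptions rather than the role's actual contents:

    - name: configure the collectd write_prometheus plugin
      copy:
        dest: /etc/collectd.d/write_prometheus.conf
        content: |
          LoadPlugin write_prometheus
          <Plugin write_prometheus>
            # default scrape port for the plugin
            Port "9103"
          </Plugin>
      notify: restart collectd

Prometheus then scrapes each host on that port directly.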
Fedora 34 does not include the *ntp* package, as it has been "obsoleted
by ntpsec." Until I can create a role for *ntpsec*,
*dc2.pyrocufflink.blue* cannot be an NTP server.
*hass1.pyrocufflink.blue* and *hassdb0.pyrocufflink.blue* were part of
the old Home Assistant deployment. Everything has been migrated to
*hass2.pyrocufflink.blue*, so these machines can be decommissioned now.
*nvr0.pyrocufflink.blue* hosts Frigate. It is deployed on a separate
subnet, for two reasons:
* To avoid streaming video from the cameras through the firewall
* To prevent any hosts on the LAN except Home Assistant from
communicating with Frigate, since it does not have any kind of
authentication or access control
Frigate is an NVR that uses machine learning to detect objects on camera
in real time. It integrates with Home Assistant to expose sensors which
can be used for automation, etc.
The only official way to deploy Frigate is with a container, so we use
Podman and systemd to manage it.
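A minimal sketch of that wiring, with an illustrative image tag, host paths, and handler name (not necessarily what the role actually deploys):

    - name: install the Frigate service unit
      copy:
        dest: /etc/systemd/system/frigate.service
        content: |
          [Unit]
          Description=Frigate NVR
          Wants=network-online.target
          After=network-online.target

          [Service]
          ExecStartPre=-/usr/bin/podman rm -f frigate
          ExecStart=/usr/bin/podman run --rm --name frigate \
              --shm-size=256m \
              -v /etc/frigate/config.yml:/config/config.yml:ro \
              -v /var/lib/frigate:/media/frigate \
              -p 5000:5000 \
              ghcr.io/blakeblackshear/frigate:stable
          ExecStop=/usr/bin/podman stop -t 30 frigate
          Restart=always

          [Install]
          WantedBy=multi-user.target
      notify: reload systemd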
*hass2.pyrocufflink.blue* is a Raspberry Pi Compute Module 4-based
system, currently mounted in a WaveShare CM4 Mini Base Board (A). With
an NVMe SSD for primary storage, it runs significantly faster than a
standard Raspberry Pi 4, and blows the old Raspberry Pi 3-based Home
Assistant deployment out of the water. It has a Zooz 700 series Z-Wave
Plus S2 USB stick and a ConBee II Zigbee USB stick attached to its USB
2.0 ports. It runs a customized Fedora Minimal distribution.
Although configuration policy is not yet available for Prometheus
itself, the `collectd.yml` playbook also uses the *prometheus* host
group. Specifically, hosts in this group are configured to receive
collectd data from other hosts and expose those data through the
`write_prometheus` plugin.
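On those hosts, this amounts to adding a listener for the collectd network protocol alongside *write_prometheus*; roughly (the port is the collectd default, the path and handler are assumptions):

    - name: accept collectd data from other hosts
      copy:
        dest: /etc/collectd.d/network-listen.conf
        content: |
          LoadPlugin network
          <Plugin network>
            # default collectd network port
            Listen "0.0.0.0" "25826"
          </Plugin>
      notify: restart collectd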
This commit introduces the *grafana* role and the corresponding
`grafana.yml` playbook. The role installs Grafana using the system
package manager, and configures the server (including LDAP
authentication).
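The LDAP piece boils down to an `ldap.toml` pointing at one of the domain controllers; a sketch (the bind account, password variable, and handler name are hypothetical, and `[auth.ldap]` still has to be enabled in `grafana.ini`):

    - name: configure Grafana LDAP authentication
      copy:
        dest: /etc/grafana/ldap.toml
        owner: root
        group: grafana
        mode: '0640'
        content: |
          [[servers]]
          host = "dc2.pyrocufflink.blue"
          port = 636
          use_ssl = true
          bind_dn = "CN=grafana,CN=Users,DC=pyrocufflink,DC=blue"
          bind_password = "{{ grafana_ldap_password }}"
          search_filter = "(sAMAccountName=%s)"
          search_base_dns = ["DC=pyrocufflink,DC=blue"]
      notify: restart grafana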
The *synapse* role and the corresponding `synapse.yml` playbook deploy
Synapse, the reference Matrix homeserver implementation.
Deploying Synapse itself is fairly straightforward: it is packaged by
Fedora and therefore can simply be installed via `dnf` and started by
`systemd`. Making the service available on the Internet, however, is
more involved. Most of the Matrix protocol works over HTTPS on the
standard port (443), so a typical reverse proxy deployment covers the
bulk of it. Some parts of the protocol, however, communicate over an
alternate port (8448). This could be handled by a reverse proxy as
well, but since nothing else uses that port, it could also be handled
by NAT/port forwarding. In order to support both
deployment scenarios (as well as the hypothetical scenario wherein the
Synapse machine is directly accessible from the Internet), the *synapse*
role supports specifying an optional `matrix_tls_cert` variable. If
this variable is set, it should contain the path to a certificate file
on the Ansible control machine that will be used for the "direct"
connections (i.e. on port 8448). If it is not set, the default Apache
certificate will be used for both virtual hosts.
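As a sketch of the "direct" listener, assuming Synapse's default listener on port 8008 behind it (the certificate paths, server-name variable, and handler are illustrative, not the role's actual template):

    - name: configure the port 8448 virtual host
      copy:
        dest: /etc/httpd/conf.d/matrix-federation.conf
        content: |
          Listen 8448 https
          <VirtualHost *:8448>
            ServerName {{ synapse_server_name }}
            SSLEngine on
            SSLCertificateFile /etc/pki/tls/certs/matrix.crt
            SSLCertificateKeyFile /etc/pki/tls/private/matrix.key
            ProxyPass /_matrix http://127.0.0.1:8008/_matrix
            ProxyPassReverse /_matrix http://127.0.0.1:8008/_matrix
          </VirtualHost>
      notify: reload httpd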
Synapse has a pretty extensive configuration schema, but most of the
options are set to their default values by the *synapse* role. Other
than substituting secret keys, the only exposed configuration option is
the LDAP authentication provider.
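That provider ends up in `homeserver.yaml`; a sketch of the relevant fragment, with a domain controller and base DN that are assumptions here:

    password_providers:
      - module: ldap_auth_provider.LdapAuthProvider
        config:
          enabled: true
          uri: ldaps://dc2.pyrocufflink.blue:636
          base: CN=Users,DC=pyrocufflink,DC=blue
          attributes:
            uid: sAMAccountName
            mail: mail
            name: givenName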
I doubt I will be using Koji much, if at all, anymore. In preparation
for decommissioning it, I am moving the Koji inventory to
`hosts.offline` to prevent Jenkins jobs from failing.
The *motioneye* role installs motionEye on a Fedora machine using `pip`.
It also configures Apache as a reverse proxy for motionEye, providing
outside (HTTPS) access.
The official installation instructions and default configuration for
motionEye assume it will be running as root. There is, however, no
specific reason for this, as it works just fine as an unprivileged user.
The only minor surprise is that the directory named by the `conf_path`
setting must be writable, as this is where motionEye places the
generated configuration for `motion`. That directory does not, however,
have to contain the `motioneye.conf` file itself, which can still be
read-only.
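So the role just runs `meyectl` under a dedicated account via a systemd unit; a sketch (the user name, `meyectl` path, and handler are assumptions):

    - name: install the motionEye service unit
      copy:
        dest: /etc/systemd/system/motioneye.service
        content: |
          [Unit]
          Description=motionEye
          After=network.target

          [Service]
          # unprivileged account; conf_path points at a directory it can write
          User=motioneye
          Group=motioneye
          ExecStart=/usr/local/bin/meyectl startserver \
              -c /etc/motioneye/motioneye.conf
          Restart=on-failure

          [Install]
          WantedBy=multi-user.target
      notify: reload systemd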
This commit adds a new playbook, `protonvpn.yml`, and its supporting
roles *strongswan-swanctl* and *protonvpn*. This playbook configures
strongSwan to connect to ProtonVPN using IPsec/IKEv2.
With this playbook, we configure the name servers on the Pyrocufflink
network to route all DNS requests through the Cloudflare public DNS
recursive servers at 1.1.1.1/1.0.0.1 over ProtonVPN. Using this setup,
we have the benefit of the speed of using a public DNS server (which is
*significantly* faster than running our own recursive server, usually by
1-2 seconds per request), and the benefit of anonymity from ProtonVPN.
Using the public DNS server alone is great for performance, but allows
the server operator (in this case Cloudflare) to track and analyze usage
patterns. Using ProtonVPN gives us anonymity (assuming we trust
ProtonVPN not to do the very same tracking), but can have a negative
performance impact if it's used for all Internet traffic. By combining
these solutions, we can get the benefits of both!
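The strongSwan side of this is a swanctl connection whose child SA only covers the two resolver addresses, so nothing else is routed through the tunnel. A sketch, with the remote endpoint, credential variable, and handler as placeholders:

    - name: configure the ProtonVPN connection
      copy:
        dest: /etc/strongswan/swanctl/conf.d/protonvpn.conf
        content: |
          connections {
            protonvpn {
              version = 2
              # placeholder endpoint address
              remote_addrs = node-xx.protonvpn.net
              vips = 0.0.0.0
              local {
                auth = eap-mschapv2
                eap_id = "{{ protonvpn_username }}"
              }
              remote {
                auth = pubkey
                id = %any
              }
              children {
                dns {
                  # only the Cloudflare resolvers go through the tunnel
                  remote_ts = 1.1.1.1/32,1.0.0.1/32
                  start_action = start
                }
              }
            }
          }
      notify: reload swanctl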
Some hosts, such as the Raspberry Pis built using default Fedora images,
do not have proper filesystem separation, but use a single volume for
the entire filesystem. These hosts cannot have the root filesystem
mounted read-only, since all the writable data are also stored there.
When Jenkins runs configuration policy jobs, it always tries to remount
the root filesystem as read-only on every machine that it configured.
For these hosts with a single volume, this step fails, causing the job
to be marked as failed. To avoid this, I have added a new group,
*rw-root*; hosts in this group will be omitted from the final remount
step.
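The final play then simply excludes that group; roughly (the real step lives in the Jenkins-driven playbook):

    - hosts: 'all:!rw-root'
      become: true
      tasks:
        - name: remount the root filesystem read-only
          command: mount -o remount,ro /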
Normally, Home Assistant uses a SQLite database for storing state
history. On a Raspberry Pi with only an SD card for storage like
*hass1.pyrocufflink.blue*, this can become extremely slow, especially
for large data sets. To speed up features like history and logbook,
Home Assistant supports using an external database engine such as
PostgreSQL or MariaDB.
The *hassdb* role and the corresponding `hassdb.yml` playbook deploy a
PostgreSQL server for Home Assistant to use. The role only needs to
create the database role and the database itself, as Home Assistant
manages its own schema.
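With the PostgreSQL modules, that amounts to roughly the following (the role and database names, and the password variable, are illustrative):

    - name: create the Home Assistant database role
      postgresql_user:
        name: hass
        password: "{{ hassdb_password }}"
      become: true
      become_user: postgres

    - name: create the Home Assistant database
      postgresql_db:
        name: hass
        owner: hass
      become: true
      become_user: postgres

Home Assistant's recorder is then pointed at the new server with a `db_url` along the lines of `postgresql://hass:<password>@hassdb0.pyrocufflink.blue/hass`.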
*hass1.pyrocufflink.blue* is the new host for Home Assistant. I
migrated from using a virtual machine to using a Raspberry Pi to avoid
having to deal with USB passthrough for the Z-Wave USB stick.
*build1-aarch64* is a Raspberry Pi 3 B+ running Fedora aarch64. It is
intended to be used to build software and operating system images for
other aarch64 machines.
*rprx0.pyrocufflink.blue* is no longer in operation.
*web0.pyrocufflink.blue* handles incoming HTTP/HTTPS requests directly,
proxying to Bitwarden, OpenVPN, etc. as needed.
This commit updates the configuration for *pyrocufflink.net* to use the
wildcard certificate managed by *lego* instead of a unique certificate
managed by *certbot*.
The *nextcloud* role installs Nextcloud from the specified release
archive, downloading it to the control machine first if necessary, and
configures Apache and PHP-FPM to serve it.
The `nextcloud.yml` playbook uses the *cert* role to install the X.509
certificate for the Nextcloud server, sets up Apache HTTPD with the
*apache* role, and installs Nextcloud using the *nextcloud* role.
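In other words, the playbook is essentially just role composition; a sketch (the group name is an assumption):

    - hosts: nextcloud
      roles:
        - cert
        - apache
        - nextcloud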
The host *cloud0.pyrocufflink.blue* is the Nextcloud server for
Pyrocufflink.
*burp1.pyrocufflink.blue* will replace *burp0.pyrocufflink.blue* as the
BURP server for Pyrocufflink. It is a physical machine (Fitlet), making
it simpler to manage the USB drives. The old virtual machine will be
decommissioned soon.
Having an empty (therefore undefined) group as the child of another
group causes Ansible to emit a "warning" (really an error) indicating
that it cannot parse the inventory file:
    [WARNING]: * Failed to parse
    /var/lib/jenkins/workspace/CfgMgmt/pyrocufflink/hosts with ini plugin:
    /var/lib/jenkins/workspace/CfgMgmt/pyrocufflink/hosts:60: Section
    [smtp-relay:children] includes undefined group: zabbix-server
This commit configures *bw0.pyrocufflink.blue* as a BURP client, so that
the Bitwarden data can be backed up. A pre-backup script is used to
take a consistent snapshot of the SQLite database before copying it to
the BURP server.
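The script itself is simple; a sketch using `sqlite3`'s online backup command, with illustrative paths:

    - name: install the Bitwarden pre-backup script
      copy:
        dest: /usr/local/libexec/bitwarden-pre-backup
        mode: '0755'
        content: |
          #!/bin/sh
          # Snapshot the SQLite database so BURP copies a consistent file.
          # Paths are illustrative.
          set -e
          sqlite3 /var/lib/bitwarden/db.sqlite3 \
              ".backup '/var/lib/bitwarden/backup/db.sqlite3'"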
*cm0.pyrocufflink.blue* has been deprecated and shut down.
Configuration Management jobs now run on regular Jenkins nodes, and are
serialized using "lockable resources" instead of a single executor.
*dns1.pyrocufflink.blue* has been decommissioned. Having a second DNS
server never really worked correctly for some reason, and the
maintenance overhead of the Raspberry Pi is just not worth it right now.
The DHCP service has been moved to *dns0.pyrocufflink.blue*.
The point of the "wheel host" is to serve as a repository of Python
packages (wheels) built by Jenkins for consumption by `pip` et al. For
applications and libraries that do not provide all of their dependencies
as binary packages, this provides a convenient way to install them without
requiring all of the build tools and dependencies on the destination
machine.
The idea here is that a Jenkins job runs `pip wheel` for a distribution
package name or `requirements.txt` file and then uploads the resulting
wheel files using `rsync`. Apache is configured to serve the upload
directory with an index compatible with `pip`'s `--find-links`.
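On the Apache side, a plain mod_autoindex listing is all `--find-links` needs; a sketch with illustrative paths and handler:

    - name: serve the wheel directory
      copy:
        dest: /etc/httpd/conf.d/wheels.conf
        content: |
          Alias /wheels /srv/wheels
          <Directory /srv/wheels>
            Options +Indexes
            Require all granted
          </Directory>
      notify: reload httpd

Clients can then install with something like `pip install --find-links https://<wheel host>/wheels/ <package>`.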
*hass0.pyrocufflink.blue* is a virtual machine that runs Home Assistant.
It is dual-homed on the *pyrocufflink.blue* network and the isolated IoT
network.
*vmhost0.pyrocufflink.blue* is currently offline for maintenance. To
avoid the unending stream of failed continuous enforcement Jenkins jobs,
it has been removed from the main inventory file and moved to the
"offline" inventory.
The VPN capability of the UniFi Security Gateway is extremely limited.
It does not support road-warrior IPsec/IKEv2 configuration, and its
OpenVPN configuration is inflexible. As with DHCP, the best solution is
to simply move the service to another machine.
To that end, I created a new VM, *vpn0.pyrocufflink.blue*, to host both
strongSwan and OpenVPN. For this to work, the necessary TCP/UDP ports
need to be forwarded, of course, and all of the remote subnets need
static routes on the gateway, specifying this machine as the next hop.
Additionally, ICMP redirects need to be disabled, to prevent confusing
the routing tables of devices on the same subnet as the VPN gateway.
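The redirect behavior is controlled with sysctl; the role presumably does something along these lines (the exact keys applied may differ):

    - name: disable ICMP redirects
      sysctl:
        name: '{{ item }}'
        value: '0'
        sysctl_set: true
        state: present
      loop:
        - net.ipv4.conf.all.send_redirects
        - net.ipv4.conf.all.accept_redirects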
The DHCP server on the UniFi Security Gateway is pretty limited; it
cannot manage static leases (reservations), and does not offer any way
to build dynamic values for e.g. hostname or boot filename. Rather than
give up these features, I decided to just move the DHCP server to one of
the Raspberry Pis; the DNS server made the most sense.
To facilitate this move, I created the *pyrocufflink-dhcp* host group,
and moved the DHCP configuration variables there. Thus, it was a simple
matter of adding *dns1.pyrocufflink.blue* to this group to relocate the
service.
Of course, to serve clients on the other subnets, the gateway needs to
have DHCP relay enabled and pointing to the new server.
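The variable layout in *pyrocufflink-dhcp* is the role's own; purely as an illustration (every name, address, and key below is hypothetical), the group variables might look like:

    # group_vars/pyrocufflink-dhcp.yml (hypothetical schema and values)
    dhcp_subnets:
      - network: 172.30.0.0
        netmask: 255.255.255.0
        range: 172.30.0.100 172.30.0.199
        routers: 172.30.0.1
    dhcp_static_leases:
      - name: hass2
        mac: '00:11:22:33:44:55'
        ip: 172.30.0.10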
The *aria2* role installs the *aria2* download manager and sets it up to
run as a system service with RPC enabled. It also sets up the web UI,
though that must be installed manually from an archive, for now.
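The RPC setup is just a handful of aria2 options; a sketch, with the config path, download directory, secret variable, and handler as assumptions:

    - name: configure aria2 RPC
      copy:
        dest: /etc/aria2.conf
        content: |
          enable-rpc=true
          rpc-listen-all=true
          rpc-listen-port=6800
          rpc-secret={{ aria2_rpc_secret }}
          dir=/srv/downloads
      notify: restart aria2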
To avoid having a single point of failure, a second recursive DNS server
is necessary. This will be useful in cases where the VM hosts must both
be taken offline, but Internet access is still required.
The new server, *dns1.pyrocufflink.blue*, has all the same zones defined
as the original. It forwards the *pyrocufflink.blue* zone and
corresponding reverse zones to the domain controllers, and acts as a
slave for the *pyrocufflink.red* zone.
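In BIND terms, that is a forward zone plus a slave zone; a sketch with placeholder addresses for the domain controllers and master, and an assumed include path:

    - name: define the pyrocufflink zones
      copy:
        dest: /etc/named/pyrocufflink.conf
        content: |
          zone "pyrocufflink.blue" {
            type forward;
            forward only;
            // placeholder domain controller addresses
            forwarders { 172.30.0.4; 172.30.0.5; };
          };
          // the reverse zones are forwarded the same way

          zone "pyrocufflink.red" {
            type slave;
            // placeholder master address
            masters { 172.30.0.4; };
            file "slaves/pyrocufflink.red";
          };
      notify: reload named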
*smtp1.pyrocufflink.blue* is a VM that will replace
*smtp0.pyrocufflink.blue*, a Raspberry Pi.
I decided that there is little use in having the availability guarantee
of a discrete machine for the SMTP relay. The only system that would
*need* to send mail if the VM host fails is Zabbix, which operates as
its own
relay anyway. As such, the main relay can be a VM, and the Raspberry Pi
can be repurposed as a recursive DNS server.
The `koji.yml` playbook can be used to deploy an entire Koji ecosystem.
It is composed of three smaller playbooks:
* `koji-hub.yml`: Deploys the Koji hub, GC, and Kojira
* `koji-web.yml`: Deploys the Koji Web GUI
* `koji-builder.yml`: Deploys the Koji builder
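The wrapper is just playbook imports:

    # koji.yml
    - import_playbook: koji-hub.yml
    - import_playbook: koji-web.yml
    - import_playbook: koji-builder.yml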
The `burp-client.yml` and `burp-server.yml` playbooks apply the
*burp-client* and *burp-server* roles to BURP clients and servers,
respectively. The server playbook also applies the *postfix* role to
ensure that SMTP is configured and backup notifications can be sent.
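Structurally, the server playbook is just role composition; roughly (the group name is an assumption):

    - hosts: burp-server
      roles:
        - postfix
        - burp-server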
Because *vmhost1.pyrocufflink.blue* is usually sleeping, continuous
enforcement jobs always fail. By keeping it in a separate inventory
file, configuration policy can still be applied to it manually, but it
will be ignored by continuous enforcement.
The gateway device is now monitored by Zabbix. Adding it to the *zabbix*
group ensures that the Zabbix agent is installed and configured
correctly.
Because the *zabbix-agent* role has a task to configure FirewallD, the
`host_uses_firewalld` variable needs to be set to `false` for *gw0*,
since it does not use FirewallD.
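That override lives in the host's variables; something like the following (the host_vars file name is an assumption):

    # host_vars/gw0.pyrocufflink.blue.yml
    host_uses_firewalld: false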
The host *zbx0.pyrocufflink.blue* (a Raspberry Pi) runs the Zabbix
server and web UI. It has a reserved IPv4 address to simplify reverse
DNS management for now, since Samba's dynamic DNS client does not
register PTR records.