configpolicy

Author	SHA1	Message	Date
Dustin C. Hatch	431b7dfacc	facts: Do not collect facts in first play The first play in the `facts.yml` playbook contains a single task: clear the existing fact cache. It makes no sense to gather facts for this play.	2023-10-27 17:40:50 -05:00
Dustin C. Hatch	7b23f6a4ac	r/winbind: Disable offline login by default The `winbind offline login` setting seems to cause issues when one of the domain controllers is offline. Rather than try the other DC, winbind seems to just "give up" and return NT_STATUS_NO_SUCH_USER for all authentication requests until the offline cache is flushed. There's not really any reason to use this setting on servers anyway, since they are always connected to the LAN, as opposed to laptops that may occasionally disconnect. Let's disable this option in the hopes that it makes logins more resilient to DC downtime. After all, there's not much point in having multiple DCs if they all have to be available in order to log in.	2023-10-27 17:37:49 -05:00
Dustin C. Hatch	686817571e	smtp-relay: Switch to Fastmail AWS is going to begin charging extra for routable IPv4 addresses soon. There's really no point in having a relay in the cloud anymore anyway, since a) all outbound messages are sent via the local relay and b) no messages are sent to anyone except me.	2023-10-24 17:27:21 -05:00
Dustin C. Hatch	d2eb61cce1	r/sudo: Tag install tasks Tasks that install packages need to be tagged as `install` so they can be skipped by Jenkins daily runs.	2023-10-21 22:16:28 -05:00
Dustin C. Hatch	7c6ed667be	r/system-auth: Tag install tasks Tasks that install packages need to be tagged as `install` so they can be skipped by Jenkins daily runs.	2023-10-21 22:16:28 -05:00
Dustin C. Hatch	6a6765ac06	r/system-auth: Remove uninstall authconfig task The authconfig package has been gone from Fedora for ages. There's no reason to have this no-op step any more, especially since it has the side effect of making a network request to refresh the dnf cache.	2023-10-21 13:11:25 -05:00
Dustin C. Hatch	1b9543b88f	metricspi: alerts: Increase Frigate disk threshold We want the Frigate recording volume to be basically full at all times, to ensure we are keeping as much recording as possible.	2023-10-15 09:52:12 -05:00
Dustin C. Hatch	2f554dda72	metricspi: Scrape k8s-aarch64-n1 I've added a new Kubernetes worker node, k8s-aarch64-n1.pyrocufflink.blue. This machine is a Raspberry Pi CM4 mounted on a Waveshare CM4-IO-Base A and clipped onto the DIN rail. It's got 8 GB of RAM and 32 GB of eMMC storage. I intend to use it to build container images locally, instead of bringing up cloud instances.	2023-10-05 14:32:19 -05:00
Dustin C. Hatch	a74113d95f	metricspi: Scrape Zincati metrics from CoreOS hosts Zincati is the automatic update manager on Fedora CoreOS. It exposes Prometheus metrics for host/update statistics, which are useful to track the progress of automatic updates and identify update issues. Zincati actually exposes its metrics via a Unix socket on the filesystem. Another process, [local_exporter], is required to expose the metrics from this socket via HTTP so Prometheus can scrape them. [local_exporter]: https://github.com/lucab/local_exporter	2023-10-03 10:29:12 -05:00
Dustin C. Hatch	d7f778b01c	metricspi: Scrape metrics from k8s-aarch64-n0 collectd is now running on k8s-aarch64-n0.pyrocufflink.blue, exposing system metrics. As it is not a member of the AD domain, it has to be explicitly listed in the `scrape_collectd_extra_targets` variable.	2023-10-03 10:29:11 -05:00
Dustin C. Hatch	50f4b565f8	hosts: Remove nvr1.p.b as managed system nvr1.pyrocufflink.blue has been migrated to Fedora CoreOS. As such, it is no longer managed by Ansible; its configuration is done via Butane/Ignition. It is no longer a member of the Active Directory domain, but it does still run collectd and export Prometheus metrics.	2023-09-27 20:24:47 -05:00
Dustin C. Hatch	e4c2b36dfd	r/scrape-collectd: Also scrape unmanaged targets The `scrape_collectd_extra_targets` variable can be used to specify a list of additional targets to scrape, in addition to the hosts in the collectd-prometheus group. This will allow us to scrape hosts that are not managed by the configuration policy, but still expose Prometheus metrics via collectd.	2023-09-27 20:24:47 -05:00
Dustin C. Hatch	d3799607ec	hosts: Move nvr1.p.b back to main inventory nvr1.pyrocufflink.blue is no longer offline.	2023-09-26 07:40:33 -05:00
Dustin C. Hatch	0037a3c281	r/minio: Reload server after changing cert MinIO is supposed to automatically reload itself when the certificate changes, but this does not appear to happen in all cases. To ensure the updated certificate gets used, we need to send SIGHUP to the MinIO server process.	2023-09-22 07:29:05 -05:00
Dustin C. Hatch	1b63332872	r/jellyfin: Restrict HTTPS redirect to Jellyfin Since Jellyfin is running on the file server, which also hosts a few other websites that do not define virtual hosts, the HTTP-to-HTTPS redirect was applied to all requests. To avoid this, we simply add a rewrite condition so that the redirect only applies to requests for Jellyfin.	2023-09-13 10:06:12 -05:00
Dustin C. Hatch	a2b3f9b5b9	jellyfin: Deploy Jellyfin media server Jellyfin is a multimedia library manager. Clients can browse and stream music, movies, and TV shows from the server and play them locally (including in the browser).	2023-09-12 13:38:35 -05:00
Dustin C. Hatch	226a6bef46	Revert "hosts: Move serial0.p.b offline" This reverts commit `9d29961b38`.	2023-08-07 11:41:06 -05:00
Dustin C. Hatch	9d29961b38	hosts: Move serial0.p.b offline It seems this machine has died and probably needs to be rebuilt.	2023-07-26 11:49:46 -05:00
Dustin C. Hatch	16d05fcfb4	hosts: Move nvr1.p.b offline This machine is offline until I get the cameras installed at the new house.	2023-07-26 11:48:38 -05:00
Dustin C. Hatch	7120e4ebf8	hosts: Decommission hass2.p.b Home Assistant is now hosted in Kubernetes.	2023-07-24 11:33:12 -05:00
Dustin C. Hatch	4cdb5dee70	certs/samba: Add missing symlink for dc-ag62kz.p.b	2023-07-24 08:36:20 -05:00
Dustin C. Hatch	7a9c678ff3	burp-server: Keep more backups New retention policy: * 7 daily backups * 4 weekly backups * 12 ~monthly backups * 5 ~yearly backups	2023-07-17 16:36:37 -05:00
Dustin C. Hatch	06782b03bb	vm-hosts: Update VM autostart list * dc2 has been gone for a long time, replaced by two new domain controllers * unifi0 was recently replaced by unifi1	2023-07-07 10:05:22 -05:00
Dustin C. Hatch	6a5d1437e8	hosts: add unifi1.p.b unifi1.pyrocufflink.blue is a Fedora machine that hosts the Unifi Network controller software.	2023-07-07 10:05:01 -05:00
Dustin C. Hatch	71a43ccf07	unifi: Deploy Unifi Network controller Since Ubiquiti only publishes Debian packages for the Unifi Network controller software, running it on Fedora has historically been nigh impossible. Fortunately, a modern solution is available: containers. The linuxserver.io project publishes a container image for the controller software, making it fairly easy to deploy on any host with an OCI runtime. I briefly considered creating my own image, since theirs must be run as root, but I decided the maintenance burden would not be worth it. Using Podman's user namespace functionality, I was able to work around this requirement anyway.	2023-07-07 10:05:01 -05:00
Dustin C. Hatch	61844e8a95	pyrocufflink: Add Luma SSH keys for root Sometimes I need to connect to a machine when there is an AD issue (e.g. domain controllers are down, clocks are out of sync, etc.) but I can't do it from my desktop.	2023-07-05 16:35:57 -05:00
Dustin C. Hatch	9f221cf734	web/dustinandtabitha: Disable RSVP form The spammers have found our wedding RSVP form.	2023-06-27 09:02:54 -05:00
Dustin C. Hatch	0a68d84121	metricspi: Scrape hatchlearningcenter.org To monitor site availability and certificate expiration.	2023-06-21 14:31:33 -05:00
Dustin C. Hatch	4e608e379f	metricspi/alerts: Correct BURP archive alert query When the RAID array is being resynchronized after the archived disk has been reconnected, md changes the disk status from "missing" to "spare." Once the synchronization is complete, it changes from "spare" to "active." We only want to trigger the "disk needs archived" alert once the synchronization process is complete; otherwise, both the "disks need swapped" and "disk needs archived" alerts would be active at the same time, which makes no sense. By adjusting the query for the "disk needs archived" alert to consider disks in both "missing" and "spare" status, we can delay firing that alert until the proper time.	2023-06-20 11:58:35 -05:00
Dustin C. Hatch	b05edbf7fb	r/minio: Configure firewall The firewall needs to allow inbound connections to the MinIO HTTP API and web UI ports.	2023-06-08 10:07:32 -05:00
Dustin C. Hatch	4776303db2	k8s-node: Deploy NFS client Longhorn's new RWX (read-write many) mode requires the NFS client utilities installed on the host machine.	2023-06-08 10:06:02 -05:00
Dustin C. Hatch	679ea47bf7	r/homeassistant: Protect ~/.ssh When the Home Assistant container restarts, Podman relabels the entire `/var/lib/homeassistant` directory as `container_file_t`. Since the homeassistant user's home directory is `/var/lib/homeassistant`, its `~/.ssh` directory is thus also relabeled, preventing the SSH daemon from accessing it. Since Home Assistant itself does not need access to this path, we can tell systemd to mount an empty tmpfs filesystem there in the service unit's mount namespace. This way, when Podman relabels the directory, it will change the label of the tmpfs mount point instead of the actual directory.	2023-06-08 10:05:36 -05:00
Dustin C. Hatch	bf4d57b5cb	frigate: Configure journal2ntfy for MD RAID The Frigate server has a RAID array that it uses to store video recordings. Since there have been a few occasions where the array has suddenly stopped functioning, probably because of the cheap SATA controller, it will be nice to get an alert as soon as the kernel detects the problem, so as to minimize data loss.	2023-06-08 10:05:36 -05:00
Dustin C. Hatch	87e8ec2ed4	synapse: Back up data using BURP Most of the Synapse server's state is in its SQLite database. It also has a `media_store` directory that needs to be backed up, though. In order to back up the SQLite database while the server is running, the database must be in "WAL mode." By default, Synapse leaves the database in the default "rollback journal mode," which disallows multiple processes from accessing the database, even for read-only operations. To change the journal mode: ```sh sudo systemctl stop synapse sudo -u synapse sqlite3 /var/lib/synapse/homeserver.db 'PRAGMA journal_mode=WAL;' sudo systemctl start synapse ```	2023-05-23 09:52:50 -05:00
Dustin C. Hatch	74243080bb	r/burp-client: Support pre/post-restore scripts BURP can run scripts before and after restore. This may be useful, for example, to clean up files in a backup that may be in an inconsistent state.	2023-05-23 09:52:50 -05:00
Dustin C. Hatch	66d0a9157f	burp-client: Switch from cron to systemd timer systemd timer units are supported on all relevant OS versions now. There is no longer any reason to use cron.	2023-05-23 09:51:07 -05:00
Dustin C. Hatch	cd1f7b354b	ci: Add Jenkins pipeline for MinIO	2023-05-23 08:33:09 -05:00
Dustin C. Hatch	d26de78b3d	r/samba-dc: Rotate KDC log weekly The Samba KDC log file seems to grow rather quickly sometimes, outpacing the monthly rotation policy. Let's rotate it weekly and keep 4 historical versions.	2023-05-23 08:31:58 -05:00
Dustin C. Hatch	78296f7198	Merge branch 'journal2ntfy'	2023-05-23 08:31:52 -05:00
Dustin C. Hatch	347cda74fd	metrics: Scrape metrics from Kubernetes API server Kubernetes exports a lot of metrics in Prometheus format. I am not sure what all is there, yet, but apparently several thousand time series were added. To allow anonymous access to the metrics, I added a role binding for this ClusterRole: ```yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: prometheus rules: - apiGroups: - "" resources: - nodes/metrics verbs: - get - nonResourceURLs: - /metrics verbs: - get ```	2023-05-22 21:21:08 -05:00
Dustin C. Hatch	c0bb387b18	metricspi: Scrape metrics from MinIO backup storage MinIO exposes metrics in Prometheus exposition format. By default, it requires an authentication token to access the metrics, but I was unable to get this to work. Fortunately, it can be configured to allow anonymous access to the metrics, which is fine, in my opinion.	2023-05-22 21:19:25 -05:00
Dustin C. Hatch	a7319c561d	journal2ntfy: Script to send log messages via ntfy The `journal2ntfy.py` script follows the systemd journal by spawning `journalctl` as a child process and reading from its standard output stream. Any command-line arguments passed to `journal2ntfy` are passed to `journalctl`, which allows the caller to specify message filters. For any matching journal message, `journal2ntfy` sends a message via the ntfy web service. For the BURP server, we're going to use `journal2ntfy` to generate alerts about the RAID array. When I reconnect the disk that was in the fireproof safe, the kernel will log a message from the md subsystem indicating that the resynchronization process has begun. Then, when the disks are again in sync, it will log another message, which will let me know it is safe to archive the other disk.	2023-05-17 14:51:21 -05:00
Dustin C. Hatch	2c002aa7c5	alerts: Add alert to archive BURP disk This alert will fire once the MD RAID resynchronization process has completed and both disks in the array are online. It will clear when one disk is disconnected and moved to the safe.	2023-05-16 08:33:13 -05:00
Dustin C. Hatch	877dcc3879	alerts: Add alerts for missed client backups When BURP fails to even start a backup, it does not trigger a notification at all. As a result, I may not notice for a few days when backups are not happening. That was the case this week, when clients' backups were failing immediately, because of a file permissions issue on the server. To hopefully avoid missing backups for too long in the future, I've added two new alerts: * The no recent backups alert fires if there have not been any BURP backups recently. This may also fire, for example, if the BURP exporter is not working, or if there is something wrong with the BURP data volume. * The missed client backup alert fires if an active BURP client (i.e. one that has had at least one backup in the past 90 days) has not been backed up in the last 24 hours.	2023-05-14 11:48:36 -05:00
Dustin C. Hatch	a2bcd5ccbb	alerts: Adjust BURP RAID disk swap alert Using a 30-day window for the `tlast_change_over_time` function effectively "caps out" the value at 30 days. Thus, the alert reminding me to swap the BURP backup volume will never fire, since the value will never be greater than the 30-day threshold. Using a wider window resolves that issue (though the query will still produce inaccurate results beyond the window).	2023-05-14 11:38:00 -05:00
Dustin C. Hatch	ad9fb6798e	samba-dc: Omit tls cafile setting The `tls cafile` setting in `smb.conf` is not necessary. It is used for verifying peer certificates for mutual TLS authentication, not to specify the intermediate certificate authority chain like I thought. The setting cannot simply be left out, though. If it is not specified, Samba will attempt to load a file from a built-in default path, which will fail, causing the server to crash. This is avoided by setting the value to the empty string.	2023-05-10 08:28:49 -05:00
Dustin C. Hatch	5ebe10fb0b	Merge branch 'minio'	2023-05-10 08:05:03 -05:00
Dustin C. Hatch	a3ea838cac	burp-server: Deploy MinIO We're going to run MinIO on the BURP server to provide a backup target for the [Postgres Operator][0]/[WAL-E][1]. Although the Postgres Operator also supports backups via [WAL-G][2], which supports more backup targets like SFTP, the operator does not support restoring from those targets. As such, the best way to get fully-featured backups for the Postgres Operator, including environment cloning, etc., is to use S3. Since I absolutely do not want to store my backups "in the cloud," using MinIO seems a decent alternative. Running it on the BURP server allows the backups to be stored and rotated along with regular system backups. [0]: https://github.com/zalando/postgres-operator/ [1]: https://github.com/wal-e/wal-e [2]: https://github.com/wal-g/wal-g	2023-05-09 21:55:25 -05:00
Dustin C. Hatch	f54bc44a48	minio: Install and configure MinIO [MinIO][0] is an S3-compatible object storage server. It is designed to provide storage for cloud-native applications for on-premises deployments. MinIO has not been packaged for Fedora (yet?). As such, the best way to deploy it is using its official container image. Here, we are using `podman-systemd-generator` (Quadlet) to generate a systemd service unit to manage the container process.	2023-05-09 21:37:46 -05:00
Dustin C. Hatch	9722fed1b8	metricspi: Scrape dustinandtabitha.com	2023-05-09 21:30:11 -05:00
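
The `journal2ntfy` commit (a7319c561d) describes a simple pipeline: spawn `journalctl`, forward the caller's filter arguments to it, read its output, and send each matching message to ntfy. The following is only a rough sketch of that idea, not the script from the repository. The topic URL, the function names, and the choice of `--output=json` are all assumptions made for illustration.

```python
import json
import subprocess
import urllib.request

# Hypothetical topic URL; the real script's ntfy endpoint is not shown in the log.
NTFY_URL = "https://ntfy.sh/example-topic"


def journalctl_command(extra_args):
    """Build the journalctl command line, forwarding the caller's filter arguments."""
    return ["journalctl", "--follow", "--lines=0", "--output=json", *extra_args]


def message_from_entry(line):
    """Return the MESSAGE field of one JSON journal entry, or None if unusable."""
    if not line.strip():
        return None
    message = json.loads(line).get("MESSAGE")
    # journalctl encodes non-UTF-8 messages as a byte array; skip those here.
    return message if isinstance(message, str) else None


def build_request(message, url=NTFY_URL):
    """Create (but do not send) an HTTP POST with the message as the request body."""
    return urllib.request.Request(url, data=message.encode("utf-8"), method="POST")


def follow(extra_args):
    """Spawn journalctl as a child process and forward each message to ntfy."""
    proc = subprocess.Popen(
        journalctl_command(extra_args), stdout=subprocess.PIPE, text=True
    )
    for line in proc.stdout:
        message = message_from_entry(line)
        if message:
            with urllib.request.urlopen(build_request(message)):
                pass
```

Under these assumptions, calling `follow(["--dmesg", "--grep=md"])` would follow only kernel messages that mention `md`, which roughly matches the RAID resynchronization use case described for the BURP server.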

965 Commits