490 Commits (dc835ddc9df807a66571b7cf7b270a34f903bba4)

Author SHA1 Message Date
Dustin dc835ddc9d v-m/alerts: Fix PostgreSQL WAL archive failed alert
The `pg_stat_archiver_failed_count` metric is a counter, so once a WAL
archival has failed, it will increase and never return to `0`.  To
ensure the alert is resolved once the WAL archival process recovers, we
need to use the `increase` function to turn it into a gauge.  Finally,
we aggregate that gauge with `max_over_time` to keep the alert from
flapping if the WAL archive occurs less frequently than the scrape
interval.
2025-02-05 10:42:35 -06:00
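
The effect described in the commit above can be illustrated with a small simulation. This is only a rough sketch, not the actual alert rule: the sample data, the window sizes, and the helper names are made up for illustration, and `increase` here is a simplified last-minus-first difference without the counter-reset handling or extrapolation that a real query engine applies.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def increase(samples: List[Sample], t: float, window: float) -> float:
    """Simplified increase(): last minus first value within [t - window, t]."""
    values = [v for ts, v in samples if t - window <= ts <= t]
    return values[-1] - values[0] if len(values) >= 2 else 0.0


def max_over_time_of_increase(samples: List[Sample], t: float,
                              inc_window: float, max_window: float) -> float:
    """Largest increase() seen at any sample time within [t - max_window, t]."""
    points = [increase(samples, ts, inc_window)
              for ts, _ in samples if t - max_window <= ts <= t]
    return max(points) if points else 0.0


# Hypothetical scrapes every 60 s: two failed archives, then recovery.
failed_count = [(0, 0.0), (60, 0.0), (120, 1.0), (180, 2.0), (240, 2.0),
                (300, 2.0), (360, 2.0), (420, 2.0), (480, 2.0),
                (540, 2.0), (600, 2.0)]

raw_at_end = failed_count[-1][1]            # the counter never returns to 0
during = increase(failed_count, 180, 120)   # nonzero while failures occur
after = increase(failed_count, 600, 120)    # back to 0 once archiving recovers
held = max_over_time_of_increase(failed_count, 300, 120, 180)
cleared = max_over_time_of_increase(failed_count, 600, 120, 180)
print(raw_at_end, during, after, held, cleared)
```

In this toy data the raw counter stays at 2 forever, the windowed increase drops back to 0 after recovery, and the outer maximum keeps the value nonzero for a little longer so that a brief zero between failures would not resolve the alert prematurely.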
Dustin f637feba16 updatebot: Fix tag format for Vaultwarden
We're using the Alpine variant of the Vaultwarden container images,
since the default variant is significantly larger and we do not need any
of the extra stuff it includes.
2025-02-01 18:29:54 -06:00
Dustin 6da330f2be v-m/scrape: Remove k8s SD config for Zincati
There are no more Kubernetes nodes running Fedora CoreOS.
2025-02-01 18:16:10 -06:00
Dustin 11a0f84db7 v-m/scrape: Remove websites job
Websites are being scraped by the `vmagent` on the OVH VPS.
2025-02-01 18:16:10 -06:00
Dustin 79995801e2 jenkins: ssh_known_hosts: Add OVH VPS host key 2025-02-01 18:16:10 -06:00
Dustin 759d8f112f ansible: Deploy ARA
[ARA Records Ansible][0] is a results storage system for Ansible.  It
provides a convenient UI for tracking Ansible playbooks and tasks.  The
data are populated by an Ansible callback plugin.

ARA is a fairly simple Python+Django application.  It needs a database
to store Ansible results, so we've connected it to the main PostgreSQL
database and configured it to connect and authenticate using mTLS.

Rather than mess with managing and distributing a static password for
ARA clients, I've configured Authelia to allow anonymous access to
post data to the ARA API from within the private network or the
Kubernetes cluster.  Access to the web UI does require authentication.

[0]: https://ara.recordsansible.org/
2025-02-01 18:16:10 -06:00
Dustin 32175156ac sshca: Add machine ID for node-474c83.k8s.p.bk 2025-02-01 18:16:10 -06:00
Dustin a87b53e3ac v-m: Add alert for Frigate camera no video
At some point this week, the front porch camera stopped sending video.
I'm not sure exactly what happened to it, but Frigate kept logging
"Unable to read frames from ffmpeg process."  I power-cycled the camera,
which resolved the issue.

Unfortunately, no alerts were generated about this situation.  Home
Assistant did not consider the camera entity unavailable, presumably
because Frigate was still reporting stats about it.  Thus, I missed
several important notifications.  To avoid this in the future, I have
enabled the "Camera FPS" sensors for all of the cameras in Home
Assistant, and added this alert to trigger when the reported framerate
is 0.

I also really need to get alerts for log events configured, as those
would also have indicated there was an issue.
2025-02-01 18:16:10 -06:00
Dustin 5065e61a2d Merge pull request 'home-assistant: Update to 2025.1.4' (#43) from updatebot/home-assistant into master
Reviewed-on: #43
2025-01-25 14:44:49 +00:00
Dustin 39298e9fea Merge pull request 'paperless-ngx: Update to 2.14.5' (#44) from updatebot/paperless-ngx into master
Reviewed-on: #44
2025-01-25 14:44:41 +00:00
bot b32751bf28 paperless-ngx: Update to 2.14.5 2025-01-25 12:32:13 +00:00
bot 4ce258b00c home-assistant: Update to 2025.1.4 2025-01-25 12:32:06 +00:00
Dustin 294c0230bf home-assistant: Update kitchen kiosk IP address
I got a new 2GB Raspberry Pi 4 Model B for the kitchen.  That way, I can
use the 4GB one for something that needs more memory.
2025-01-23 18:00:17 -06:00
Dustin 183bb28c12 authelia: Allow anonymous access to vminsert
This way we can have push-based metrics without requiring any
authentication.
2025-01-19 09:47:28 -06:00
Dustin ce7d90d704 Merge pull request 'zwavejs2mqtt: Update to 9.29.1' (#41) from updatebot/home-assistant into master
Reviewed-on: #41
2025-01-18 15:46:05 +00:00
Dustin 91f0432061 Merge pull request 'paperless-ngx: Update to 2.14.3' (#42) from updatebot/paperless-ngx into master
Reviewed-on: #42
2025-01-18 15:45:52 +00:00
bot 5fb6d70f59 paperless-ngx: Update to 2.14.3 2025-01-18 12:32:13 +00:00
bot 511a9df619 zwavejs2mqtt: Update to 9.29.1 2025-01-18 12:32:08 +00:00
Dustin e426bcf550 Merge pull request 'gotenberg: Update to 8.15.2' (#38) from updatebot/paperless-ngx into master
Reviewed-on: #38
2025-01-11 16:27:50 +00:00
Dustin 509c44d9cc Merge pull request 'authelia: Update to 4.38.18' (#40) from updatebot/authelia into master
Reviewed-on: #40
2025-01-11 16:27:21 +00:00
Dustin 4ac1bab968 h-a: zigbee2m: Add dialout supplemental group
Zigbee2MQTT needs to be able to read and write to the serial device for
the ConBee II USB controller.  I'm not exactly sure what changed, or how
it was able to access it before the recent update.

The _dialout_ group has GID 18 on Fedora.
2025-01-11 10:10:44 -06:00
Dustin 1674bc3b89 Merge pull request 'home-assistant: Update to 2025.1.0' (#39) from updatebot/home-assistant into master
Reviewed-on: #39
2025-01-11 15:57:26 +00:00
bot 4a197bf91a authelia: Update to 4.38.18 2025-01-11 12:32:12 +00:00
bot 07ffcd0bc5 gotenberg: Update to 8.15.3 2025-01-11 12:32:11 +00:00
bot e567c34df5 zigbee2mqtt: Update to 2.0.0 2025-01-11 12:32:06 +00:00
bot a8528302ee home-assistant: Update to 2025.1.2 2025-01-11 12:32:05 +00:00
Dustin 94be854bd7 vaultwarden: Deploy, migrate Vaultwarden
Vaultwarden requires basically no configuration anymore.  Older versions
needed some environment variables for configuring the WebSocket server,
but as of 1.31, WebSockets are handled by the same server as HTTP, so
even that is not necessary now.  The only other option that could
potentially be useful is `ADMIN_TOKEN`, but it's optional.  For added
security, we can leave it unset, which disables the administration
console; we can set it later if/when we actually need that feature.

Migrating data from the old server was pretty simple.  The database is
pretty small, and even the attachments and site icons don't take up much
space.  All-in-all, there was only about 20 MB to move, so the copy only
took a few seconds.

Aside from moving the Vaultwarden server itself, we will also need to
adjust the HAProxy configuration to proxy requests to the Kubernetes
ingress controller.
2025-01-10 20:05:18 -06:00
Dustin 1392a7c181 jenkins: Add storage for Gentoo Portage/binpkgs
Jenkins jobs that build Gentoo-based systems, like Aimee OS, need a
persistent storage volume for the Gentoo ebuild repository. The Job
initially populates the repository using `emerge-webrsync`, and then the
CronJob keeps it up-to-date by running `emaint sync` daily.

In addition to the Portage repository, we also need a volume to store
built binary packages.  Jenkins job pods can mount this volume to make
binary packages they build available for subsequent runs.

Both of these volumes are also available to consumers outside the
cluster via `rsync` in daemon mode.  This can be useful for e.g. local
builds.
2025-01-09 20:15:46 -06:00
Dustin 75e6f7ee16 home-assistant: Add trusted user for Kitchen kiosk
The Raspberry Pi in the kitchen now has Firefox installed so we can use
it to control Home Assistant.  By listing its IP address as a trusted
network, and assigning it a trusted user, it can access the Home
Assistant UI without anyone having to type a password.  This is
particularly important since there's no keyboard (not even an on-screen
virtual one).

Moving the `trusted_networks` auth provider _before_ the `homeassistant`
provider changes the login screen to show a "log in as ..." dialog by
default on trusted devices.  It does not affect other devices at all,
but it does make the initial login a bit easier on kiosks.
2025-01-04 07:19:39 -06:00
Dustin 252dcfedc8 sshca: Add machine ID for ctrl-pi-spellbind 2024-12-28 10:38:26 -06:00
Dustin 6883ab41bd Merge remote-tracking branch 'refs/remotes/origin/master' 2024-12-21 20:23:42 -06:00
Dustin 8374e1e28b Merge remote-tracking branch 'refs/remotes/origin/master' 2024-12-21 20:23:25 -06:00
Dustin a74f7f64ad Merge remote-tracking branch 'refs/remotes/origin/master' 2024-12-21 20:22:36 -06:00
Dustin 60f88c6960 Merge remote-tracking branch 'refs/remotes/origin/master' 2024-12-21 20:21:04 -06:00
Dustin 21dcd853c4 Merge pull request 'home-assistant: Update to 2024.11.3' (#35) from updatebot/home-assistant into master
Reviewed-on: #35
2024-12-21 20:27:26 +00:00
Dustin b9d69ec0a3 v-m/alerts: Ignore missing backups from Toad, Luma
Toad and Luma can go offline for several days at a time if I don't use
them.  I don't need an alert telling me this.
2024-12-21 12:23:19 -06:00
Dustin a03d63841d v-m/alerts: Fire paperless email alert after 12h
We don't need a notification about paperless not scheduling email tasks
every time there is a gap in the metric.  This can happen in some
innocuous situations like when the pod restarts or if there is a brief
disruption of service.  Using the `absent_over_time` function with a
range vector, we can have the alert fire only if there have been no
email tasks scheduled within the last 12 hours.
2024-12-21 12:17:45 -06:00
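
The difference between alerting on any gap and alerting only after 12 hours of silence can be sketched as follows. This is an illustrative model of how `absent_over_time` behaves over a range, not the rule from the repository; the timestamps and the helper signature are assumptions made for the example.

```python
from typing import List, Optional

HOUR = 3600.0


def absent_over_time(sample_times: List[float], t: float,
                     window: float) -> Optional[int]:
    """Return 1 if no sample falls within [t - window, t], else None."""
    if any(t - window <= ts <= t for ts in sample_times):
        return None  # at least one sample in range: nothing to report
    return 1


# Hypothetical sample timestamps: a 2-hour gap (e.g. a pod restart),
# then the metric disappears entirely after hour 5.
seen = [0 * HOUR, 1 * HOUR, 3 * HOUR, 4 * HOUR, 5 * HOUR]

short_gap = absent_over_time(seen, 2.5 * HOUR, 12 * HOUR)  # inside the brief gap
long_gap = absent_over_time(seen, 18 * HOUR, 12 * HOUR)    # 13 hours of silence
print(short_gap, long_gap)
```

With a 12-hour range, the brief gap produces no result, while the extended silence yields 1 and would let the alert fire.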
Dustin d04c18cfcd v-m/alerts: Remove 'no file changes' alert
It turns out this alert is not very useful, and indeed quite annoying.
Many servers can go for days or even weeks with no changes, which is
completely normal.
2024-12-21 12:14:11 -06:00
Dustin 6e15b11f73 Merge branch 'fix-nextcloud-alert' 2024-12-21 11:58:41 -06:00
Dustin db37e5a691 Merge remote-tracking branch 'refs/remotes/origin/master' 2024-12-21 11:58:07 -06:00
Dustin 7a9adc642c Merge pull request 'firefly-iii: Update to 6.1.24' (#37) from updatebot/firefly-iii into master
Reviewed-on: #37
2024-12-21 17:39:21 +00:00
Dustin 93e42421e6 Merge pull request 'gotenberg: Update to 8.14.1' (#36) from updatebot/paperless-ngx into master
Reviewed-on: #36
2024-12-21 17:38:50 +00:00
bot a79668dcf1 gotenberg: Update to 8.14.1 2024-12-21 12:32:10 +00:00
bot 1c4b5e19a4 firefly-iii: Update to 6.1.25 2024-12-21 12:32:08 +00:00
bot 2691b58c05 zwavejs2mqtt: Update to 9.29.0 2024-12-21 12:32:04 +00:00
bot 50459e111e zigbee2mqtt: Update to 1.42.0 2024-12-21 12:32:04 +00:00
bot 387b7d120e whisper: Update to 2.4.0 2024-12-21 12:32:04 +00:00
bot 1768778b44 home-assistant: Update to 2024.12.5 2024-12-21 12:32:03 +00:00
Dustin 2b6830f131 cert-manager: Configure ACME DNS.01 for dch-ca
Since transitioning to externalIPs for TCP services, it is no longer
possible to use the HTTP.01 ACME challenge to issue certificates for
services hosted in the cluster, because the ingress controller does not
listen on those addresses.  Thus, we have to switch to using the DNS.01
challenge.  I had avoided using it before because of the complexity of
managing dynamic DNS records with the Samba AD server, but this was
actually pretty to work around.  I created a new DNS zone on the
firewall specifically for ACME challenges.  Names in the AD-managed zone
have CNAME records for their corresponding *_acme-challenge* labels
pointing to this new zone.  The new zone has dynamic updates enabled,
which _cert-manager_ supports using the RFC2136 plugin.

For now, this is only enabled for _rabbitmq.pyrocufflink.blue_.  I will
transition the other names soon.
2024-12-09 17:58:43 +00:00
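
The CNAME delegation described above can be sketched like this. The `_acme-challenge` prefix is the standard DNS.01 label, but the challenge zone name `acme.example.test` and the target naming scheme are placeholders assumed for illustration; the commit does not say what the real zone or its record names look like.

```python
def challenge_name(fqdn: str) -> str:
    """Owner name of the DNS.01 challenge record for a host name."""
    return "_acme-challenge." + fqdn.rstrip(".") + "."


def delegation_cname(fqdn: str, challenge_zone: str) -> str:
    """Static CNAME for the AD-managed zone, pointing into the dynamic zone.

    The target layout (<fqdn>.<challenge_zone>) is a hypothetical scheme.
    """
    target = f"{fqdn.rstrip('.')}.{challenge_zone.strip('.')}."
    return f"{challenge_name(fqdn)} CNAME {target}"


# acme.example.test is a placeholder zone, not the real one.
record = delegation_cname("rabbitmq.pyrocufflink.blue", "acme.example.test")
print(record)
```

The static record only has to be created once per name; the TXT records that change on every issuance then live in the zone that accepts dynamic updates.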
Dustin 4243823ba5 invoice-ninja: Fix network policy for ingress
Since the IP address assigned to the ingress controller is now managed
by keepalived and known to Kubernetes, the network policy needs to allow
access to it by pod namespace rather than IP address.  It seems that the
former takes precedence over the latter, so even though the IP address
was explicitly allowed, traffic was not permitted because it was
destined for a Kubernetes service that was not allowed.
2024-12-07 09:28:44 -06:00
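
A namespace-based rule of the kind described above might look roughly like the following, built here as a Python dict and printed as JSON. This is a sketch under several assumptions: that the rule in question is an egress rule, that the ingress controller runs in a namespace called `ingress-nginx`, and that the pods carry an `app.kubernetes.io/name` label. None of these names come from the commit.

```python
import json


def egress_to_namespace(policy_name: str, app_label: str,
                        target_namespace: str) -> dict:
    """Build a NetworkPolicy manifest allowing egress to pods in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": policy_name},
        "spec": {
            # Hypothetical label selecting the application's pods.
            "podSelector": {
                "matchLabels": {"app.kubernetes.io/name": app_label}
            },
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{
                    # Select the destination by namespace instead of ipBlock.
                    "namespaceSelector": {
                        "matchLabels": {
                            "kubernetes.io/metadata.name": target_namespace
                        }
                    }
                }]
            }],
        },
    }


# All three names below are placeholders for illustration.
policy = egress_to_namespace("allow-ingress-controller", "invoice-ninja",
                             "ingress-nginx")
print(json.dumps(policy, indent=2))
```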