kubernetes

infra

Author	SHA1	Message	Date
Dustin	71b52e4c6f	20125: Deploy Status server https://20125.home/ is the URL the Status Android application loads in its main WebView. This site is powered by a server that generates a custom page showing the status of our self-hosted applications, based on alerts retrieved from the AlertManager API. Android WebView does not allow cleartext HTTP connections. It does, however, allow connecting to an HTTPS server and ignoring the certificate it presents, which is effectively the same thing. Thus, we generate a self-signed certificate for the Ingress for this site.	2024-11-02 19:51:53 -05:00
Dustin	8ecee4133f	v-m/alerts: Rework free disk space alert Fedora CoreOS fills `/boot` beyond the 75% alert threshold under normal circumstances on aarch64 machines. This is not a problem, because it cleans up old files on its own, so we do not need to alert on it. Unfortunately, the _DiskUsage_ alert is already quite complex, and adding in exclusions for these devices would make it even worse. To simplify the logic, we can use a recording rule to precompute the used/free space ratio. By using `sum(...) without (type)` instead of `sum(...) on (df, instance)`, we keep the other labels, which we can then use to identify the metrics coming from machines we don't care to monitor. Instead of having different thresholds for different volumes encoded in the same expression, we can use multiple alerts to alert on "low" vs "very low" thresholds. Since this will of course cause duplicate alerts for most volumes, we can use AlertManager inhibition rules to disable the "low" alert once the metric crosses the "very low" threshold.	2024-11-02 09:38:02 -05:00
Dustin	4cef41688f	v-m/alerts: Add Zigbee+ZWave network alerts	2024-11-01 18:14:56 -05:00
Dustin	6cf11f9f61	v-m: Scrape HAProxy	2024-11-01 18:14:37 -05:00
Dustin	7a768cbb76	v-m: Update jobs for new Loki server loki1.pyrocufflink.blue is a regular Fedora machine, a member of the AD domain, and managed by Ansible. Thus, it does not need to be explicitly listed as a scrape target. For scraping metrics from Loki itself, I've changed the job to use DNS-SD because it seems like `vmagent` does _not_ re-resolve host names from static configuration.	2024-11-01 18:07:34 -05:00
Dustin	0101040634	v-m/alerts: Add Paperless-ngx email task alert This alert should fire if the background task to fetch e-mail and import them into Paperless-ngx has not run for a while.	2024-11-01 18:04:06 -05:00
Dustin	3f9601dc94	v-m/alerts: Improve Paperless-ngx Celery task alert The `flower_events_total` metric is a counter, so its value only ever increases (discounting restarts of the server process). As such, nonzero values do not necessarily indicate a _current_ problem, but rather that there was one at some point in the past. To identify current issues, we need to use the `increase` function, and then apply the `max_over_time` function so that the alert doesn't immediately reset itself.	2024-11-01 18:00:50 -05:00
Dustin	d12e66f58a	v-m: Scrape Frigate exporter	2024-11-01 17:47:51 -05:00
Dustin	045eea89a9	Merge remote-tracking branch 'refs/remotes/origin/master'	2024-10-19 09:49:59 -05:00
Dustin	8ff45a8c01	paperless-ngx/gotenberg: Run as correct user The Gotenberg container image uses UID 1001 for the _gotenberg_ user. Using any other UID number, even when the home directory is set and owned by that UID, results in random issues, especially when using LibreOffice conversions.	2024-10-19 09:46:15 -05:00
giteadmin	d3e00680c0	Merge pull request 'home-assistant: Update to 2024.10.3' (#29) from updatebot/home-assistant into master Reviewed-on: #29	2024-10-19 13:13:12 +00:00
bot	c5daf23f71	mosquitto: Update to 2.0.20	2024-10-19 11:32:16 +00:00
bot	6e2c8d1a25	zwavejs2mqtt: Update to 9.24.0	2024-10-19 11:32:16 +00:00
bot	0e3f719e32	whisper: Update to 2.2.0	2024-10-19 11:32:16 +00:00
bot	94e10207d2	home-assistant: Update to 2024.10.3	2024-10-19 11:32:15 +00:00
Dustin	99c8f7694c	paperless-ngx: Split resources into separate files The Paperless-ngx ecosystem consists of several services. Defining the resources for each service in separate manifest files will make maintenance a little bit easier.	2024-10-17 07:27:33 -05:00
Dustin	e19e8f50ab	v-m/alerts: Add alerts for Paperless-ngx	2024-10-17 07:18:23 -05:00
Dustin	78651eb5f8	v-m/alerts: Add alerts for PostgreSQL WAL archiver	2024-10-17 07:18:09 -05:00
Dustin	ee3e078b20	v-m/alerts: Add alerts for Restic backups	2024-10-17 06:58:48 -05:00
Dustin	ea89e0cde4	v-m/scrape: Remove synapse job The Synapse server is now completely decommissioned.	2024-10-17 06:50:27 -05:00
Dustin	e581957f9d	Merge remote-tracking branch 'refs/remotes/origin/master'	2024-10-15 07:59:42 -05:00
Dustin	b01300f8cc	Merge pull request 'zwavejs2mqtt: Update to 9.20.0' (#26) from updatebot/home-assistant into master Reviewed-on: #26	2024-10-15 12:43:28 +00:00
bot	55ae979a1d	mosquitto: Update to 2.0.19	2024-10-15 12:42:36 +00:00
bot	1de05f2ccc	zwavejs2mqtt: Update to 9.23.0	2024-10-15 12:42:36 +00:00
bot	58f7f9e2cc	zigbee2mqtt: Update to 1.40.2	2024-10-15 12:42:35 +00:00
bot	390eacf209	home-assistant: Update to 2024.10.2	2024-10-15 12:42:35 +00:00
Dustin	145fa6286e	storage: Add Longhorn backup target secret Longhorn uses a special Secret resource to configure the backup target. This secret includes the credentials and CA certificate for accessing the MinIO S3 service. Longhorn must be configured to use this Secret by setting the `backup-target-credential-secret` setting to `minio-backups-credentials`.	2024-10-13 14:03:49 -05:00
Dustin	1b4bb234c8	Merge pull request 'gotenberg: Update to 8.10.0' (#25) from updatebot/paperless-ngx into master Reviewed-on: #25	2024-10-12 20:44:58 +00:00
Dustin	7e2512c261	Merge pull request 'authelia: Update to 4.38.12' (#28) from updatebot/authelia into master Reviewed-on: #28	2024-10-12 20:44:41 +00:00
bot	281ec623c4	authelia: Update to 4.38.16	2024-10-12 11:33:03 +00:00
bot	51fe6f39af	gotenberg: Update to 8.12.0	2024-10-12 11:33:00 +00:00
Dustin	2ccbcd494c	firefly-iii: Update to 6.1.21 Notably, this version fixes the ~4s delay when creating/editing transactions.	2024-10-02 09:08:58 -05:00
Dustin	e9bfc63a74	Merge remote-tracking branch 'refs/remotes/origin/master'	2024-10-02 09:08:31 -05:00
Dustin	32171cc76e	Merge pull request 'firefly-iii: Update to 6.1.20' (#27) from updatebot/firefly-iii into master Reviewed-on: #27	2024-09-29 21:09:41 +00:00
bot	71f091fa05	firefly-iii: Update to 6.1.20	2024-09-28 11:32:18 +00:00
Dustin	df50decba1	argocd: apps/authelia: Enable auto-sync This way, merging PRs from updatebot will automatically trigger updating Paperless-ngx et al.	2024-09-24 07:16:45 -05:00
Dustin	0022171616	argocd: apps/ntfy: Enable auto-sync This way, merging PRs from updatebot will automatically trigger updating Paperless-ngx et al.	2024-09-24 07:16:34 -05:00
Dustin	a149bc8761	updatebot: Manage Authelia	2024-09-24 07:15:41 -05:00
Dustin	76588c3e20	updatebot: Manage Mosquitto	2024-09-24 07:08:56 -05:00
Dustin	bdc24e1778	updatebot: Manage ntfy	2024-09-24 07:05:37 -05:00
Dustin	982cd88255	Merge remote-tracking branch 'refs/remotes/origin/master'	2024-09-22 12:13:58 -05:00
Dustin	ffa47b9fba	v-m: Scrape ntfy _ntfy_ has supported Prometheus metrics for a while now, so let's collect them.	2024-09-22 12:13:01 -05:00
Dustin	9ec6b651c1	v-m: Scrape wal-g via statsd_exporter The database server now runs _statsd_exporter_, which receives metrics from WAL-G whenever it saves WAL segments or creates backups.	2024-09-22 12:11:59 -05:00
Dustin	c83ceee994	v-m: Quit scraping Jenkins with blackbox_exporter I was doing this to monitor Jenkins's certificate, but since that's managed by _cert-manager_, there's practically no risk of it expiring without warning anymore. Since Jenkins is already being scraped directly, having this extra check just generates extra notifications when there is an issue without adding any real value.	2024-09-22 12:10:03 -05:00
Dustin	3f39747557	v-m: Redo Internet/DNS connectivity checks (again) Using domain names in the "blackbox" probe makes it difficult to tell the difference between a complete Internet outage and DNS issues. I switched to using these names when I changed how the firewall routed traffic to the public DNS servers, since those were the IP addresses I was using to determine if the Internet was "up." I think it makes sense, though, to just ping the upstream gateway for that check. If EverFast changes their routing or numbering, we'll just have to update our checks to match.	2024-09-22 12:06:03 -05:00
Dustin	8f354a4460	v-m/alertmanager: Suppress battery low alerts The alerts for Z-Wave device batteries in particular are pretty annoying, as they tend to "flap" for some reason. I like having the alerts show up on Alertmanager/Grafana dashboards, but I don't necessarily need notifications about them. Fortunately, we can create a special "none" receiver and route notifications there, which does exactly what we want here.	2024-09-22 12:01:02 -05:00
Dustin	1c6286a977	ntfy: Migrate to Kustomize Using Kustomize, we can define the configuration file separately from the Kubernetes resources, and use `configMapGenerator` to generate the ConfigMap for it. Additionally, this will make it possible to update _ntfy_ using `updatebot`.	2024-09-22 12:00:28 -05:00
Dustin	a6683c9123	invoice-ninja: Move under pyrocufflink.net Tabitha wants to be able to accept Apple Pay payments via Stripe, but this requires an additional "domain verification" step. Apple needs to make an HTTP request to the domain owned by the vendor, which in the case of Invoice Ninja, must be the "app URL." Unfortunately, there does not appear to be a way to tell Apple/Stripe/IN to use the client portal domain or any other domain besides the app URL. Therefore, we need to expose Invoice Ninja to the Internet under the public _pyrocufflink.net_ domain, rather than the internal _pyrocufflink.blue_.	2024-09-22 11:55:10 -05:00
Dustin	f5b79cfdf8	updatebot: Schedule updates on Saturday morning Let's run `updatebot` on Saturday morning, so I can apply the changes over the weekend if I have time. If I don't, there's no harm in having the PRs open for a few days until I can get to it during the week.	2024-09-22 11:53:52 -05:00
Dustin	4cab489534	Merge pull request 'home-assistant: Update to 2024.9.2' (#24) from updatebot/home-assistant into master Reviewed-on: #24	2024-09-22 15:48:47 +00:00
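
Commit 3f9601dc94 above argues that a raw counter such as `flower_events_total` cannot show a _current_ problem, and that applying `increase` and then `max_over_time` can. The sketch below models that reasoning in Python on a list of evenly spaced samples. It is a simplified illustration, not the PromQL implementation: the function bodies, the window sizes, and the sample data are all assumptions made for this example, and real `increase` also extrapolates to the window boundaries.

```python
from typing import List


def increase(samples: List[float], window: int) -> List[float]:
    """Growth of a counter over the trailing `window` steps, per sample.

    Simplified model (assumption): no extrapolation, and a drop in the
    counter is treated as a reset to zero followed by growth to the new value.
    """
    out = []
    for i in range(len(samples)):
        start = max(0, i - window)
        total = 0.0
        for j in range(start + 1, i + 1):
            delta = samples[j] - samples[j - 1]
            total += delta if delta >= 0 else samples[j]
        out.append(total)
    return out


def max_over_time(values: List[float], window: int) -> List[float]:
    """Maximum of the current value and the `window` values before it."""
    return [max(values[max(0, i - window): i + 1]) for i in range(len(values))]


# Hypothetical counter of failed task events: one new failure at step 3.
counter = [3, 3, 3, 4, 4, 4, 4, 4, 4, 4]

raw_alert = [v > 0 for v in counter]    # always true: old failures never clear
recent = increase(counter, 2)           # nonzero only while the failure is recent
held = max_over_time(recent, 3)         # keeps the alert active a little longer

print(raw_alert)
print(recent)
print(held)
```

With this sample data, the raw comparison stays true at every step, `recent` is nonzero only for the two steps after the failure, and `held` extends that by the `max_over_time` window so the alert does not reset immediately.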


547 Commits (9b1a5ef14f58011d4b8164d48d247af311cd0806)
