538 Commits (6919031b78412bb3927e0f33400439985f0398eb)

Author SHA1 Message Date
Dustin a5d186b461 sshca: Add update-machine-ids script
The `update-machine-ids.sh` shell script helps update the `sshca-data`
SealedSecret with the current contents of the `machine-ids.json` file
(stored locally, not tracked in Git).
2024-01-25 20:42:47 -06:00
Dustin 8ae8bad112 v-m: Scrape serial1.p.b 2024-01-25 20:42:07 -06:00
Dustin 7eae328a2c sshca: Add machine ID for serial1.p.b 2024-01-25 20:41:54 -06:00
Dustin 9fff21aae1 h-a: Remove roomba_is_downstairs template sensor
This sensor is now provided by a [Threshold][0] helper.

[0]: https://www.home-assistant.io/integrations/threshold/
2024-01-25 17:31:36 -06:00
Dustin 8bb8ed4402 xactfetch: Additional mounts for rbw sync
In order to sync the Bitwarden vault, `rbw` needs its configuration file
in `/etc/rbw` and access to writable ephemeral storage at `/tmp`.
2024-01-24 12:00:13 -06:00
Dustin ad37948fe2 v-m: Scrape all metrics components
We are now getting metrics from *vmstorage*, *vminsert*, *vmselect*,
*vmalert*, *alertmanager*, and *blackbox-exporter*, in addition to
*vmagent*.
2024-01-23 11:51:50 -06:00
Dustin bcb588407d v-m: Correct vmalert remote read/write URLs
*vmalert* has been generating alerts and triggering notifications, but
not writing any `ALERTS`/`ALERTS_FOR_STATE` metrics.  It turns out this
is because I had not correctly configured the remote read/write
URLs.
2024-01-23 10:45:40 -06:00
Dustin 9a76a548ec argocd/app: jenkins: Enable auto sync
We're going to try out automatically synchronizing the Jenkins resources
when changes are pushed to Git.
2024-01-22 18:50:41 -06:00
Dustin 119a8a74ae v-m: alerts: Enhance Frigate unavailable alert
If Frigate is running but not connected to the MQTT broker, the
`sensor.frigate_status` entity will be available, but the
`update.frigate_server` entity will not.
2024-01-22 18:27:30 -06:00
Dustin 20ef2a287b jenkins: Update to 2.426.2 2024-01-22 18:01:03 -06:00
Dustin fb9ac66ad3 Merge remote-tracking branch 'refs/remotes/origin/master' 2024-01-22 17:55:53 -06:00
Dustin 0e20952740 xactfetch: Sync vault before running
The Bitwarden vault needs to be synced before *xactfetch* runs, in case
the password for a bank website has changed since it was first fetched.
2024-01-22 17:52:35 -06:00
Dustin 2f9d8ad618 jenkins: Add CA key to ssh_known_hosts
Since (almost) all managed hosts have SSH certificates signed by SSHCA
now, the need to maintain a pseudo-dynamic SSH key list is winding down.
If we include the SSH CA key in the global known hosts file, and
explicitly list the couple of hosts that do not have a certificate, we
can let Ansible use that instead of fetching the host keys on each run.
2024-01-22 17:52:35 -06:00
Dustin 3d55d7aafa keyserv: Add age key for NUT/dustin
This key is used to encrypt the password for the NUT user *dustin*,
which I use to manually control the UPS.
2024-01-22 17:52:35 -06:00
Dustin a7450a8af2 kitchen: Fix Jenkins deployment role
Since Jenkins jobs run in Kubernetes now, they can authenticate to the
Kubernetes API using a ServiceAccount and do not need a dedicated
User.
2024-01-22 17:00:50 -06:00
Dustin 990204b2cf kitchen: Use Certifi TLS CA bundle for OpenSSL
The MQTT client needs a trusted root CA bundle, which is not available
in the container image used by the *kitchen* server (it's based on
*pythonctnr* which literally *only* includes Python).  Fortunately, as
it uses OpenSSL under the hood, we can configure it to use the bundle
included with the *certifi* Python package via an environment variable.
2024-01-22 16:57:38 -06:00
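
A minimal sketch of the environment variable approach from 990204b2cf. The commit does not name the variable; `SSL_CERT_FILE` is assumed here because OpenSSL reads it to locate its default CA file. The bundle path is hypothetical and depends on where *certifi* is installed in the image.

```yaml
# Fragment of the kitchen container spec (illustrative only).
env:
  # Assumption: OpenSSL's default CA file is overridden via SSL_CERT_FILE.
  - name: SSL_CERT_FILE
    # Hypothetical path; certifi ships its bundle as certifi/cacert.pem
    # inside the Python site-packages directory.
    value: /usr/lib/python3.12/site-packages/certifi/cacert.pem
```

The actual location can be printed from inside the container with `python -m certifi`.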
Dustin 9b441738d4 dch-webhooks: Disable HTTPS redirect
The [Generic Event][0] plugin for Jenkins does not support HTTPS
webhooks, only plain HTTP.

[0]: https://plugins.jenkins.io/generic-event/
2024-01-22 16:55:03 -06:00
Dustin 54e7a25f93 v-m: vmstorage: Remove startup/ready probes
Kubernetes will not start additional Pods in a StatefulSet until the
existing ones are Ready.  This means that if there is a problem bringing
up, e.g. `vmstorage-0`, it will never start `vmstorage-1` or
`vmstorage-2`.  Since this pretty much defeats the purpose of having a
multi-node `vmstorage` cluster, we have to remove the readiness probe,
so the Pods will be Ready as soon as they start.  If there is a problem
with one of them, it will matter less, as the others can still run.
2024-01-22 16:43:46 -06:00
Dustin ca02dfec62 v-m: Add host labels to collectd-virt metrics
The *virt* plugin for *collectd* sets `instance` to the name of the
libvirt domain the metric refers to.  This makes it so there is no label
identifying which host the VM is running on.  Thus, if we want to
classify metrics by VM host, we need to add that label explicitly.

Since the `__address__` label is not available during metric relabeling,
we need to store it in a temporary label, which gets dropped at the end
of the relabeling phase.  We copy the value of that label into a new
label, but only for metrics that match the desired metric name.
2024-01-22 11:12:19 -06:00
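
The two-stage relabeling described in ca02dfec62 could look roughly like the sketch below. The job name, target address, the `tmp_host` and `host` label names, and the `collectd_virt_` metric prefix are all assumptions for illustration, not values taken from the repository.

```yaml
scrape_configs:
  - job_name: collectd                  # hypothetical job name
    static_configs:
      - targets:
          - vmhost0.example.org:9103    # hypothetical VM host running collectd
    relabel_configs:
      # Target relabeling: __address__ is still available here, so stash the
      # host part in an ordinary label that survives into metric relabeling.
      - source_labels: [__address__]
        regex: '([^:]+)(:\d+)?'
        target_label: tmp_host
        replacement: '$1'
    metric_relabel_configs:
      # Copy the stashed value into the final label, but only for metrics
      # whose name matches the virt plugin prefix (assumed).
      - source_labels: [__name__, tmp_host]
        regex: 'collectd_virt_.*;(.+)'
        target_label: host
        replacement: '$1'
      # Drop the temporary label from every metric.
      - regex: tmp_host
        action: labeldrop
```

The temporary label deliberately avoids a `__` prefix, since labels starting with `__` are removed once target relabeling finishes and would not reach the metric relabeling stage.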
Dustin 832dea2c7d h-a: Add init container to wait for PostgreSQL
When Home Assistant starts, if PostgreSQL is unavailable, it will come
up successfully, but without the history component.  It never tries
again to connect and enable the component.  This makes it difficult to
detect the problem and thus easy to overlook the missing functionality.
To avoid having extended periods of missing state history, we'll force
Home Assistant to wait for PostgreSQL to come up before starting.
2024-01-21 19:50:54 -06:00
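
One way to implement the wait described in 832dea2c7d is an init container that polls with `pg_isready`. This is a sketch under assumptions: the image, the container name, and the PostgreSQL host name are hypothetical, and the real manifest may use a different check.

```yaml
# Fragment of the Home Assistant pod template (illustrative only).
initContainers:
  - name: wait-for-postgresql           # hypothetical name
    image: docker.io/library/postgres:16  # assumption: any image that ships pg_isready
    command:
      - sh
      - -c
      - |
        # Loop until the server accepts connections; the main container
        # does not start until this init container exits successfully.
        until pg_isready -h "$PGHOST" -p "$PGPORT"; do
          echo 'waiting for PostgreSQL'
          sleep 2
        done
    env:
      - name: PGHOST
        value: postgresql.example.svc   # hypothetical service name
      - name: PGPORT
        value: '5432'
```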
Dustin 50beecf0a9 h-a: Increase startup probe failure threshold
Home Assistant can sometimes take an unexpectedly long time to start up,
but it eventually does.
2024-01-21 19:32:35 -06:00
Dustin cb39b5a547 h-a: Update mobile apps notification group
Updating the notification group for the family's new mobile devices.
2024-01-21 19:30:50 -06:00
Dustin 534c4bfca0 keyserv: Deploy keyserv
`keyserv` is a little utility I wrote to dispense *age* keys to clients.
It uses SSH certificates for authentication.  If the client presents an
SSH certificate signed by a trusted key, the server will return all the
keys the principal(s) listed in the certificate are allowed to use.  The
response is encrypted with the public key from the certificate, so the
client must have access to the corresponding private key in order to
read the response.

I am currently using this server to provide keys for the new
configuration policy.  The keys herein are used to encrypt NUT monitor
passwords.
2024-01-19 22:08:25 -06:00
Dustin 897923a172 authelia: Bypass Authelia for Paperless-ngx API
The [Paperless Mobile][0] app for Android uses the Paperless-ngx API.

[0]: https://github.com/astubenbord/paperless-mobile/
2024-01-19 13:42:03 -06:00
Dustin 5f24ca0ad2 Merge branch 'rosalina/master' 2024-01-15 19:19:43 -06:00
Dustin 51775ede81 v-m/vmagent: Scrape nut0
*nut0.pyrocufflink.blue* is the new UPS monitor server.  It runs Fedora
CoreOS, with NUT in a container.
2024-01-15 18:46:46 -06:00
Dustin 90b293d5c8 v-m/vmagent: Scrape k8s-amd64-n3 2024-01-15 18:45:52 -06:00
Dustin 278be05121 v-m/blackbox: Switch to upstream container image
I found the official container image for Prometheus Blackbox exporter.
It is hosted on Quay, which is why I didn't see it on Docker Hub when I
looked initially.
2024-01-15 18:45:25 -06:00
Dustin 539e25d9bd v-m/vmagent: Scrape public clouds to test Internet
Scraping the public DNS servers doesn't work anymore since the firewall
routes traffic through Mullvad.  Pinging public cloud providers should
give a pretty decent indication of Internet connectivity.  It will also
serve as a benchmark for the local DNS performance, since the names will
have to be resolved.
2024-01-15 18:44:46 -06:00
Dustin 6496e76079 autoscaler: Update to CA 1.26
Cluster Autoscaler version is supposed to match the Kubernetes version.
Also, updating specifically to address ASG tags for node resources
([issue 5164]).

[issue 5164]: https://github.com/kubernetes/autoscaler/issues/5164
2024-01-14 11:33:30 -06:00
Dustin 89516ebf55 sshca: Add machine ID for nut0 2024-01-13 09:51:13 -06:00
Dustin 4cec66fc13 sshca: Add machine IDs for nvr1, k8s-aarch64-n1 2024-01-07 21:16:37 -06:00
Dustin fbf2a6864f cert-manager: cert-exporter: Static SSH host keys
The *cert-exporter* script really only needs the SSH host key for Gitea,
so the dynamic host key fetch is overkill.  Since it frequently breaks
for various reasons, it's probably better to just have a static list of
trusted keys.
2024-01-04 15:35:00 -06:00
Dustin 98cdcdfe30 v-m/scrape: Stable instance label for Longhorn
By default, the `instance` label for discovered metrics targets is set
to the scrape address.  For Kubernetes pods, that is the IP address and
port of the pod, which naturally changes every time the pod is recreated
or moved.  This will cause a high churn rate for Longhorn manager pods.
To avoid this, we set the `instance` label to the name of the node the
pod is running on, which will not change because the Longhorn manager
pods are managed by a DaemonSet.
2024-01-04 09:16:20 -06:00
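
The stable `instance` label from 98cdcdfe30 can be sketched as a relabeling rule on pod service discovery. The job name, the `longhorn-system` namespace, and the `app=longhorn-manager` pod label are assumptions based on Longhorn's defaults, not values confirmed by the commit.

```yaml
scrape_configs:
  - job_name: longhorn                  # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [longhorn-system]      # assumption: default Longhorn namespace
    relabel_configs:
      # Keep only the Longhorn manager pods (label value assumed).
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: longhorn-manager
        action: keep
      # Replace the default instance (pod IP and port) with the node name,
      # which stays the same when the DaemonSet recreates the pod.
      - source_labels: [__meta_kubernetes_pod_node_name]
        target_label: instance
```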
Dustin ce3bc87f9e authelia: Reduce consent durations
After considering the implications of Authelia's pre-configured consent
feature, I decided I did not like the fact that a malicious program
could potentially take over my entire Kubernetes cluster without my
knowledge, since `kubectl` may not require any interaction and could
therefore be executed silently.  I stopped ticking the
"Remember Consent" checkbox out of paranoia, but that's gotten kind of
annoying.  I figure a good compromise is to only prompt for consent a
couple of times per day.
2024-01-04 09:08:07 -06:00
Dustin ced5a7b4a1 websites: Host darkchestofwonders.us in k8s
The *darkchestofwonders.us* website is a legacy Python/mod_wsgi
application.  It was down for a while after updating the main web server
to Fedora 38.  Although we don't upload as many screenshots anymore, we
do still enjoy looking at the old ones.  Until I get a chance to either
update the site to use a more modern deployment mechanism, or move the
screenshots to some other photo hosting system, the easiest way to keep
it online is to run it in a container.
2024-01-04 08:56:12 -06:00
Dustin 0d68b25e5f rent-reminder: Add CronJob to send reminders
This CronJob sends scheduled rent reminders to Brandon.
2024-01-04 08:54:54 -06:00
Dustin bac7de72f2 v-m: Scrape Longhorn manager metrics
Each Longhorn manager pod exports metrics about the node on which it is
running.  Thus, we have to scrape every pod to get the metrics about the
whole ecosystem.
2024-01-02 11:27:31 -06:00
Dustin 225fd8469c v-m/vmagent: Allow listing all pods in cluster
The original RBAC configuration allowed `vmagent` only to list the pods
in the `victoria-metrics` namespace.  In order to allow it to monitor
other applications' pods, it needs to be assigned permission to list
pods in all namespaces.
2024-01-02 11:25:54 -06:00
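
Cluster-wide pod listing, as described in 225fd8469c, generally means replacing a namespaced Role with a ClusterRole and ClusterRoleBinding. A minimal sketch follows; the resource names and the ServiceAccount name are assumptions, and only the `victoria-metrics` namespace comes from the commit message.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vmagent                         # hypothetical name
rules:
  # Pod discovery needs to read pods in every namespace.
  - apiGroups: ['']
    resources: [pods]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vmagent                         # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: vmagent
subjects:
  - kind: ServiceAccount
    name: vmagent                       # assumption: ServiceAccount used by vmagent
    namespace: victoria-metrics
```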
Dustin 8f088fb6ae v-m: Deploy (clustered) Victoria Metrics
Since *mtrcs0.pyrocufflink.blue* (the Metrics Pi) seems to be dying,
I decided to move monitoring and alerting into Kubernetes.

I was originally planning to have a single, dedicated virtual machine
for Victoria Metrics and Grafana, similar to how the Metrics Pi was set
up, but running Fedora CoreOS instead of a custom Buildroot-based OS.
While I was working on the Ignition configuration for the VM, it
occurred to me that monitoring would be interrupted frequently, since
FCOS updates weekly and all updates require a reboot.  I would rather
not have that many gaps in the data.  Ultimately I decided that
deploying a cluster with Kubernetes would probably be more robust and
reliable, as updates can be performed without any downtime at all.

I chose not to use the Victoria Metrics Operator, but rather handle
the resource definitions myself.  Victoria Metrics components are not
particularly difficult to deploy, so the overhead of running the
operator and using its custom resources would not be worth the minor
convenience it provides.
2024-01-01 17:48:10 -06:00
Dustin 8c605d0f9f home-assistant: Clean up restart_diddy_mopidy
Moving the shell command to an external script allows me to update it
without having to restart Home Assistant.

Including the SSH private key in the Secret not only allows it to be
managed by Kubernetes, but also works around a permissions issue when
storing the key in the `/config` volume.  The `ssh` command refuses to
use a key file with write permission for the group or other fields, but
the Kubelet sets `g=rw` when `fsGroup` is set on the pod.
2023-12-28 17:34:25 -06:00
Dustin b9d48d0df8 home-assistant: Add (back) event-snapshot.sh
When transitioning to the ConfigMap for maintaining Home Assistant YAML
configuration, I did not bring the `event-snapshot.sh` script because I
thought it was no longer in use.  It turns out I was mistaken; it is
used by the driveway camera alerts.
2023-12-28 17:09:01 -06:00
Dustin ad65a12b66 jenkins: Allow Jenkins to read pod logs
Jenkins needs permission to read pod logs so it can display output from
the JNLP agent if it crashes.
2023-12-27 15:33:36 -06:00
Dustin 4c6962fbc8 fuse-device-plugin: Run on Raspberry Pi nodes
The FUSE device plugin needs to run on the Raspberry Pi nodes in order
to build aarch64 container images in Jenkins.
2023-12-27 15:32:28 -06:00
Dustin e56526600d home-assistant: Manage YAML files with ConfigMap
Editing `configuration.yaml` et al. using `vi` via `kubectl exec` is
rather tedious, since the version of `vi` in the *home-assistant*
container image is very rudimentary.  Thus, I think it would be better
to use a ConfigMap to store the manually-edited YAML files, so I can
edit them with my regular editor on my desktop.  For this to work, the
ConfigMap has to be mounted as a directory rather than as individual
files (using `subPath`), as otherwise the pod would have to be restarted
every time one of the files is updated.
2023-12-27 15:31:30 -06:00
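
The directory-style mount from e56526600d can be sketched as follows. The ConfigMap name, volume name, and mount path are hypothetical. The important part is the absence of `subPath`: Kubernetes does not propagate ConfigMap updates to `subPath` mounts, while a whole-directory mount is refreshed in place.

```yaml
# Fragment of the Home Assistant pod spec (illustrative only).
containers:
  - name: home-assistant
    volumeMounts:
      # Mount the whole ConfigMap as a directory, with no subPath,
      # so edits show up in the running pod without a restart.
      - name: config-yaml               # hypothetical volume name
        mountPath: /config/yaml         # hypothetical mount path
volumes:
  - name: config-yaml
    configMap:
      name: home-assistant-yaml         # hypothetical ConfigMap name
```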
Dustin 8d796a7c01 authelia: Fix argocd-cli OIDC client
The `argocd` CLI needs the audience claim in OIDC identity tokens to be
`argocd-cli` or it will refuse to use the token.
2023-12-27 15:30:31 -06:00
Dustin 12773c7fd2 authelia: Restrict access to paperless-ngx
Since all Paperless-ngx users see the same content, we should restrict
who can log in.
2023-12-27 15:29:46 -06:00
Dustin 39d19cb3ea authelia: Restrict access to firefly
Since we've configured the Ingress for Firefly III to log everyone in as
*dustin* via a faked `Remote-User` request header, any user on the
Pyrocufflink domain would be able to see my finances.  Using Authelia's
access control mechanism, we can restrict this to only users in a
specific group.
2023-12-27 15:27:44 -06:00
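
A group-based restriction like the one in 39d19cb3ea could be expressed with an Authelia access control rule along these lines. The domain, group name, and policy levels are hypothetical; the commit only says that access is limited to users in a specific group.

```yaml
# Fragment of the Authelia configuration (illustrative only).
access_control:
  default_policy: deny
  rules:
    # Only members of the named group may reach Firefly III.
    - domain: firefly.example.org       # hypothetical domain
      subject: 'group:finance'          # hypothetical group name
      policy: one_factor                # assumption: required authentication level
```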
Dustin 9561c687aa xactfetch: Run xactfetch in a CronJob
I finally got *xactfetch* cleaned up enough to run in a headless
container.
2023-12-27 11:08:25 -06:00
Dustin a235fbd5ac firefly-iii: Use a single Data Importer instance
Tabitha has decided not to use Firefly to manage her finances.  We've
mostly consolidated our expenses and income now, which I manage in my
Firefly account.  In fact, the Ingress for Firefly III itself always
sets the `Remote-User: dustin` header, so only my account is accessible
anyway.  Thus, there is no longer any reason to have two Data Importer
instances.
2023-12-10 08:55:20 -06:00