Fedora CoreOS fills `/boot` beyond the 75% alert threshold under normal circumstances on aarch64 machines. This is not a problem, because it cleans up old files on its own, so we do not need to alert on it. Unfortunately, the _DiskUsage_ alert is already quite complex, and adding in exclusions for these devices would make it even worse. To simplify the logic, we can use a recording rule to precomupte the used/free space ratio. By using `sum(...) without (type)` instead of `sum(...) on (df, instance)`, we keep the other labels, which we can then use to identify the metrics coming from machines we don't care to monitor. Instead of having different thresholds for different volumes encoded in the same expression, we can use multiple alerts to alert on "low" vs "very low" thresholds. Since this will of course cause duplicate alerts for most volumes, we can use AlertManager inhibition rules to disable the "low" alert once the metric crosses the "very low" threshold. |
||
---|---|---|
argocd | ||
authelia | ||
autoscaler | ||
cert-manager | ||
collectd | ||
dch-root-ca | ||
dch-webhooks | ||
device-plugins | ||
docker-distribution | ||
dynk8s-provisioner | ||
firefly-iii | ||
fleetlock | ||
grafana | ||
home-assistant | ||
hudctrl | ||
ingress | ||
invoice-ninja | ||
jenkins | ||
keyserv | ||
kitchen | ||
loki-ca | ||
metrics | ||
ntfy | ||
paperless-ngx | ||
photoframesvc | ||
phpipam | ||
postgresql | ||
prometheus_speedtest | ||
promtail | ||
rabbitmq | ||
rent-reminder | ||
restic-exporter | ||
scanservjs | ||
sealed-secrets | ||
setup | ||
sshca | ||
step-ca | ||
storage | ||
updatebot | ||
victoria-metrics | ||
websites | ||
xactfetch | ||
xactmon | ||
README.md |
README.md
Dustin's Kubernetes Cluster
This repository contains resources for deploying and managing my on-premises Kubernetes cluster
Cluster Setup
The cluster primarily consists of libvirt/QEMU+KVM virtual machines. The Control Plane nodes are VMs, as are the x86_64 worker nodes. Eventually, I would like to add Raspberry Pi or Pine64 machines as aarch64 nodes.
All machines run Fedora, using only Fedora builds of the Kubernetes components
(kubeadm
, kubectl
, and kubeadm
).
See Cluster Setup for details.
Jenkins Agents
One of the main use cases for the Kubernetes cluster is to provide dynamic agents for Jenkins. Using the Kubernetes Plugin, Jenkins will automatically launch worker nodes as Kubernetes pods.
See Jenkins Kubernetes Integration for details.
Persistent Storage
Persistent storage for pods is provided by Longhorn. Longhorn runs within the cluster and provisions storage on worker nodes to make available to pods over iSCSI.
See Persistent Storage Using Longorn for details.