v-m: Deploy (clustered) Victoria Metrics

Since *mtrcs0.pyrocufflink.blue* (the Metrics Pi) seems to be dying,
I decided to move monitoring and alerting into Kubernetes.

I was originally planning to have a single, dedicated virtual machine
for Victoria Metrics and Grafana, similar to how the Metrics Pi was set
up, but running Fedora CoreOS instead of a custom Buildroot-based OS.
While I was working on the Ignition configuration for the VM, it
occurred to me that monitoring would be interrupted frequently, since
FCOS updates weekly and all updates require a reboot.  I would rather
not have that many gaps in the data.  Ultimately I decided that
deploying a cluster with Kubernetes would probably be more robust and
reliable, as updates can be performed without any downtime at all.

I chose not to use the Victoria Metrics Operator, but rather handle
the resource definitions myself.  Victoria Metrics components are not
particularly difficult to deploy, so the overhead of running the
operator and using its custom resources would not be worth the minor
convenience it provides.
This commit is contained in:
2024-01-01 15:23:14 -06:00
parent 8c605d0f9f
commit 8f088fb6ae
17 changed files with 1474 additions and 0 deletions

View File

@@ -0,0 +1,185 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
rules:
- apiGroups:
- ''
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ''
resources:
- nodes/proxy
verbs:
- get
- nonResourceURLs:
- /metrics
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: vmagent
subjects:
- kind: ServiceAccount
name: vmagent
namespace: victoria-metrics
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
rules:
- apiGroups:
- ''
resources:
- pods
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: vmagent
subjects:
- kind: ServiceAccount
name: vmagent
---
apiVersion: v1
kind: Service
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
spec:
ports:
- port: 8429
name: vmagent
selector:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: vmagent
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
spec:
serviceName: vmagent
selector:
matchLabels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
template:
metadata:
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
spec:
containers:
- name: vmagent
image: docker.io/victoriametrics/vmagent:v1.96.0
args:
- -envflag.enable=true
- -envflag.prefix=vmagent_
- -remoteWrite.tmpDataPath=/data
- -httpListenAddr=0.0.0.0:8429
- -promscrape.config=/config/scrape.yml
- -promscrape.configCheckInterval=30s
env:
- name: vmagent_remoteWrite_url
value: http://vminsert:8480/insert/1/prometheus/api/v1/write
ports:
- containerPort: 8429
name: http
readinessProbe: &probe
httpGet:
port: http
path: /health
periodSeconds: 60
startupProbe:
<<: *probe
periodSeconds: 1
successThreshold: 1
failureThreshold: 30
timeoutSeconds: 1
securityContext:
runAsNonRoot: true
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /config
name: config
readOnly: true
- mountPath: /data
name: tmpdata
subPath: data
serviceAccountName: vmagent
securityContext:
fsGroup: 2093
runAsGroup: 2093
runAsNonRoot: true
runAsUser: 2093
volumes:
- name: config
configMap:
name: vmagent
volumeClaimTemplates:
- apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: tmpdata
labels:
app.kubernetes.io/name: vmagent
app.kubernetes.io/component: vmagent
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 4G