Compare commits

8 Commits

Author SHA1 Message Date
bot 07c48d61d7 tika: Update to 3.0.0.0 2024-11-02 11:32:17 +00:00
bot f5fe98a397 paperless-ngx: Update to 2.13.2 2024-11-02 11:32:17 +00:00
Dustin 4cef41688f v-m/alerts: Add Zigbee+ZWave network alerts 2024-11-01 18:14:56 -05:00
Dustin 6cf11f9f61 v-m: Scrape HAProxy 2024-11-01 18:14:37 -05:00
Dustin 7a768cbb76 v-m: Update jobs for new Loki server
*loki1.pyrocufflink.blue* is a regular Fedora machine, a member of the
AD domain, and managed by Ansible.  Thus, it does not need to be
explicitly listed as a scrape target.

For scraping metrics from Loki itself, I've changed the job to use
DNS-SD because it seems like `vmagent` does _not_ re-resolve host names
from static configuration.
2024-11-01 18:07:34 -05:00
Dustin 0101040634 v-m/alerts: Add Paperless-ngx email task alert
This alert should fire if the background task that fetches e-mail messages
and imports them into Paperless-ngx has not run for a while.
2024-11-01 18:04:06 -05:00
Dustin 3f9601dc94 v-m/alerts: Improve Paperless-ngx Celery task alert
The `flower_events_total` metric is a counter, so its value only ever
increases (discounting restarts of the server process).  As such,
nonzero values do not necessarily indicate a _current_ problem, but
rather that there was one at some point in the past.  To identify
current issues, we need to use the `increase` function, and then apply
the `max_over_time` function so that the alert doesn't immediately reset
itself.
2024-11-01 18:00:50 -05:00
Dustin d12e66f58a v-m: Scrape Frigate exporter 2024-11-01 17:47:51 -05:00
3 changed files with 87 additions and 10 deletions
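
The reasoning in the commit message for 3f9601dc94 (a raw counter stays nonzero forever, `increase` sees only recent growth, `max_over_time` holds the alert open) can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not how VictoriaMetrics evaluates the rule: the hourly samples are invented, and both helper functions are simplified stand-ins that ignore counter resets and extrapolation.

```python
from typing import List


def increase(counter: List[float], i: int, window: int) -> float:
    """Growth of the counter over the `window` samples ending at index i.

    Simplification: evenly spaced samples, no counter resets, no extrapolation.
    """
    return counter[i] - counter[max(0, i - window)]


def max_over_time(values: List[float], i: int, lookback: int) -> float:
    """Largest value among the last `lookback` samples ending at index i."""
    return max(values[max(0, i - lookback + 1): i + 1])


# Hypothetical hourly samples: a single task failure is recorded at hour 3.
counter = [0, 0, 0, 1] + [1] * 44

# Old rule: the counter never goes back down, so the alert never clears.
raw_alert = [v > 0 for v in counter]

# increase() alone: true only in the hour where the counter actually grew.
hourly_increase = [increase(counter, i, 1) for i in range(len(counter))]
increase_alert = [v > 0 for v in hourly_increase]

# max_over_time(...[24h]): keeps the alert open for 24 samples, then clears.
held_alert = [max_over_time(hourly_increase, i, 24) > 0 for i in range(len(counter))]

print("raw counter > 0 fires for", sum(raw_alert), "hours")
print("increase > 0 fires for", sum(increase_alert), "hours")
print("max_over_time(increase) > 0 fires for", sum(held_alert), "hours")
```

In this toy model the raw comparison stays true until the end of the data, the bare `increase` clears after one sample, and the combined form stays active for exactly the 24-sample lookback.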

View File

@@ -45,7 +45,7 @@ patches:
 images:
 - name: ghcr.io/paperless-ngx/paperless-ngx
-  newTag: 2.13.0
+  newTag: 2.13.2
 - name: docker.io/gotenberg/gotenberg
   newTag: 8.12.0
 - name: docker.io/apache/tika

View File

@@ -68,18 +68,48 @@ groups:
   rules:
   - alert: Frigate is Unavailable
     expr:
-      homeassistant_entity_available{entity=~".*frigate_(server|status)"} != 1
+      absent(frigate_service_info)
+      or irate(frigate_service_last_updated_timestamp) < 1
+      or irate(frigate_service_uptime_seconds) < 1
     for: 10m
   - alert: Camera unavailable
     expr:
       homeassistant_entity_available{domain="camera"} != 1
     for: 10m
-- name: Sensors
+- name: Home Assistant
   rules:
   - alert: Battery Low
     expr:
       homeassistant_sensor_battery_percent{entity!~"sensor\\.(pixel_|sm_p610).*"} < 10
+    annotations:
+      summary: >-
+        Low battery: {{ $labels.friendly_name }}
+      severity: minor
+  - alert: Z-Wave Network is Offline
+    expr:
+      sum(
+        homeassistant_entity_available{entity="sensor.usb_controller_status"}
+      ) without (
+        friendly_name
+      ) < 1
+    annotations:
+      summary: The Z-Wave network controller is offline
+      description: >-
+        Home Assistant is not able to communicate with ZWaveJS, or ZWaveJS is
+        not able to connect to the Z-Wave USB controller. Z-Wave devices like
+        light switches, door sensors, and smart plugs will not work until the
+        Z-Wave network is operational again.
+  - alert: Zigbee Network is Offline
+    expr:
+      homeassistant_binary_sensor_state{entity="binary_sensor.zigbee2mqtt_bridge_connection_state"} == 0
+    annotations:
+      summary: The Zigbee network bridge is offline
+      description: >-
+        Home Assistant is not able to communicate with Zigbee2MQTT, or
+        Zigbee2MQTT is not able to connect to the Zigbee USB controller.
+        Zigbee devices like smart bulbs and buttons will not work until the
+        Zigbee network is operational again.
 - name: PostgreSQL
   rules:
@@ -170,10 +200,28 @@ groups:
   rules:
   - alert: Celery tasks failed
     expr: >-
-      flower_events_total{job="paperless-ngx", type="task-failed"} > 0
+      max_over_time(
+        increase(
+          flower_events_total{job="paperless-ngx", type="task-failed"}
+        )[24h]
+      ) > 0
     annotations:
-      summary: One or more Celery tasks have failed
+      summary: Paperless-ngx Celery task failed
       description: >-
         Failing Celery tasks may indicate a problem with the Paperless-ngx
         deployment and can result in data loss. Check the Paperless-ngx logs
         for details about the task failures.
+  - alert: Paperless email task not running
+    expr: >-
+      absent(
+        flower_events_total{
+          type="task-started",
+          task="paperless_mail.tasks.process_mail_accounts"
+        }
+      )
+    annotations:
+      summary: Paperless task to process mail accounts has not run recently
+      description: >-
+        Paperless-ngx uses a scheduled Celery task to periodically poll email
+        mailboxes for new messages. If this task does not start, new email
+        messages will not be downloaded and imported into the document library.
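
The "Paperless email task not running" rule relies on `absent()`, which yields a result only when no series matches the selector. The sketch below models that with plain dictionaries to show when such a rule would and would not fire. It is an illustration under assumptions, not a description of Flower itself: the label sets are invented, `some.other.task` is a hypothetical task name, and it assumes the exporter creates a counter series only after it has seen the first matching event.

```python
from typing import Dict, List

Labels = Dict[str, str]


def absent(series: List[Labels], matchers: Labels) -> bool:
    """True when no series carries all of the given label values.

    Simplification: equality matchers only, no staleness handling.
    """
    return not any(
        all(labels.get(name) == value for name, value in matchers.items())
        for labels in series
    )


MATCHERS = {
    "__name__": "flower_events_total",
    "type": "task-started",
    "task": "paperless_mail.tasks.process_mail_accounts",
}

# Hypothetical state right after a restart: only another task has been seen.
after_restart = [
    {"__name__": "flower_events_total", "type": "task-started", "task": "some.other.task"},
]

# Hypothetical state once the mail task has started at least once.
after_first_run = after_restart + [dict(MATCHERS)]

print("fires after restart:", absent(after_restart, MATCHERS))
print("fires after first run:", absent(after_first_run, MATCHERS))
```

Under these assumptions the rule fires while the series has never been exported, and stops firing once the series exists, whatever its value. Detecting a task that ran once and then stalled would need a check on the counter's growth instead.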

View File

@@ -76,7 +76,6 @@ scrape_configs:
   static_configs:
   - targets:
     - gw1.pyrocufflink.blue
-    - loki0.pyrocufflink.blue
     - nut0.pyrocufflink.blue
     - nvr2.pyrocufflink.blue
     - unifi3.pyrocufflink.blue
@@ -251,7 +250,6 @@ scrape_configs:
   metrics_path: /bridge?selector=zincati
   static_configs:
   - targets:
-    - loki0.pyrocufflink.blue
     - nut0.pyrocufflink.blue
     - unifi3.pyrocufflink.blue
   kubernetes_sd_configs:
@@ -279,14 +277,21 @@ scrape_configs:
   scheme: https
   tls_config:
     ca_file: /run/dch-ca/dch-root-ca.crt
-  static_configs:
-  - targets:
+  dns_sd_configs:
+  - names:
     - loki.pyrocufflink.blue
+    type: A
+    port: 443
+  relabel_configs:
+  - source_labels: [__meta_dns_name, __meta_dns_srv_record_port]
+    separator: ':'
+    target_label: __address__
+  - source_labels: [__address__]
+    target_label: instance
 - job_name: promtail
   static_configs:
   - targets:
-    - loki0.pyrocufflink.blue
     - nut0.pyrocufflink.blue
     - nvr2.pyrocufflink.blue
     - unifi3.pyrocufflink.blue
@@ -456,3 +461,27 @@ scrape_configs:
   - source_labels:
     - __meta_kubernetes_pod_name
     target_label: instance
+- job_name: frigate
+  dns_sd_configs:
+  - names:
+    - frigate.pyrocufflink.blue
+    type: A
+    port: 9100
+  relabel_configs:
+  - source_labels: [__meta_dns_name, __meta_dns_srv_record_port]
+    separator: ':'
+    target_label: __address__
+  - source_labels: [__address__]
+    target_label: instance
+- job_name: haproxy
+  static_configs:
+  - targets:
+    - haproxy0.pyrocufflink.blue
+  relabel_configs:
+  - source_labels: [__address__]
+    target_label: instance
+  - source_labels: [__address__]
+    target_label: __address__
+    replacement: '$1:8118'
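
The relabeling rules in the `frigate` and `haproxy` jobs both use the default `replace` action. The sketch below shows one way to read them: source label values are joined with the separator, matched against an anchored regex, and the expanded replacement is written to the target label. `relabel_replace` is a hypothetical helper written for this illustration, not part of vmagent or Prometheus. The discovered label values are assumptions too, in particular that DNS discovery populates `__meta_dns_srv_record_port` with the configured port for `A` records.

```python
import re
from typing import Dict, List

Labels = Dict[str, str]


def relabel_replace(
    labels: Labels,
    source_labels: List[str],
    target_label: str,
    separator: str = ";",
    regex: str = "(.*)",
    replacement: str = "$1",
) -> Labels:
    """Simplified model of a relabel rule with the default `replace` action."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # relabel regexes are anchored
    if match is None:
        return dict(labels)  # no match: labels are left unchanged
    # Translate $1-style references into Python's \g<1> template syntax.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# haproxy job: copy the bare host name to `instance`, then append the port.
haproxy = {"__address__": "haproxy0.pyrocufflink.blue"}
haproxy = relabel_replace(haproxy, ["__address__"], "instance")
haproxy = relabel_replace(haproxy, ["__address__"], "__address__", replacement="$1:8118")

# frigate job: assumed labels as DNS discovery might emit them (hypothetical values).
frigate = {
    "__address__": "192.0.2.10:9100",
    "__meta_dns_name": "frigate.pyrocufflink.blue",
    "__meta_dns_srv_record_port": "9100",
}
frigate = relabel_replace(
    frigate, ["__meta_dns_name", "__meta_dns_srv_record_port"], "__address__", separator=":"
)
frigate = relabel_replace(frigate, ["__address__"], "instance")

print(haproxy)
print(frigate)
```

Rule order matters in the `haproxy` job: `instance` is copied before the port is appended, so the label keeps the bare host name while the scrape address gains `:8118`.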