v-m/alerts: Improve Paperless-ngx Celery task alert
The `flower_events_total` metric is a counter, so its value only ever increases (barring restarts of the server process). A nonzero value therefore does not necessarily indicate a _current_ problem, only that one occurred at some point in the past. To identify current issues, we need to use the `increase` function, and then apply the `max_over_time` function so that the alert doesn't immediately reset itself.
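The reasoning above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the sample values and window sizes are hypothetical, and the `increase` and `max_over_time` helpers below are toy stand-ins that ignore counter resets and extrapolation, not the query engine's actual implementations.

```python
from typing import List


def increase(samples: List[float], i: int, window: int) -> float:
    """Toy model: growth of a counter over the `window` samples ending at index i."""
    start = max(0, i - window)
    return samples[i] - samples[start]


def max_over_time(values: List[float], i: int, window: int) -> float:
    """Toy model: largest value among the last `window` points ending at index i."""
    start = max(0, i - window + 1)
    return max(values[start:i + 1])


# Hypothetical counter samples: one task fails at step 3, none afterwards.
counter = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Old rule: raw counter > 0. Fires at step 3 and never clears.
raw_alert = [v > 0 for v in counter]

# increase() alone: fires only at the step where the failure happened.
inc = [increase(counter, i, 1) for i in range(len(counter))]
inc_alert = [v > 0 for v in inc]

# max_over_time(increase(...)): holds the alert for a 4-step window, then clears.
held = [max_over_time(inc, i, 4) for i in range(len(inc))]
held_alert = [v > 0 for v in held]

print(raw_alert)
print(inc_alert)
print(held_alert)
```

In this toy model the raw-counter rule stays active forever, the bare `increase` rule clears one step after the failure, and the combined form keeps the alert active for the length of the outer window, which corresponds to the `[24h]` in the new rule.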
parent d12e66f58a
commit 3f9601dc94
@@ -172,9 +172,13 @@ groups:
   rules:
   - alert: Celery tasks failed
     expr: >-
-      flower_events_total{job="paperless-ngx", type="task-failed"} > 0
+      max_over_time(
+        increase(
+          flower_events_total{job="paperless-ngx", type="task-failed"}
+        )[24h]
+      ) > 0
     annotations:
-      summary: One or more Celery tasks have failed
+      summary: Paperless-ngx Celery task failed
       description: >-
         Failing Celery tasks may indicate a problem with the Paperless-ngx
         deployment and can result in data loss. Check the Paperless-ngx logs