v-m: alerts: Add alert for temperatures
After the incident this week with the CPU overheating on _vmhost1_, I want to make sure I know as soon as possible when anything is starting to get too hot.
@@ -141,3 +141,10 @@ groups:
       - ignoring (instance) group_right (scope) (patroni_xlog_replayed_location != 0)
       > 10240
     for: 10m
+
+- name: Temperature
+  rules:
+  - alert: High Temperature
+    expr: >-
+      {__name__=~"collectd_.*_temperature", sensors!~"i350bb.*"} > 80
+    for: 10m
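A rule like this can be checked before deployment with a rule unit test. The following is only a sketch in the `promtool test rules` format, under several assumptions not stated in the commit: the rules file is called `alerts.yml`, the series name `collectd_sensors_temperature` and its label values are made up for illustration, and the rule attaches no extra labels or annotations.

```yaml
# Hypothetical unit test for the "High Temperature" alert.
# File name, series name, and label values are assumptions.
rule_files:
  - alerts.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # A sensor held at a constant 85, above the threshold of 80.
      - series: 'collectd_sensors_temperature{instance="vmhost1", sensors="coretemp-isa-0000"}'
        values: '85+0x15'
      # A sensor matching the excluded i350bb pattern; it should never alert.
      - series: 'collectd_sensors_temperature{instance="vmhost1", sensors="i350bb-pci-0300"}'
        values: '95+0x15'
    alert_rule_test:
      # After 12 minutes the 10m "for" window has elapsed, so the alert fires.
      - eval_time: 12m
        alertname: High Temperature
        exp_alerts:
          - exp_labels:
              instance: vmhost1
              sensors: coretemp-isa-0000
```

If the setup matches these assumptions, running `promtool test rules` against this file should report the alert for the first series only.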