metricspi/alerts: Correct BURP archive alert query

When the RAID array is being resynchronized after the archived disk has
been reconnected, md changes the disk status from "missing" to "spare."
Once the synchronization is complete, it changes from "spare" to
"active."  We only want to trigger the "disk needs archived" alert once
the synchronization process is complete; otherwise, both the "disks need
swapped" and "disk needs archived" alerts would be active at the same
time, which makes no sense.  By adjusting the query for the "disk needs
archived" alert to sum the disks in both "missing" and "spare" status,
we can delay firing that alert until the resynchronization is complete.
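
The effect of the change can be sketched as a small simulation.  This is
only an illustration: the stage names and gauge values below are assumed
for the example, not taken from the real collectd data, and the two
functions merely mimic the old and new alert expressions.

```python
# Hypothetical collectd_md_md_disks values at each stage of the swap.
# These numbers are assumptions chosen to match the description above.
STAGES = {
    "disk removed for archiving": {"missing": 1, "spare": 0},
    "disk reconnected, resync running": {"missing": 0, "spare": 1},
    "resync complete": {"missing": 0, "spare": 0},
}


def old_alert_fires(disks):
    """Mimics the old expression: type="missing" < 1."""
    return disks["missing"] < 1


def new_alert_fires(disks):
    """Mimics the new expression: sum(type=~"missing|spare") < 1."""
    return disks["missing"] + disks["spare"] < 1


def evaluate(stages=STAGES):
    """Return {stage: (old_fires, new_fires)} for every stage."""
    return {
        name: (old_alert_fires(disks), new_alert_fires(disks))
        for name, disks in stages.items()
    }


if __name__ == "__main__":
    for stage, (old, new) in evaluate().items():
        print(f"{stage}: old={old} new={new}")
```

Under these assumed values, the old expression already fires while the
resync is running, whereas the new one waits until no disk is "missing"
or "spare".  The sum() matters because a comparison is applied to each
matched series separately; without it, the "missing" series at 0 would
still satisfy "< 1" during the resync.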
Dustin 2023-06-20 11:58:35 -05:00
parent b05edbf7fb
commit 4e608e379f
1 changed file with 3 additions and 1 deletion

@@ -87,7 +87,9 @@ vmalert_rules:
 date by a few weeks.
 - alert: disk needs archived
 expr:
-collectd_md_md_disks{instance="burp1.pyrocufflink.blue", type="missing"} < 1
+sum(
+collectd_md_md_disks{instance="burp1.pyrocufflink.blue", type=~"missing|spare"}
+) < 1
 annotations:
 summary: One of the disks in the BURP array should be archived
 description: >-