diff --git a/group_vars/metricspi/alerts.yml b/group_vars/metricspi/alerts.yml
index fdac376..9e57768 100644
--- a/group_vars/metricspi/alerts.yml
+++ b/group_vars/metricspi/alerts.yml
@@ -85,6 +85,17 @@ vmalert_rules:
           something happens to the active disk, such as hardware failure, power
           surge, fire, or accidental `rm -rf`, the offline disk is only out of
           date by a few weeks.
+      - alert: disk needs archived
+        expr:
+          collectd_md_md_disks{instance="burp1.pyrocufflink.blue", type="missing"} < 1
+        annotations:
+          summary: One of the disks in the BURP array should be archived
+          description: >-
+            The disks in the BURP RAID-1 (mirror) array should be swapped
+            periodically. One disk should be online and mounted while the other
+            is stored in the fireproof safe. All of the disks are currently
+            online; one needs to be disconnected and moved to the safe as soon as
+            possible.
 
   - name: certificates
     rules: