From 4e608e379f17cad8cc3b33377f9ee07ab5ff13d9 Mon Sep 17 00:00:00 2001
From: "Dustin C. Hatch"
Date: Tue, 20 Jun 2023 11:58:35 -0500
Subject: [PATCH] metricspi/alerts: Correct BURP archive alert query

When the RAID array is being resynchronized after the archived disk has
been reconnected, md changes the disk status from "missing" to "spare."
Once the synchronization is complete, it changes from "spare" to
"active."  We only want to trigger the "disk needs archived" alert once
the synchronization process is complete; otherwise, both the "disks need
swapped" and "disk needs archived" alerts would be active at the same
time, which makes no sense.  By adjusting the query for the "disk needs
archived" alert to consider disks in both "missing" and "spare" status,
we can delay firing that alert until the proper time.
---
 group_vars/metricspi/alerts.yml | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/group_vars/metricspi/alerts.yml b/group_vars/metricspi/alerts.yml
index 9e57768..ac6faf1 100644
--- a/group_vars/metricspi/alerts.yml
+++ b/group_vars/metricspi/alerts.yml
@@ -87,7 +87,9 @@ vmalert_rules:
         date by a few weeks.
   - alert: disk needs archived
     expr:
-      collectd_md_md_disks{instance="burp1.pyrocufflink.blue", type="missing"} < 1
+      sum(
+        collectd_md_md_disks{instance="burp1.pyrocufflink.blue", type=~"missing|spare"}
+      ) < 1
     annotations:
       summary: One of the disks in the BURP array should be archived
       description: >-
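
Reviewer note: the effect of the query change can be checked with a small model of the three array states named in the commit message. This is only a sketch. The per-state disk counts assume a hypothetical two-disk array, the function names are made up for illustration, and the model assumes collectd reports a count for every disk state.

```python
from typing import Dict

# Hypothetical disk counts per md state for a two-disk array (an assumption;
# the real array size is not stated in the patch).
STATES: Dict[str, Dict[str, int]] = {
    "archived disk disconnected": {"active": 1, "missing": 1, "spare": 0},
    "resynchronizing": {"active": 1, "missing": 0, "spare": 1},
    "synchronized": {"active": 2, "missing": 0, "spare": 0},
}


def old_expr_fires(disks: Dict[str, int]) -> bool:
    """Model of the old expression: disks{type="missing"} < 1."""
    return disks["missing"] < 1


def new_expr_fires(disks: Dict[str, int]) -> bool:
    """Model of the new expression: sum(disks{type=~"missing|spare"}) < 1."""
    return disks["missing"] + disks["spare"] < 1


if __name__ == "__main__":
    for name, disks in STATES.items():
        print(f"{name}: old={old_expr_fires(disks)} new={new_expr_fires(disks)}")
```

In this model the two expressions differ only while the array is resynchronizing: the old one already fires because no disk is "missing," while the new one stays quiet until the disk leaves "spare" status.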