
1 Commits (ac62a77c96b72cc80c35d59f80089f0e29d0bb6f)

Author SHA1 Message Date
Dustin f7f408ca8c v-m: Redo vmstorage persistent volumes
Longhorn does not work well for very large volumes.  It takes ages to
synchronize/rebuild them when migrating between nodes, which happens
all too frequently.  This consumes a lot of resources, which impacts
the operation of the rest of the cluster, and can cause a cascading
failure in some circumstances.

Now that the cluster is set up to be able to mount storage directly from
the Synology, it makes sense to move the Victoria Metrics data there as
well.  Similar to how I did this with Jenkins, I created
PersistentVolume resources that map to iSCSI volumes, and patched the
PersistentVolumeClaims (or rather the template for them defined by the
StatefulSet) to use these.  Each `vmstorage` pod then gets an iSCSI
LUN, bypassing both Longhorn and QEMU to write directly to the NAS.
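
As a rough illustration of the PersistentVolume resources described above,
the sketch below builds one iSCSI-backed PV manifest per `vmstorage`
replica.  The PV name, portal address, IQN, LUN number, size, and
filesystem are all hypothetical placeholders, not the values used in this
repository; only the manifest field names come from the Kubernetes API.

```python
import json


def iscsi_pv(name, portal, iqn, lun, size, fs_type="ext4"):
    """Build a PersistentVolume manifest backed by a single iSCSI LUN."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # A block device should only be mounted by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            # Keep the data on the NAS even if the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            "iscsi": {
                "targetPortal": portal,
                "iqn": iqn,
                "lun": lun,
                "fsType": fs_type,
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for i in range(3):
        pv = iscsi_pv(
            name=f"vmstorage-{i}",
            portal="192.0.2.10:3260",
            iqn=f"iqn.2000-01.com.synology:example.vmstorage-{i}",
            lun=1,
            size="100Gi",
        )
        print(json.dumps(pv, indent=2))
```

`kubectl apply -f` accepts JSON as well as YAML, so the printed manifests
could be applied as-is once real values are filled in.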

The migration process was relatively straightforward.  I started by
scaling down the `vminsert` Deployment so the `vmagent` pods would
queue the metrics they had collected while the storage layer was down.
Next, I created a [native][0] export of all the time series in the
database.  Then, I deleted the `vmstorage` StatefulSet and its
associated PVCs.  Finally, I applied the updated configuration,
including the new PVs and patched PVCs, and brought the `vminsert`
pods back online.  Once everything was up and running, I re-imported
the exported data.
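
The export and re-import steps can be sketched roughly as follows.  This is
an assumption-laden outline rather than the exact commands I ran: the
service URLs are hypothetical, the ports (8481 for `vmselect`, 8480 for
`vminsert`) are the cluster defaults, and tenant 0 is assumed.  The
`match[]` selector `{__name__!=""}` selects every time series.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlencode


def export_url(vmselect, account_id=0, match='{__name__!=""}'):
    """URL of the native export endpoint on vmselect (cluster URL layout)."""
    query = urlencode({"match[]": match})
    return f"{vmselect}/select/{account_id}/prometheus/api/v1/export/native?{query}"


def import_url(vminsert, account_id=0):
    """URL of the native import endpoint on vminsert (cluster URL layout)."""
    return f"{vminsert}/insert/{account_id}/prometheus/api/v1/import/native"


def export_native(vmselect, path):
    """Stream the native export to a local file without buffering it in memory."""
    with urllib.request.urlopen(export_url(vmselect)) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)


def import_native(vminsert, path):
    """POST a previously exported file back into the cluster; returns the HTTP status."""
    with open(path, "rb") as body:
        req = urllib.request.Request(
            import_url(vminsert),
            data=body,
            method="POST",
            headers={"Content-Length": str(os.path.getsize(path))},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.status


# Hypothetical usage, run before and after replacing the StatefulSet:
#   export_native("http://vmselect.example:8481", "vm-export.bin")
#   import_native("http://vminsert.example:8480", "vm-export.bin")
```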

[0]: https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-export-data-in-native-format
2024-06-26 18:29:49 -05:00