1
0
Fork 0
Commit Graph

2 Commits (9977bb3de4b12bc82871f6a0cad19ec5d5bb63fe)

Author SHA1 Message Date
Dustin 98651cf9d9 jenkins: Force iSCSI volume on specific nodes
Instead of routing iSCSI traffic from the Kubernetes network, through
the firewall, to the storage network, nodes now have a second network
adapter connected to directly to the storage network.  The nodes with
such an adapter are labelled `network.du5t1n.me/storage`, so we can pin
the Jenkins PersistentVolume to them via a node affinity rule.
2024-06-26 18:29:49 -05:00
Dustin 7f3287297b jenkins: Migrate to iSCSI persistent volume
Managing the Jenkins volume with Longhorn has become increasingly
problematic.  Because of its large size, whenever Longhorn needs to
rebuild/replicate it (which happens often for no apparent reason), it
can take several hours.  While the synchronization is happening, the
entire cluster suffers from degraded performance.

Instead of using Longhorn, I've decided to try storing the data directly
on the Synology NAS and expose it to Kubernetes via iSCSI.  The Synology
offers many of the same features as Longhorn, including
snapshots/rollbacks and backups.  Using the NAS allows the volume to be
available to any Kubernetes node, without keeping multiple copies of
the data.

In order to expose the iSCSI service on the NAS to the Kubernetes nodes,
I had to make the storage VLAN routable.  I kept it as IPv6-only,
though, as an extra precaution against unauthorized access.  The
firewall only allows nodes on the Kubernetes network to access the NAS
via iSCSI.

I originally tried proxying the iSCSI connection via the VM hosts,
however, this failed because of how iSCSI target discovery works.  The
provided "target host" is really only used to identify available LUNs;
follow-up communication is done with the IP address returned by the
discovery process.  Since the NAS would return its IP address, which
differed from the proxy address, the connection would fail.  Thus, I
resorted to reconfiguring the storage network and connecting directly
to the NAS.

To migrate the contents of the volume, I temporarily created a PVC with
a different name and bound it to the iSCSI PersistentVolume.  Using a
pod with both the original PVC and the new PVC mounted, I used `rsync`
to copy the data.  Once the copy completed, I deleted the Pod and both
PVCs, then created a new PVC with the original name (i.e. `jenkins`),
bound to the iSCSI PV.  While doing this, Longhorn, for some reason,
kept re-creating the PVC whenever I would delete it, no matter how I
requested the deletion.  Deleting the PV, the PVC, or the Volume, using
either the Kubernetes API or the Longhorn UI, they would all get
recreated almost immediately.  Fortunately, there was actually enough of
a delay after deleting it before Longhorn would recreate it that I was
able to create the new PVC manually.  Once I did that, Longhorn seemed
to give up.
2024-06-23 09:53:15 -05:00