kubernetes

infra

kubernetes

Commit Graph


Author

SHA1

Message

Date

Dustin

98651cf9d9

jenkins: Force iSCSI volume on specific nodes

Instead of routing iSCSI traffic from the Kubernetes network, through
the firewall, to the storage network, nodes now have a second network
adapter connected directly to the storage network.  The nodes with
such an adapter are labelled `network.du5t1n.me/storage`, so we can pin
the Jenkins PersistentVolume to them via a node affinity rule.
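
A minimal sketch of what such a node affinity rule could look like on the
PersistentVolume.  The PV name `jenkins` is an assumption, and because the
message does not say what value the label carries, the rule only checks
that the label exists:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins  # assumed name, not taken from the commit
spec:
  # capacity, accessModes, and the iscsi volume source are omitted here
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            # Only nodes labelled as having a storage-network adapter
            # are eligible to mount this volume.
            - key: network.du5t1n.me/storage
              operator: Exists
```

With a rule like this, the scheduler only places Pods that use the volume
on nodes carrying the label.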

2024-06-26 18:29:49 -05:00

Dustin

7f3287297b

jenkins: Migrate to iSCSI persistent volume

Managing the Jenkins volume with Longhorn has become increasingly
problematic.  Because of its large size, whenever Longhorn needs to
rebuild/replicate it (which happens often for no apparent reason), it
can take several hours.  While the synchronization is happening, the
entire cluster suffers from degraded performance.

Instead of using Longhorn, I've decided to try storing the data directly
on the Synology NAS and expose it to Kubernetes via iSCSI.  The Synology
offers many of the same features as Longhorn, including
snapshots/rollbacks and backups.  Using the NAS allows the volume to be
available to any Kubernetes node, without keeping multiple copies of
the data.
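
As an illustration only, a statically provisioned iSCSI PersistentVolume
might look roughly like the following.  The portal address, IQN, LUN,
capacity, and filesystem type are all placeholders, not values from this
repository:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins  # assumed name
spec:
  capacity:
    storage: 100Gi  # placeholder size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""  # static volume, no dynamic provisioner
  iscsi:
    # Placeholder IPv6 documentation address and port for the NAS
    targetPortal: "[2001:db8::10]:3260"
    # Placeholder IQN; the real one is assigned on the NAS
    iqn: iqn.2000-01.com.synology:nas.Target-1
    lun: 1
    fsType: ext4
    readOnly: false
```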

In order to expose the iSCSI service on the NAS to the Kubernetes nodes,
I had to make the storage VLAN routable.  I kept it as IPv6-only,
though, as an extra precaution against unauthorized access.  The
firewall only allows nodes on the Kubernetes network to access the NAS
via iSCSI.

I originally tried proxying the iSCSI connection via the VM hosts;
however, this failed because of how iSCSI target discovery works.  The
provided "target host" is really only used to identify available LUNs;
follow-up communication is done with the IP address returned by the
discovery process.  Since the NAS would return its IP address, which
differed from the proxy address, the connection would fail.  Thus, I
resorted to reconfiguring the storage network and connecting directly
to the NAS.

To migrate the contents of the volume, I temporarily created a PVC with
a different name and bound it to the iSCSI PersistentVolume.  Using a
pod with both the original PVC and the new PVC mounted, I used `rsync`
to copy the data.  Once the copy completed, I deleted the Pod and both
PVCs, then created a new PVC with the original name (i.e. `jenkins`),
bound to the iSCSI PV.  While doing this, Longhorn, for some reason,
kept re-creating the PVC whenever I would delete it, no matter how I
requested the deletion.  Whether I deleted the PV, the PVC, or the
Volume, using either the Kubernetes API or the Longhorn UI, it would be
recreated almost immediately.  Fortunately, there was actually enough of
a delay after deleting it before Longhorn would recreate it that I was
able to create the new PVC manually.  Once I did that, Longhorn seemed
to give up.
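
The migration steps could be sketched with manifests along these lines.
The namespace, the temporary PVC name `jenkins-iscsi`, the PV name, the
container image, the storage size, and the exact `rsync` flags are
assumptions for illustration; the message only states that a Pod with
both PVCs mounted was used to run `rsync`:

```yaml
# Temporary Pod with the old (Longhorn) and new (iSCSI) claims mounted
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-migrate  # hypothetical name
  namespace: jenkins     # assumed namespace
spec:
  restartPolicy: Never
  containers:
    - name: rsync
      image: registry.example.com/rsync:latest  # placeholder image
      # Copy everything, preserving hard links, ACLs, xattrs, and IDs
      command: ["rsync", "-aHAX", "--numeric-ids", "/old/", "/new/"]
      volumeMounts:
        - name: old
          mountPath: /old
        - name: new
          mountPath: /new
  volumes:
    - name: old
      persistentVolumeClaim:
        claimName: jenkins        # original PVC
    - name: new
      persistentVolumeClaim:
        claimName: jenkins-iscsi  # hypothetical temporary PVC
---
# Final claim, recreated under the original name and bound to the iSCSI PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins
  namespace: jenkins  # assumed namespace
spec:
  storageClassName: ""
  volumeName: jenkins  # assumed name of the iSCSI PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi  # placeholder size
```

Setting `volumeName` asks Kubernetes to bind the claim to that specific
PV rather than provisioning a new volume.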

2024-06-23 09:53:15 -05:00

2 Commits (9977bb3de4b12bc82871f6a0cad19ec5d5bb63fe)