infra/kubernetes

Commit Graph

Author

SHA1

Message

Date

Dustin C. Hatch

3d40424cf7

fleetlock: Use patched server from GitHub PR

The _fleetlock_ server drains all pods from a node before allocating the
reboot lock to that node.  Unfortunately, it doesn't actually wait for
those pods to be completely evicted.  If some pods take too long to shut
down, they may get stuck in `Terminating` state once the machine starts
rebooting.  As a result, those pods cannot be replaced on another node
while the original one is offline, which pretty much defeats the purpose
of using _fleetlock_ in the first place.
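
The missing step described above, waiting until evicted pods are really
gone before granting the lock, can be sketched roughly as follows.  This
is only an illustration of the idea, not the code from the pull request:
`wait_for_pods_gone` and the `list_pods` callback are hypothetical names,
and a real server would query the Kubernetes API for pods still bound to
the node instead of calling a plain function.

```python
import time
from typing import Callable, List


def wait_for_pods_gone(
    list_pods: Callable[[], List[str]],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until no pods remain on the node, or until the timeout expires.

    list_pods is a hypothetical callback returning the names of pods that
    still exist on the node, including ones in Terminating state.
    Returns True when the node is empty, False if the timeout was reached.
    """
    deadline = clock() + timeout
    while True:
        remaining = list_pods()
        if not remaining:
            return True  # safe to hand out the reboot lock
        if clock() >= deadline:
            return False  # pods are still shutting down; refuse the lock
        sleep(interval)
```

With a check like this in front of the lock grant, a node is only allowed
to reboot once its pods have finished terminating, so their replacements
can be scheduled elsewhere.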

It seems upstream has abandoned this project, as there is an open [Pull
Request][0] to fix this issue that has so far been ignored.
Fortunately, building a new container image containing the patch is easy
enough, so we can run our own patched build.

[0]: https://github.com/poseidon/fleetlock/pull/271

2024-11-05 07:05:55 -06:00

Dustin C. Hatch

fc66058251

fleetlock: Deploy Zincati fleet lock manager

[fleetlock] is an implementation of the Zincati FleetLock reboot
coordination protocol.  It only works for machines that are Kubernetes
nodes, but it does enable safe rolling updates for those machines.
Specifically, when a node acquires a lock (backed by a Kubernetes
Lease), fleetlock cordons that node and evicts pods from it.  After the
node has rebooted into the new version of Fedora CoreOS, fleetlock
uncordons the node and releases the lock.
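
The lock lifecycle described above can be modelled with a small in-memory
sketch.  It is a simplified illustration under stated assumptions, not
fleetlock's implementation: the class `FleetLockModel` and its fields are
hypothetical, a single attribute stands in for the Kubernetes Lease, and
pod eviction is left out.  The method names mirror the two phases of the
FleetLock protocol (pre-reboot and steady-state).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FleetLockModel:
    """Toy model of a single reboot lock shared by a group of nodes."""

    holder: Optional[str] = None  # stands in for the Lease holder
    cordoned: Set[str] = field(default_factory=set)

    def pre_reboot(self, node_id: str) -> bool:
        """A node asks for permission to reboot."""
        if self.holder not in (None, node_id):
            return False  # another node is mid-update; caller retries later
        self.holder = node_id
        self.cordoned.add(node_id)  # cordon; pod eviction would follow here
        return True

    def steady_state(self, node_id: str) -> bool:
        """A node reports that it has finished rebooting."""
        if self.holder == node_id:
            self.cordoned.discard(node_id)  # uncordon
            self.holder = None  # release the lock
        # A node that does not hold the lock has nothing to release.
        return True
```

Because only one node can hold the lock at a time, updates roll through
the cluster one machine after another.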

[fleetlock]: https://github.com/poseidon/fleetlock

2024-05-31 15:18:01 -05:00

2 Commits