The term _controller_ has a specific meaning in the Kubernetes context, and
this process doesn't really fit it. It doesn't monitor any Kubernetes
resources, custom or otherwise. It does use Kubernetes as a data store
(via the lease), but I don't really think that counts. Anyway, the term
_coordinator_ fits better in my opinion.

If evicting a pod fails with an HTTP 429 Too Many Requests error, it
means there is a PodDisruptionBudget that prevents the pod from being
deleted. This can happen, for example, when draining a node that has
Longhorn volumes attached, as Longhorn creates a PDB for its instance
manager pods on such nodes. Longhorn will automatically remove the PDB
once there are no workloads on that node that use its volumes, so we
must continue to evict other pods and try evicting the failed pods again
later. This behavior mostly mimics what `kubectl drain` does to handle
this same condition.
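
The retry behavior described above can be sketched roughly as follows. This is only an illustration, not the actual implementation: `drain`, the `evict` callback (assumed to return the HTTP status code of one eviction request) and `retry_delay` are hypothetical names.

```python
import time
from typing import Callable, Iterable, List

TOO_MANY_REQUESTS = 429  # returned when a PodDisruptionBudget blocks eviction


def drain(
    pods: Iterable[str],
    evict: Callable[[str], int],
    retry_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Evict every pod, retrying the ones blocked by a PDB.

    `evict` is a hypothetical callback that issues one eviction request
    and returns the HTTP status code of the response.
    """
    pending: List[str] = list(pods)
    while pending:
        blocked: List[str] = []
        for pod in pending:
            status = evict(pod)
            if status == TOO_MANY_REQUESTS:
                # A PDB currently forbids this eviction; keep going with
                # the other pods and come back to this one later.
                blocked.append(pod)
            elif status >= 400:
                raise RuntimeError(f"evicting {pod} failed: HTTP {status}")
        pending = blocked
        if pending:
            sleep(retry_delay)
```

A blocked pod does not hold up the eviction of the remaining pods; it is simply retried on the next pass, by which time the PDB may have been removed.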

Whenever a lock request is made for a host that is a node in the current
Kubernetes cluster, the node will now be cordoned and all pods evicted
from it. The HTTP request will not return until all pods are gone,
making the lock request suitable for use in a system shutdown step.

This commit introduces two HTTP path operations:

* POST /api/v1/lock: Acquire a reboot lock
* POST /api/v1/unlock: Release a reboot lock

Both operations take a _multipart/form-data_ or
_application/x-www-form-urlencoded_ body with a required `hostname`
field. This field indicates the name of the host acquiring/releasing
the lock. The `lock` operation also takes an optional `wait` field. If
this field is set to `false` and the reboot lock cannot be acquired
immediately, the request will fail with an HTTP 409 Conflict. If it is
set to `true`, or the field is omitted, the request will block until the
lock can be acquired.
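
As an illustration of the request format, a client could build a `lock` request along these lines. This is a sketch only: the helper name `build_lock_request` and the base URL in the usage note are made up, and only the path and the `hostname` and `wait` fields come from the description above.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_lock_request(
    base_url: str, hostname: str, wait: Optional[bool] = None
) -> Request:
    """Build (but do not send) a POST /api/v1/lock request."""
    fields = {"hostname": hostname}
    if wait is not None:
        # Omitting the field entirely means "block until acquired".
        fields["wait"] = "true" if wait else "false"
    return Request(
        base_url.rstrip("/") + "/api/v1/lock",
        data=urlencode(fields).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; with `wait=False`, a held lock would then surface as an `HTTPError` with code 409. The equivalent `curl` invocation would pass `-d hostname=... -d wait=false`.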

Locking is implemented with a Kubernetes Lease resource using
Server-Side Apply. By setting the field manager of the `holderIdentity`
field to match its value, we can ensure that there are no race
conditions in acquiring the lock; Kubernetes will reject the update with
a conflict if the new value and the field manager both differ from the
current ones. This is significantly safer than a more naïve
check-then-set approach.
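
To make the Server-Side Apply idea concrete, here is a rough sketch of what such an apply request could look like. It is not the code from this commit: the function name, the lease name and the namespace are hypothetical. The request uses the hostname both as the `fieldManager` and as the `holderIdentity`, and deliberately does not set `force=true`, so the API server is expected to answer with a 409 conflict when another manager owns the field with a different value.

```python
import json
from typing import Dict, Tuple
from urllib.parse import quote, urlencode


def build_lease_apply(
    namespace: str, lease_name: str, hostname: str
) -> Tuple[str, str, Dict[str, str], str]:
    """Return (method, url_path, headers, body) for an SSA Lease update.

    `namespace` and `lease_name` are placeholders for illustration.
    """
    path = (
        "/apis/coordination.k8s.io/v1"
        f"/namespaces/{quote(namespace)}/leases/{quote(lease_name)}"
    )
    # The field manager is the same string as the holder identity.
    query = urlencode({"fieldManager": hostname})
    body = {
        "apiVersion": "coordination.k8s.io/v1",
        "kind": "Lease",
        "metadata": {"name": lease_name, "namespace": namespace},
        "spec": {"holderIdentity": hostname},
    }
    # JSON is valid YAML, so it can be sent with the apply-patch media type.
    headers = {"Content-Type": "application/apply-patch+yaml"}
    return "PATCH", f"{path}?{query}", headers, json.dumps(body)
```

Because the check and the write happen in a single request evaluated by the API server, there is no window between reading the current holder and writing the new one.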

Since the API provided by this service is intended to be used on the
command line, e.g. with `curl`, we need our responses to have a trailing
newline. This ensures that, when used interactively, the next shell
prompt is correctly placed on a new line, and when used
non-interactively, line-buffered output is correctly flushed (e.g. to a
log file).
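
The idea is simple enough to show in a couple of lines. The helper below is a hypothetical illustration, not the service's actual code:

```python
def with_trailing_newline(body: str) -> str:
    """Return `body` terminated by exactly the newline it needs."""
    # Leave already-terminated bodies alone so we never emit a blank line.
    return body if body.endswith("\n") else body + "\n"
```

Applying it to every response body means a `curl` call in a terminal leaves the prompt on a fresh line.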