Found another race condition: if the first pod evicted is deleted quickly, before any other pods are evicted, the wait list becomes empty immediately, causing the `wait_drained` function to return too early.

I've completely rewritten the `drain_node` function (again) to hopefully handle all of these races. It is now purely reactive: instead of getting a list of pods to evict ahead of time, it uses the `Added` events of the watch stream to determine which pods to evict. As soon as a pod is determined to be a candidate for eviction, it is added to the wait list. If eviction of a pod fails irrecoverably, that pod is removed from the wait list, to prevent the loop from running forever.

This works because `Added` events for all current pods arrive as soon as the stream is opened; `Deleted` events only start arriving once all of the `Added` events have been processed. The key difference from the previous implementation, though, is when pods are added to the wait list. Previously, we only added them to the list _after_ they were evicted, which populated the list too slowly. Now that we add them to the list _before_ they are evicted, we can be sure the list is never empty until every pod is deleted (or unable to be evicted at all).
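The reactive loop can be sketched roughly as follows (a minimal, self-contained simulation, not the real `drain_node`; the `drain`/`evict` names and the `(kind, pod)` event tuples are hypothetical stand-ins for the actual watch-stream types):

```python
ADDED, DELETED = "ADDED", "DELETED"

def drain(events, evict):
    """Sketch of the reactive drain loop.

    `events` yields (kind, pod) tuples from a simulated watch stream;
    `evict(pod)` returns False when eviction fails irrecoverably.
    """
    waiting = set()
    seen_any = False
    for kind, pod in events:
        if kind == ADDED:
            seen_any = True
            waiting.add(pod)          # join wait list *before* evicting,
            if not evict(pod):        # so the list can't empty out early
                waiting.discard(pod)  # irrecoverable: don't wait forever
        elif kind == DELETED:
            waiting.discard(pod)
        if seen_any and not waiting:
            return True               # every pod deleted or given up on
    return not waiting
```

Because the stream delivers all `Added` events before any `Deleted` event, the wait list is fully populated before the first deletion can shrink it, which is exactly what closes the early-return race.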