CRI-O now installs more `.conflist` files in `/etc/cni/net.d`. Their
presence interferes with Calico, so they need to be deleted in order to
have fully working Pod networking, especially for pods that start very
early (before Calico is completely ready).
Upstream changed the naming convention for Fedora AMIs. It also seems
they've stopped publishing "release" artifacts; all the AMIs are now
date-stamped. We should probably consider running `terraform apply`
periodically to keep up-to-date.
If a Jenkins job runs for a while, Kubernetes may eventually schedule
other Pods on the node where it is running. If a long-running Pod gets
assigned to the ephemeral node, the Cluster Autoscaler won't be able to
scale down the ASG. To prevent this, we apply a taint to the node so
that normal Pods will not be scheduled on it, and apply the
corresponding toleration to the Pods for Jenkins jobs.
Fedora AMIs have the default locale set to en_US.UTF-8, which sorts
`100-crio-bridge.conflist` before `10-calico.conflist`. As a result,
Pods end up with incorrect network configuration, and cannot be reached
from other Pods on the container network. Since we do not need the
default configuration, the easiest way to resolve this is to just delete
it.
The default root block device for Fedora EC2 instances is only 10 GiB.
This is insufficient for many jobs, especially those that build large
container images.
Jenkins jobs that build container images in user namespaces need access
to `/dev/fuse`, which is provided by the [fuse-device-plugin][0]. This
plugin runs as a DaemonSet which, when it starts, updates the status of
the node it is running on to indicate that the FUSE device is available.
When scaling up from zero nodes, Cluster Autoscaler has no way to know
that this will happen, and therefore cannot determine that scaling up
the ASG will create a node with the required resources. Thus, the ASG
needs a tag informing CA that the nodes it creates will indeed have
those resources, so scaling it up will allow the pod to be scheduled.
Although this feature of CA was added in 1.14, it apparently got broken
at some point and no longer works in 1.22. It works again in 1.26,
though.
[0]: https://github.com/kuberenetes-learning-group/fuse-device-plugin/tree/master
The *cri-o* package has moved from its own module into the base Fedora
repository, as Fedora is [eliminating modules][0]. The last modular
version was 1.25, which is too old to run pods with user namespaces.
Version 1.26 is available in the base repository, which does support
user namespaces.
[0]: https://fedoraproject.org/wiki/Changes/RetireModularity
Instead of hard-coding the AMI ID of the Fedora build we want, we can
use the `aws_ami` data source to search for it. The Fedora release team
has a consistent naming scheme for AMIs, so finding the correct one is
straightforward.
Lately, cloud nodes seem to be failing to come up more frequently. I
traced this to the fact that `/etc/resolv.conf` in the `kube-proxy`
container contains both the AWS-provided DNS server and the on-premises
server set by WireGuard. This evidently "works" sometimes, but not
always. When it doesn't, `kube-proxy` cannot resolve the Kubernetes API
server address, and thus cannot create the netfilter rules needed to
forward traffic correctly, leaving Pods unable to communicate.
I am not entirely sure what the "correct" solution to this problem would
be, since there are various issues in play here. Fortunately, cloud
nodes are only ever around for a short time, and never need to be
rebooted. As such, we can use a "quick fix" and simply remove the
AWS-provided DNS configuration.
The default configuration for the *kubelet.service* unit does not
specify the path to the `config.yml` generated by `kubeadm`. Thus, any
settings defined in the `kubelet-config` ConfigMap do not take effect.
To resolve this, we have to explicitly set the path in the `config`
property of the `kubeletExtraArgs` object in the join configuration.
The Cluster Autoscaler uses EC2 Auto-Scaling Groups to configure the
instances it launches when it determines additional worker nodes are
necessary. Auto-Scaling Groups have an associated Launch Template,
which describes the properties of the instances, such as AMI ID,
instance type, security groups, etc.
When instances are first launched, they need to be configured to join
the on-premises Kubernetes cluster. This is handled by *cloud-init*
using the configuration in the instance user data. The configuration
supplied here specifies the Fedora packages that need to be installed on
a Kubernetes worker node, plus some additional configuration required by
`kubeadm`, `kubelet`, and/or `cri-o`. It also includes a script that
fetches the WireGuard client configuration and connects to the VPN,
finalizes the setup process, and joins the cluster.
Initially, I thought it was necessary to use a ClusterRole in order to
assign permissions in one namespace to a service account in another. It
turns out, this is not necessary, as RoleBinding rules can refer to
subjects in any namespace. Thus, we can limit the privileges of the
*dynk8s-provisioner* service account by only allowing it access to the
Secret and ConfigMap resources in the *kube-system* and *kube-public*
namespaces, respectively, plus the Secret resources in its own
namespace.
The Cluster Autoscaler does not delete the Node resource in Kubernetes
after it terminates an instance:
> It does not delete the Node object from Kubernetes. Cleaning up Node
> objects corresponding to terminated instances is the responsibility of
> the cloud node controller, which can run as part of
> kube-controller-manager or cloud-controller-manager.
On-premises clusters are probably not running the Cloud Controller
Manager, so Node resources are liable to be left behind after a
scale-down event.
To keep unused Node resources from accumulating, the
*dynk8s-provisioner* will now delete the Node resource associated with
an EC2 instance when it receives a state-change event indicating the
instance has been terminated. To identify the correct Node, it compares
the value of the `providerID` field of each existing node with the
instance ID mentioned in the event. An exact match is not possible,
since the provider ID includes the availability zone of the instance,
which is not included in the event; however, instance IDs are unique
enough that this "should" never be an issue.
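
A minimal sketch of that matching logic with the `kube` and
`k8s-openapi` crates (names and error handling here are illustrative,
not the actual implementation):

```rust
use k8s_openapi::api::core::v1::Node;
use kube::api::{Api, DeleteParams, ListParams};
use kube::Client;

// Delete any Node whose providerID ends with the terminated instance's ID,
// e.g. "aws:///us-east-2a/i-0123456789abcdef0" for "i-0123456789abcdef0".
async fn delete_node_for_instance(client: Client, instance_id: &str) -> kube::Result<()> {
    let nodes: Api<Node> = Api::all(client);
    for node in nodes.list(&ListParams::default()).await?.items {
        let provider_id = node.spec.as_ref().and_then(|s| s.provider_id.clone());
        if provider_id.map_or(false, |p| p.ends_with(instance_id)) {
            let name = node.metadata.name.clone().unwrap_or_default();
            nodes.delete(&name, &DeleteParams::default()).await?;
        }
    }
    Ok(())
}
```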
Cargo uses the sources in the `tests` directory to build and run
integration tests. For each `tests/foo.rs` or `tests/foo/main.rs`, it
creates an executable that runs the test functions therein. These
executables are separate crates from the main package, and thus do not
have access to its private members. Integration tests are expected to
test only the public functionality of the package.
Application crates do not have any public members; their public
interface is the command line. Integration tests would typically run
the command (e.g. using `std::process::Command`) and test its output.
Since *dynk8s-provisioner* is not really a command-line tool, testing it
this way would be difficult; each test would need to start the server,
make requests to it, and then stop it. This would be slow and
cumbersome.
In order to avoid this tedium and be able to use Rocket's built-in test
client, I have converted *dynk8s-provisioner* into a library crate that
also includes an executable. The library makes the `rocket` function
public, which allows the integration tests to import it and pass it to
the Rocket test client.
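
For example, an integration test can now build and exercise the
instance directly; the route and crate-internal names in this sketch
are illustrative:

```rust
// tests/http.rs (illustrative)
use rocket::http::Status;
use rocket::local::blocking::Client;

#[test]
fn server_builds_and_responds() {
    // `rocket()` is the function exported by the library crate.
    let client = Client::tracked(dynk8s_provisioner::rocket())
        .expect("valid Rocket instance");
    // The route used here is just an example; any registered route works.
    let response = client.get("/").dispatch();
    assert_ne!(response.status(), Status::InternalServerError);
}
```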
The point of integration tests, of course, is to validate the
functionality of the application as a whole. This necessarily requires
allowing it to communicate with the Kubernetes API. In the Jenkins CI
environment, the application will need the appropriate credentials, and
will need to use a separate Kubernetes namespace from the production
deployment. The `setup.yaml` manifest in the `tests` directory defines
the resources necessary to run integration tests, and the
`genkubeconfig.sh` script can be used to create the appropriate
kubeconfig file containing the credentials. The kubeconfig is exposed
to the tests via the `KUBECONFIG` environment variable, which is
populated from a Jenkins secret file credential.
Note: The `data` directory moved from `test` to `tests` to avoid
duplication and confusing names.
When an instance is terminated, any bootstrap tokens assigned to it are
now deleted. Though these would expire anyway, deleting them ensures
that they cannot be used again if they happened to be leaked while the
instance was running. Further, it ensures that attempting to fetch the
`kubeadm` configuration for the instance will return an HTTP 404 Not
Found response once the instance has terminated.
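
Assuming the tokens carry an instance-ID label such as
`dynk8s.du5t1n.me/ec2-instance-id` (the exact key used for tokens is an
assumption here), the cleanup could be a single label-selector delete
with the `kube` crate:

```rust
use k8s_openapi::api::core::v1::Secret;
use kube::api::{Api, DeleteParams, ListParams};

// Remove every Secret labelled with the terminated instance's ID. The label
// key is assumed; the real code may track the association differently.
async fn purge_instance_secrets(secrets: Api<Secret>, instance_id: &str) -> kube::Result<()> {
    let selector = format!("dynk8s.du5t1n.me/ec2-instance-id={instance_id}");
    let lp = ListParams::default().labels(&selector);
    secrets.delete_collection(&DeleteParams::default(), &lp).await?;
    Ok(())
}
```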
The *GET /kubeadm/kubeconfig/<instance-id>* operation returns a
configuration document for `kubeadm` to add the node to the cluster as a
worker. The document is derived from the kubeconfig stored in the
`cluster-info` ConfigMap, which includes the external URL of the
Kubernetes API server and the root CA certificate used in the cluster.
The bootstrap token assigned to the specified instance is added to the
document for `kubeadm` to use for authentication. The kubeconfig is
stored in the ConfigMap as a string, so extracting data from it requires
deserializing the YAML document first.
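
A rough sketch of that extraction with the `kube` and `serde_yaml`
crates (simplified; not the actual implementation):

```rust
use k8s_openapi::api::core::v1::ConfigMap;
use kube::{Api, Client};

// Fetch the kubeconfig embedded in the kube-public/cluster-info ConfigMap and
// parse it so the server URL and CA data can be copied into the kubeadm join
// configuration.
async fn cluster_info_kubeconfig(client: Client) -> anyhow::Result<serde_yaml::Value> {
    let config_maps: Api<ConfigMap> = Api::namespaced(client, "kube-public");
    let cm = config_maps.get("cluster-info").await?;
    let raw = cm
        .data
        .and_then(|d| d.get("kubeconfig").cloned())
        .ok_or_else(|| anyhow::anyhow!("cluster-info has no kubeconfig entry"))?;
    // The kubeconfig is stored as a YAML string, so parse it before reading
    // out the server address and CA certificate data.
    Ok(serde_yaml::from_str(&raw)?)
}
```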
In order to access the cluster information ConfigMap, the service
account bound to the pod running the provisioner service must have the
appropriate permissions.
The *GET /wireguard/config/<instance-id>* resource returns the
WireGuard client configuration assigned to the specified instance ID.
The resource contents are stored in the Kubernetes Secret, in a data
field named `wireguard-config`. The contents of this field are returned
directly as a string, without any transformation. Thus, the value must
be a complete, valid WireGuard configuration document. Instances will
fetch and save this configuration when they first launch, to configure
their access to the VPN.
Setting up the WireGuard client requires several pieces of information
beyond the node's private key: the peer's public key, the peer endpoint
address and port, and the node's IP address are all required.
As such, naming the resource a "key" is somewhat misleading.
In order to join the on-premises Kubernetes cluster, EC2 instances will
need to first connect to the WireGuard VPN. The *dynk8s* provisioner
will provide keys to instances to configure their WireGuard clients.
WireGuard keys must be pre-configured on the server and stored in
Kubernetes as *dynk8s.du5t1n.me/wireguard-key* Secret resources. They
must also have a `dynk8s.du5t1n.me/ec2-instance-id` label. If this
label is empty, the key is available to be assigned to an instance.
When an EventBridge event is received indicating an instance is now
running, a WireGuard key is assigned to that instance (by setting the
`dynk8s.du5t1n.me/ec2-instance-id` label). Conversely, when an event is
received indicating that the instance is terminated, any WireGuard keys
assigned to that instance are freed.
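
A sketch of the assignment step with the `kube` crate (the selection
and patching details are illustrative, not the actual implementation):

```rust
use k8s_openapi::api::core::v1::Secret;
use kube::api::{Api, ListParams, Patch, PatchParams};

// Claim the first unassigned WireGuard key Secret for the given instance by
// writing the instance ID into its label. Returns the Secret's name, if any.
async fn assign_wireguard_key(
    secrets: Api<Secret>,
    instance_id: &str,
) -> kube::Result<Option<String>> {
    for secret in secrets.list(&ListParams::default()).await?.items {
        let is_wireguard_key =
            secret.type_.as_deref() == Some("dynk8s.du5t1n.me/wireguard-key");
        let unassigned = secret
            .metadata
            .labels
            .as_ref()
            .and_then(|l| l.get("dynk8s.du5t1n.me/ec2-instance-id"))
            .map_or(false, |v| v.is_empty());
        if is_wireguard_key && unassigned {
            let name = secret.metadata.name.clone().unwrap_or_default();
            let patch = serde_json::json!({
                "metadata": {
                    "labels": { "dynk8s.du5t1n.me/ec2-instance-id": instance_id }
                }
            });
            secrets
                .patch(&name, &PatchParams::default(), &Patch::Merge(&patch))
                .await?;
            return Ok(Some(name));
        }
    }
    Ok(None)
}
```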
The lifecycle of ephemeral Kubernetes worker nodes is driven by events
emitted by Amazon EventBridge and delivered via Amazon Simple
Notification Service. These events trigger the *dynk8s* provisioner to
take the appropriate action based on the state of an EC2 instance.
In order to add a node to the cluster using `kubeadm`, a "bootstrap
token" needs to be created. When manually adding a node, this would be
done e.g. using `kubeadm token create`. Since bootstrap tokens are just
a special type of Secret, they can be easily created programmatically as
well. When a new EC2 instance enters the "running" state, the
provisioner creates a new bootstrap token and associates it with the
instance by storing the instance ID in a label in the Secret resource's
metadata.
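
A rough sketch of that creation step with the `kube` crate follows; the
Secret name format, type, and usage flags are the standard
bootstrap-token conventions, while the label key is an assumption:

```rust
use std::collections::BTreeMap;

use k8s_openapi::api::core::v1::Secret;
use k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta;
use kube::api::{Api, PostParams};

// Create a bootstrap token Secret, assuming `secrets` is scoped to the
// kube-system namespace, and tag it with the EC2 instance ID.
async fn create_bootstrap_token(
    secrets: Api<Secret>,
    instance_id: &str,
    token_id: &str,
    token_secret: &str,
) -> kube::Result<Secret> {
    let string_data = BTreeMap::from([
        ("token-id".to_string(), token_id.to_string()),
        ("token-secret".to_string(), token_secret.to_string()),
        ("usage-bootstrap-authentication".to_string(), "true".to_string()),
        ("usage-bootstrap-signing".to_string(), "true".to_string()),
    ]);
    let token = Secret {
        metadata: ObjectMeta {
            name: Some(format!("bootstrap-token-{token_id}")),
            labels: Some(BTreeMap::from([(
                "dynk8s.du5t1n.me/ec2-instance-id".to_string(),
                instance_id.to_string(),
            )])),
            ..ObjectMeta::default()
        },
        type_: Some("bootstrap.kubernetes.io/token".to_string()),
        string_data: Some(string_data),
        ..Secret::default()
    };
    secrets.create(&PostParams::default(), &token).await
}
```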
The initial implementation of the event handler is rather naïve. It
generates a token for every instance, though some instances may not be
intended to be used as Kubernetes workers. Ideally, the provisioner
would only allocate tokens for instances matching some configurable
criteria, such as AWS tags. Further, a token is allocated every time
the instance enters the running state, even if a token already exists or
is not needed.
The `terraform` directory contains the resource descriptions for all AWS
services that need to be configured in order for the dynamic K8s
provisioner to work. Specifically, it defines the EventBridge rule and
SNS topic/subscriptions that instruct AWS to send EC2 instance state
change notifications to the *dynk8s-provisioner*'s HTTP interface.
Fedora 36 has OpenSSL 3, while the *rust* container image has OpenSSL
1.1. Since Fedora 35 is still supported, and it includes OpenSSL 1.1,
we can use it as our base for the runtime image.
Upon receipt of a notification or unsubscribe confirmation message from
SNS, after the message signature has been verified, the receiver will
now write the re-serialized contents of the message out to the
filesystem. This will allow the messages to be inspected later in order
to develop additional functionality for this service.
The messages are saved in a `messages` directory within the current
working directory. This directory contains a subdirectory for each SNS
topic. Within the topic subdirectories, each message is saved in a
file named with the message timestamp and ID.
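
Illustratively, saving a message boils down to something like the
following (the filename separator and extension are assumptions):

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

// Write a re-serialized SNS message to messages/<topic>/<timestamp>-<id>.json
// under the current working directory.
fn save_message(topic: &str, timestamp: &str, message_id: &str, body: &str) -> io::Result<PathBuf> {
    let dir = PathBuf::from("messages").join(topic);
    fs::create_dir_all(&dir)?;
    let path = dir.join(format!("{timestamp}-{message_id}.json"));
    fs::write(&path, body)?;
    Ok(path)
}
```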
This commit introduces the HTTP interface for the dynamic K8s node
provisioner. It will serve as the main communication point between the
ephemeral nodes in the cloud, sharing the keys and tokens they require
in order to join the Kubernetes cluster.
The initial functionality is simply an Amazon SNS notification receiver.
SNS notifications will be used to manage the lifecycle of the dynamic
nodes.
For now, the notification receiver handles subscription confirmation
messages by following the link provided to confirm the subscription.
All other messages are simply written to the filesystem; these will be
used to implement and test future functionality.
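
Confirming a subscription amounts to a single GET of the `SubscribeURL`
carried in the message; a sketch assuming the `reqwest` crate:

```rust
// Follow the SubscribeURL from a SubscriptionConfirmation message to confirm
// the subscription with SNS.
async fn confirm_subscription(subscribe_url: &str) -> reqwest::Result<()> {
    reqwest::get(subscribe_url).await?.error_for_status()?;
    Ok(())
}
```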
The `model::sns::Message` enumeration provides a mechanism for
deserializing a JSON document into the correct type. It will be used by
the HTTP operation that receives messages from SNS in order to determine
the correct action to take in response to the message.
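
The deserialization hinges on the `Type` field that SNS includes in
every message; a trimmed-down sketch of the approach (not the crate's
actual definition):

```rust
use serde::Deserialize;

// SNS sets "Type" on every message, which serde can use as an internal tag to
// pick the right variant.
#[derive(Debug, Deserialize)]
#[serde(tag = "Type")]
pub enum Message {
    SubscriptionConfirmation {
        #[serde(rename = "SubscribeURL")]
        subscribe_url: String,
    },
    Notification {
        #[serde(rename = "Message")]
        message: String,
    },
    UnsubscribeConfirmation {
        #[serde(rename = "Token")]
        token: String,
    },
}

// The HTTP handler can then deserialize the body and match on the variant.
pub fn parse(body: &str) -> serde_json::Result<Message> {
    serde_json::from_str(body)
}
```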
In order to prevent arbitrary clients from using the provisioner to
retrieve WireGuard keys and Kubernetes bootstrap tokens, access to those
resources *must* be restricted to the EC2 machines created by the
Kubernetes Cluster Autoscaler. The key to the authentication process will
be SNS notifications from AWS to indicate when new EC2 instances are
created; everything that the provisioner does will be associated with an
instance it discovered through an SNS notification.
SNS messages are signed using PKCS#1 v1.5 RSA-SHA1, with a public key
distributed in an X.509 certificate. To ensure that messages received
are indeed from AWS, the provisioner will need to verify those
signatures. Messages with missing or invalid signatures will be
considered unsafe and ignored.
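
At its core, the check is a standard X.509 / RSA-SHA1 verification; a
minimal sketch with the `openssl` crate, assuming the canonical string
and the signing certificate have already been obtained:

```rust
use openssl::base64;
use openssl::hash::MessageDigest;
use openssl::sign::Verifier;
use openssl::x509::X509;

// Verify the base64-encoded Signature field against the canonical string
// built from the message fields, using the certificate from SigningCertURL.
fn signature_is_valid(
    signing_cert_pem: &[u8],
    canonical_string: &[u8],
    signature_b64: &str,
) -> Result<bool, openssl::error::ErrorStack> {
    let cert = X509::from_pem(signing_cert_pem)?;
    let public_key = cert.public_key()?;
    let signature = base64::decode_block(signature_b64)?;
    // The default RSA padding for Verifier is PKCS#1 v1.5.
    let mut verifier = Verifier::new(MessageDigest::sha1(), &public_key)?;
    verifier.update(canonical_string)?;
    verifier.verify(&signature)
}
```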
The `model::sns` module includes the data structures that represent SNS
messages. The `sns::sig` module includes the primitive operations for
implementing signature verification.