157 Commits (89516ebf55b452df6f608a4f0ea97ff35c521536)

Author SHA1 Message Date
Dustin 5e251153c7 cert-manager: Install cert-manager
*cert-manager* manages certificates.  More specifically, it is an ACME
client, which generates certificate-signing requests, submits them to a
certificate authority, and stores the signed certificate in Kubernetes
secrets.  The certificates it manages are defined by Kubernetes
Custom Resources, either defined manually or automatically for Ingress
resources with particular annotations.

The *cert-manager* deployment consists primarily of two services:
*cert-manager* itself, which monitors Kubernetes resources and manages
certificate requests, and the *cert-manager-webhook*, which validates
Kubernetes resources for *cert-manager*.  There is also a third
component, *cainjector*, but we do not need it.

The primary configuration for *cert-manager* is done through Issuer and
ClusterIssuer resources.  These define how certificates are issued: the
certificate authority to use and how to handle ACME challenges.  For our
purposes, we will be using ZeroSSL to issue certificates, verified via
the DNS.01 challenge through BIND running on the gateway firewall.
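
As a rough sketch, a ClusterIssuer for this setup might look like the
following.  It assumes the DNS.01 challenge is solved with RFC 2136
dynamic updates, which the description above does not actually state;
the issuer name, email address, nameserver address, TSIG key name, and
Secret names are all placeholders.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: zerossl                      # placeholder name
spec:
  acme:
    server: https://acme.zerossl.com/v2/DV90
    email: admin@example.org         # placeholder
    privateKeySecretRef:
      name: zerossl-account-key      # placeholder Secret for the ACME account key
    # ZeroSSL requires External Account Binding credentials
    externalAccountBinding:
      keyID: REPLACE_WITH_EAB_KEY_ID
      keySecretRef:
        name: zerossl-eab            # placeholder Secret holding the EAB HMAC key
        key: secret
    solvers:
    - dns01:
        # Assumption: BIND accepts TSIG-signed dynamic updates
        rfc2136:
          nameserver: 192.0.2.1:53   # placeholder address of the gateway
          tsigKeyName: cert-manager  # placeholder TSIG key name
          tsigAlgorithm: HMACSHA256
          tsigSecretSecretRef:
            name: rfc2136-tsig       # placeholder Secret
            key: tsig-secret-key
```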
2023-05-01 20:22:35 -05:00
Dustin 4952e6f278 storage: Upgrade Longhorn to v1.4.1 2023-04-24 23:21:55 -05:00
Dustin 572ea54dd3 authelia: Set OIDC consent duration
By default, Authelia requires the user to explicitly consent to allow
an application access to personal information *every time the user
authenticates*.  This is rather annoying, but luckily, it provides a
way to remember the consent for a period of time.
2023-04-23 15:56:50 -05:00
Dustin b5574fa5fc authelia: Skip scanserv-js auth for internal
For convenience, clients on the internal network do not need to
authenticate in order to access *scanserv-js*.  There isn't anything
particularly sensitive about this application, anyway.
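
A minimal sketch of what such a rule could look like in the Authelia
configuration.  The hostname and network range are placeholders, not
the values from this commit, and the fallback policy is an assumption.

```yaml
access_control:
  rules:
  # Requests from the internal network skip authentication entirely
  - domain: scan.example.org       # placeholder hostname
    networks:
    - 192.0.2.0/24                 # placeholder internal range
    policy: bypass
  # Everyone else still has to log in (assumed fallback policy)
  - domain: scan.example.org
    policy: one_factor
```

Rules are evaluated in order, so the `bypass` rule has to come before
the more general one.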
2023-04-23 15:55:42 -05:00
Dustin 24465dc7da authelia: Set up OIDC for k8s API server
Enabling OpenID Connect authentication for the Kubernetes API server
will allow clients, particularly `kubectl`, to log in without needing
TLS certificates and private keys.
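
On the API server side, this comes down to a handful of `--oidc-*`
flags.  The sketch below assumes the cluster is managed with kubeadm,
which is not stated here; the issuer URL, client ID, and claim names
are placeholders.

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Each key becomes a --<key>=<value> flag on kube-apiserver
    oidc-issuer-url: https://auth.example.org   # placeholder Authelia URL
    oidc-client-id: kubernetes                  # placeholder client ID
    oidc-username-claim: preferred_username     # assumed claim
    oidc-groups-claim: groups                   # assumed claim
```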
2023-04-22 21:37:23 -05:00
Dustin bcb54d4010 authelia: Add README 2023-04-22 21:35:28 -05:00
Dustin b2e1e29087 authelia: Enable two-factor auth for Paperless-ngx 2023-04-22 08:00:19 -05:00
Dustin 5b99e94809 scanservjs: ingress: Increase proxy read timeout
*scanserv-js* blocks the HTTP request while waiting for a scan to
complete.  For large, multi-page documents, the scan can take several
minutes.  To prevent the request from timing out and interrupting the
scan, we need to increase the proxy timeout configuration.
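
With *ingress-nginx*, the read timeout is set per Ingress with an
annotation.  The value below is only an example, not necessarily the
one used in this commit.

```yaml
metadata:
  annotations:
    # Seconds nginx waits for the backend to send data; must be a string
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
```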
2023-04-20 17:40:58 -05:00
Dustin d3671818fc scanservjs: Add config overrides for PIXMA G7020
The Canon PIXMA G7020 reports the supported dimensions of the flatbed,
but its automatic document feeder supports larger paper sizes.
Fortunately, *scanserv-js* provides a (somewhat kludgey) mechanism to
override the reported settings with more appropriate values.
2023-04-20 17:38:58 -05:00
Dustin b9b3c4762b phpipam: Update to v1.5.2
We don't need to build our own container image anymore, since the new
*pyrocufflink.blue* domain controllers use LDAPS certificates signed by
Let's Encrypt.
2023-04-20 13:59:30 -05:00
Dustin 1c31c01688 scanservjs: Deploy scanserv-js
*scanserv-js* is a web-based front-end for SANE.  It allows scanning
documents from a browser.

Using the `config.local.js` file, we implement the `afterScan` hook to
automatically upload scanned files to *paperless-ngx* using its REST
API.
2023-04-19 21:29:14 -05:00
Dustin 8a966a7ffb authelia: Enable OIDC provider
Authelia can act as an OpenID Connect identity provider.  This allows
it to provide authentication/authorization for other applications
besides those inside the Kubernetes cluster using it for Ingress
authentication.

To start with, we'll configure an OIDC client for Jenkins.
2023-01-25 10:36:22 -06:00
Dustin e38245dc63 authelia: Add startup probe
I am not entirely sure why, but it seems like the kubelet *always*
misses the first check in the readiness probe.  This causes a full
60-second delay before the Authelia pod is marked as "ready," even
though it was actually ready within a second of the container starting.

To avoid this very long delay, during which Authelia is unreachable,
even though it is working fine, we can add a startup probe with a much
shorter check interval.  The kubelet will not start readiness probes
until the startup probe returns successfully, so it won't miss the first
one any more.
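
As a sketch, the pair of probes might look like this in the container
spec.  The health endpoint, port, and thresholds are assumptions, not
copied from the manifest.

```yaml
# Polled every second until it succeeds; readiness checks only begin
# after that
startupProbe:
  httpGet:
    path: /api/health    # assumed Authelia health endpoint
    port: 9091           # assumed listen port
  periodSeconds: 1
  failureThreshold: 30   # give up after roughly 30 seconds
readinessProbe:
  httpGet:
    path: /api/health
    port: 9091
  periodSeconds: 60
```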
2023-01-25 10:32:30 -06:00
Dustin 48ed48752f paperless-ngx: Deploy application
*Paperless-ngx* is a document management system.  It provides tools for
organizing, indexing, and searching documents, including OCR.
2023-01-13 21:33:14 -06:00
Dustin df12690958 storage: Use Authelia for Longhorn UI auth
Instead of using a static username/password and HTTP Basic
authentication for the Longhorn UI, we can now use Authelia via the
*nginx* auth subrequest functionality.
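
With *ingress-nginx*, the subrequest is wired up with annotations on
the Ingress.  The following is a sketch only: the Service address, the
`/api/verify` path, and the portal URL are assumptions about how
Authelia is deployed.

```yaml
metadata:
  annotations:
    # nginx asks Authelia whether each request is allowed
    nginx.ingress.kubernetes.io/auth-url: http://authelia.authelia.svc.cluster.local:9091/api/verify
    # Unauthenticated users are redirected to the login portal
    nginx.ingress.kubernetes.io/auth-signin: https://auth.example.org/?rm=$request_method
```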
2023-01-13 21:33:14 -06:00
Dustin 42bc4ae187 authelia: Install Authelia
Authelia is a general authentication provider that works (primarily)
by integrating with *nginx* using its subrequest mechanism.  It works
great with Kubernetes/*ingress-nginx* to provide authentication for
services running in the cluster, especially those that do not provide
their own authentication system.

Authelia needs a database to store session data.  It supports various
engines, but since we're only running a very small instance with no real
need for HA, SQLite on a Longhorn persistent volume is sufficient.

Configuration is done mostly through a YAML document, although some
secret values are stored in separate files, which are pointed to by
environment variables.
2023-01-13 21:33:14 -06:00
Dustin ce0440a33c ntfy: Allow notification attachments
*ntfy* allows notifications to include arbitrary file attachments.  For
images, it will even show them in the notification.  In order to support
this, the server must be configured with a writable filesystem location
to cache the files.
2023-01-13 09:41:10 -06:00
Dustin b13479a297 jenkins: Remove dockerconfigjson
This is no longer necessary.
2022-12-28 11:05:40 -06:00
Dustin 06f7c55911 ntfy: Deploy ntfy.sh
*ntfy* is a simple but powerful push notification service.
2022-12-18 17:43:47 -06:00
Dustin 8440c2a486 hudctrl: Update for v0.2.0
Version 0.2.0 of the HUD Controller is stateful.  It requires writable
storage for its configuration file, as it updates the file when display
settings and screen URLs are changed.

While we're making changes, let's move it to its own namespace.
2022-12-18 16:26:07 -06:00
Dustin 1d199a0e75 autoscaler: Tolerate control-plane taint
Kubernetes 1.24 introduced a new taint for Control Plane nodes that must
be tolerated in addition to the original taint in order for pods to be
scheduled to run on such nodes.
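
In the pod spec, that means tolerating both taints, roughly like so:

```yaml
tolerations:
# Original taint
- key: node-role.kubernetes.io/master
  operator: Exists
  effect: NoSchedule
# Taint introduced in Kubernetes 1.24
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule
```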
2022-12-16 17:20:22 -06:00
Dustin 10ee364612 jenkins: Add ssh_known_hosts ConfigMap
When cloning/fetching a Git repository in a Jenkins pipeline, the Git
Client plugin uses the configured *Host Key Verification Strategy* to
verify the SSH host key of the remote Git server.  Unfortunately, there
does not seem to be any way to use the configured strategy from the
`git` command line in a Pipeline job, so e.g. `git push` does not
respect it.  This causes jobs to fail to push changes to the remote if
the container they're using does not already have the SSH host key for
the remote in its known hosts database.

This commit adds a ConfigMap to the *jenkins-jobs* namespace that can be
mounted in containers to populate the SSH host key database.
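
A sketch of the idea, with a placeholder host name and key, and an
assumed ConfigMap name and mount path:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ssh-known-hosts            # assumed name
  namespace: jenkins-jobs
data:
  # Placeholder entry; the real file lists the Git server's host keys
  ssh_known_hosts: |
    git.example.org ssh-ed25519 AAAA...placeholder
---
# Fragment of a pod spec that mounts the file where OpenSSH looks for it
spec:
  containers:
  - name: build                    # hypothetical container
    volumeMounts:
    - name: ssh-known-hosts
      mountPath: /etc/ssh/ssh_known_hosts
      subPath: ssh_known_hosts
  volumes:
  - name: ssh-known-hosts
    configMap:
      name: ssh-known-hosts
```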
2022-12-10 12:19:33 -06:00
Dustin 889cd29a3c jenkins: Update to 2.375.1
I don't want Jenkins updating itself whenever the pod restarts, so I'm
going to pin it to a specific version.  This way, I can be sure to take
a snapshot of the data volume before upgrading.
2022-12-02 22:15:11 -06:00
Dustin b8ccbd0b09 jenkins: Avoid SELinux relabel of data dir
Setting a static SELinux level for the container allows CRI-O to skip
relabeling all the files in the persistent volume each time the
container starts.  For this to work, the pod needs a special annotation,
and CRI-O itself has to be configured to respect it:

```toml
[crio.runtime.runtimes.runc]
allowed_annotations = ["io.kubernetes.cri-o.TrySkipVolumeSELinuxLabel"]
```

This *dramatically* improves the start time of the Jenkins container.
Instead of taking 5+ minutes, it now starts instantly.
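
On the pod side, the two pieces look roughly like this.  The MCS level
shown is an arbitrary example, not the value used in the manifest.

```yaml
metadata:
  annotations:
    # Ask CRI-O to skip relabeling when the volume is already labeled
    io.kubernetes.cri-o.TrySkipVolumeSELinuxLabel: "true"
spec:
  securityContext:
    seLinuxOptions:
      # A fixed level, so the labels stay valid across restarts
      level: s0:c525,c600          # example value
```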

https://github.com/cri-o/cri-o/issues/6185#issuecomment-1334719982
2022-12-01 21:35:02 -06:00
Dustin 2c794a9399 Merge branch 'jenkins' 2022-11-25 13:41:51 -06:00
Dustin 404fadc68a jenkins: Run Jenkins in Kubernetes
Running Jenkins in Kubernetes is relatively straightforward.  The
Kubernetes plugin automatically discovers all the connection and
authentication configuration, so a `kubeconfig` file is no longer
necessary.  I did set the *Jenkins tunnel* option, though, so that
agents will connect directly to the Jenkins JNLP port instead of going
through the ingress controller.

Jobs now run in pods in the *jenkins-job* namespace instead of the
*jenkins* namespace.  The latter is now where the Jenkins controller
runs, and the controller should not have permission to modify its own
resources.
2022-11-25 13:38:10 -06:00
Dustin 61378e9724 dynk8s: Fix Ingress routing
I guess I thought `defaultBackend` was scoped to the TLS host, but it
appears to be global, across all Ingress resources in the cluster.
Thus, it really doesn't make any sense for any Ingress to have a default
backend, and certainly not the dynk8s provisioner.
2022-11-24 11:14:01 -06:00
Dustin 19ad5023b8 jenkins: Restrict role permissions
Jenkins doesn't really need full control of all resources in its
namespace.  Rather, it only needs to be able to manage Pod and
PersistentVolumeClaim resources.
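
A restricted Role along these lines would do it.  The name, namespace,
and exact subresources and verbs below are assumptions for
illustration, not the committed manifest.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins                    # assumed name
  namespace: jenkins               # assumed namespace
rules:
# Core API group: agent pods and their volume claims only
- apiGroups: [""]
  resources: ["pods", "pods/exec", "pods/log", "persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "delete"]
```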
2022-11-18 13:52:25 -06:00
Dustin 668b5bf5a9 kitchen: Allow Jenkins to restart deployment
Jenkins is now allowed to restart the Deployment named *kitchen* in the
*kitchen* namespace.  It will do this after pushing a new container
image from a build of the *master* branch.
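
A sketch of a Role that permits only this, using `resourceNames` to
limit it to the one Deployment.  The Role name and the Jenkins
ServiceAccount name and namespace are assumptions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-restart            # assumed name
  namespace: kitchen
rules:
# `kubectl rollout restart` reads the Deployment and then patches it
- apiGroups: ["apps"]
  resources: ["deployments"]
  resourceNames: ["kitchen"]
  verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-restart
  namespace: kitchen
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins-restart
subjects:
- kind: ServiceAccount
  name: default                    # assumed ServiceAccount used by jobs
  namespace: jenkins-jobs          # assumed namespace
```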
2022-11-06 17:22:46 -06:00
Dustin de054bd68f kitchen: Add manifest for kitchen screen server
I decided to run the kitchen screen service in Kubernetes rather than on
the Raspberry Pi in the kitchen.  This will hopefully make it a bit more
reliable and easier to update.  It will also make it easier to rebuild
the OS on the Pi, if it ever becomes necessary, since it really only
needs Firefox (and MQTTDPMS) now.
2022-11-05 16:39:22 -05:00
Dustin 5208902706 metrics: Add role to allow anon access to metrics
By default, the Kubernetes metrics endpoints are restricted.  I don't
think they're worth protecting with authentication, so I've added a
cluster role/binding to allow anonymous access to them.
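
Roughly, this takes a ClusterRole on the non-resource URL and a
binding to the built-in anonymous user.  The object names are
placeholders, and this sketch covers only the API server's `/metrics`
path.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: anonymous-metrics          # placeholder name
rules:
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: anonymous-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: anonymous-metrics
subjects:
# Unauthenticated requests are attributed to this user
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:anonymous
```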
2022-11-05 16:23:02 -05:00
Dustin 6df6e552b7 longhorn: Remove node selector labels
I originally added the `du5t1n.me/storage` label to the x86_64 nodes and
configured Longhorn to only run on nodes with those labels because I
thought that was the correct way to control where volume replicas are
stored.  It turns out that this was incorrect, as it prevented Longhorn
from running on non-matching nodes entirely.  Thus, any machine that was
not so labeled could not access any Longhorn storage volumes.

The correct way to limit where Longhorn stores volume replicas is to
enable the `create-default-disk-labeled-nodes` setting.  With this
setting enabled, Longhorn will run on all nodes, but will not create
"disks" on them unless they have the
`node.longhorn.io/create-default-disk` label set to `true`.  Nodes that
do not have "disks" will not store volume replicas, but will run the
other Longhorn components and can therefore access Longhorn volumes.

Note that changing the "default settings" ConfigMap does not change the
setting once Longhorn has been deployed.  To update the setting on an
existing installation, the setting has to be changed explicitly:

```sh
kubectl get setting -n longhorn-system -o json \
    create-default-disk-labeled-nodes \
    | jq '.value="true"' \
    | kubectl apply -f -
```
2022-10-11 21:58:43 -05:00
Dustin a683505e5d dynk8s-provisioner: Add manifest 2022-10-11 21:58:22 -05:00
Dustin 3755aaab6f autoscaler: Add manifest for Cluster Autoscaler 2022-10-11 21:57:59 -05:00
Dustin 5f2aaefc35 storage: Set default storage class
Setting a default storage class allows PersistentVolumeClaims to be
declared without selecting a specific storage class in each object spec.
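
For illustration, the default is marked with an annotation on the
StorageClass.  The class name and provisioner below assume the
Longhorn class is the one being made the default.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn                   # assumed class name
  annotations:
    # Claims that omit storageClassName get this class
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: driver.longhorn.io
```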
2022-08-23 21:21:54 -05:00
Dustin 76875e3dbf storage: Show how to create admin password secret 2022-08-23 21:21:43 -05:00
Dustin 8f6373fb70 storage: Fix typo in node selector 2022-08-23 21:21:22 -05:00
Dustin 7bd7dc7b18 ingress: Show how to import cert as secret 2022-08-23 21:20:47 -05:00
Dustin 102d1fb919 setup: ks: Generate iSCSI initiator name
The iSCSI initiator needs a unique name.  It will generate one the first
time it starts if one does not already exist.  However, it tries to
write the name to a file under `/etc`, which fails because the root
filesystem is read-only.  As such, we need to generate the name during
installation, when the filesystem is still writable.
2022-08-23 21:22:01 -05:00
Dustin be2f0e5f72 prom_speedtest: Add application manifest
The Raspberry Pi is too slow to run the speed test and get accurate
results.
2022-08-06 22:21:06 -05:00
Dustin 52a6481733 hudctrl: Add manifest for Basement HUD controller 2022-08-02 21:46:32 -05:00
Dustin 6c7dcce90b setup: switch back to ext4 on lvm
Originally, I decided to use *btrfs* subvolumes to create writable
directories inside otherwise immutable locations, such as for
`/etc/cni/net.d`, etc.  I figured this would be cleaner than
bind-mounting directories from `/var`, and would avoid the trouble of
determining appropriate volume sizes necessary to make them each
their own filesystem.

Unfortunately, it turns out that *cri-o* may still have some issues with
its *btrfs* storage driver.  One [blog post][0] hints at performance
issues in *containerd*, and it seems they may apply to *cri-o* as well.
I certainly encountered performance issues when attempting to run `npm`
in a Jenkins job running in a Kubernetes pod.  There is definitely a
[performance issue with `npm`][1] when running in a container, which may
or may not have been exacerbated by the *btrfs* storage driver.

In any case, upstream [does not recommend][2] using the *btrfs* driver,
performance notwithstanding.  The *overlay* driver is much more widely
used and tested.  Plus, it's easier to filter out container layers from
filesystem usage statistics simply by ignoring *overlay* filesystems.

[0]: https://blog.cubieserver.de/2022/dont-use-containerd-with-the-btrfs-snapshotter/
[1]: https://github.com/npm/cli/issues/3208#issuecomment-1002990902
[2]: https://github.com/containers/storage/issues/929
2022-07-31 17:09:03 -05:00
Dustin c7a3477c9e setup: Convert tabs to spaces 2022-07-31 01:40:16 -05:00
Dustin bd8ae87036 setup: Fix typo in README 2022-07-31 01:39:54 -05:00
Dustin 4cce8df62d README: Add storage section 2022-07-31 01:38:46 -05:00
Dustin 157353ddb0 phpipam: Add manifest for phpipam 2022-07-31 01:31:53 -05:00
Dustin 2a07a7856f docker-distribution: Deploy OCI image registry
We're going to need a place to store custom container images to run on
the Kubernetes cluster!

This is my first from-scratch manifest!
2022-07-31 01:15:01 -05:00
Dustin 9b86a117ef storage: Add manifest for Longhorn
I was originally going to use GlusterFS to provide persistent storage
for pods, but [Heketi][0], the component that provides the API for
the Kubernetes StorageClass, is in "deep maintenance" status and looks
to be practically dead.  I was a bit afraid to try to use it because of
that, and went looking for guidance on Reddit, which is how I discovered
Longhorn.
2022-07-31 00:57:53 -05:00
Dustin 30cbc568d0 ingress: Add manifest for ingress-nginx
This manifest deploys the *ingress-nginx* controller, which is
responsible for handling traffic from clients outside the cluster and
routing it to the proper pods.  I am using host network mode here to
avoid having to have another proxy in front of the ingress controller,
which would be required in NodePort mode.

I looked at MetalLB briefly, but decided to avoid it for now.  As with
everything else in the Kubernetes world, it seems massively complex.
2022-07-31 00:57:12 -05:00
Dustin ac4d9c1f21 jenkins: Fix typo in README 2022-07-31 00:42:42 -05:00