Having name overrides for in-cluster services breaks ACME challenges,
because the server tries to connect to the Service instead of the
Ingress. To fix this, we need to configure both _cert-manager_ and
_step-ca_ to *only* resolve names using the network-wide DNS server.
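
One generic way to make a pod resolve names only through an external DNS server is to set `dnsPolicy: None` with an explicit `dnsConfig`. The sketch below builds such a patch. It is an assumption that this mechanism fits both workloads, and the nameserver address and the `dns_override_patch` helper are hypothetical placeholders.

```python
import json

# Hypothetical address of the network-wide DNS server (documentation range).
NETWORK_DNS = "192.0.2.53"


def dns_override_patch(nameserver: str) -> dict:
    """Build a patch that makes a workload's pods ignore cluster DNS."""
    return {
        "spec": {
            "template": {
                "spec": {
                    # "None" tells Kubernetes not to inject the cluster DNS
                    # settings; only the dnsConfig below is used.
                    "dnsPolicy": "None",
                    "dnsConfig": {"nameservers": [nameserver]},
                }
            }
        }
    }


patch = dns_override_patch(NETWORK_DNS)
print(json.dumps(patch, indent=2))
```

The printed JSON could be applied with something like `kubectl patch deployment <name> --type merge -p '<json>'`, or the same two fields could be written directly into the pod template of each manifest.
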

It turns out that `step ca renew` _can_ renew certificates without mTLS:
its `--mtls=false` command-line argument configures it to authenticate
with a JWT signed with the certificate's private key, instead of
presenting the certificate at the transport layer. This allows clients
to renew their certificates without needing another authentication
mechanism, even behind the TLS-terminating proxy.

By default, step-ca issues certificates that are valid for only one day.
This means that clients need to schedule multiple renewal attempts
throughout the day; otherwise, a single missed attempt could let their
certificates expire. This is unnecessary, and not even possible in all
cases, so let's make the default validity period longer and avoid the
issue.
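
To make the trade-off concrete, here is a rough sketch of how many renewal attempts can fail before a freshly issued certificate expires, followed by a `claims` fragment of the kind `ca.json` accepts. The `spare_attempts` helper, the renewal intervals, and the `168h`/`720h` durations are illustrative assumptions, not the values actually chosen. It also assumes the global claims in `ca.json` are the ones being changed, rather than per-provisioner claims.

```python
import json


def spare_attempts(validity_hours: int, interval_hours: int) -> int:
    """Renewal attempts that may fail before a fresh certificate expires.

    Attempts happen at interval, 2*interval, ... after issuance; only those
    strictly before expiry count, and at least one of them must succeed.
    """
    attempts_before_expiry = (validity_hours - 1) // interval_hours
    return max(attempts_before_expiry - 1, 0)


# One-day certificates (the step-ca default), renewed every 8 hours.
print(spare_attempts(24, 8))    # one attempt may fail
# Hypothetical week-long certificates, renewed once a day.
print(spare_attempts(168, 24))  # five attempts may fail

# Claims fragment for ca.json; the durations are example values only.
claims = {
    "defaultTLSCertDuration": "168h",
    "maxTLSCertDuration": "720h",
}
print(json.dumps({"authority": {"claims": claims}}, indent=2))
```

Note that the maximum duration has to be raised along with the default, since the default cannot exceed it.
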

Although most libraries support Ed25519 signatures for X.509
certificates, Firefox does not. This means that any certificate signed
by DCH CA R3 cannot be verified by the browser, which will therefore
always present a certificate error.

I want to migrate internal services that do not need certificates
that are trusted by default (i.e. they are only accessed programmatically
or only I use them in the browser) back to using an internal CA instead
of the public *pyrocufflink.net* wildcard certificate. Applications
like Frigate and UniFi Network need certificates signed by a CA that
the browser will trust, so the Ed25519 certificate is inappropriate.
Thus, I've decided to migrate back to DCH CA R2, which uses an ECDSA
signature, and can therefore be trusted by Firefox, etc.

I never ended up using _Step CA_ for anything, since I was initially
focused on the SSH CA feature and I was unhappy with how it worked
(which led me to write _SSHCA_). I didn't think about it much until I
was working on deploying Grafana Loki. For that project, I wanted to
use a certificate signed by a private CA instead of the wildcard
certificate for _pyrocufflink.blue_. So, I created *DCH CA R3* for that
purpose. Then, for some reason, I used the exact same procedure to
fetch the certificate from Kubernetes as I had set up for the
_pyrocufflink.blue_ wildcard certificate, as used by Frigate. This of
course defeated the purpose, since I could have just as easily used
the wildcard certificate in that case.

When I discovered that Grafana Loki expects to be deployed behind a
reverse proxy in order to implement access control, I took the
opportunity to reevaluate the certificate issuance process. Since a
reverse proxy is required to implement the access control I want (anyone
can push logs but only authenticated users can query them), it made
sense to choose one with native support for requesting certificates via
ACME. This would eliminate the need for `fetchcert` and the
corresponding Kubernetes API token. Thus, I ended up deciding to
redeploy _Step CA_ with the new _DCH CA R3_ for this purpose.

[Step CA] is an open-source online X.509 and SSH certificate authority
service. It supports issuing certificates via various protocols,
including ACME and its own HTTP API via the `step` command-line utility.
Clients can authenticate using a variety of methods, such as JWK, OpenID
Connect, or mTLS. This makes it very flexible and easy to introduce
to an existing ecosystem.

Although the CA service is mostly stateless, it does have an on-disk
database where it stores some information, notably the list of SSH hosts
for which it has signed certificates. Most other operations, though, do
not require any persistent state; the service does not keep track of
every single certificate it has signed, for example. It can be configured
to store authentication information (referred to as "provisioners") in
the database instead of the configuration file, by enabling the "remote
provisioner management" feature. This has the advantage of being able
to modify authentication configuration without updating a Kubernetes
ConfigMap and restarting the service.

The official Step CA documentation recommends using the `step ca init`
command to initialize a new certificate authority. This command performs
a few steps:

* Generates an ECDSA key pair and uses it to create a self-signed root
  certificate
* Generates a second ECDSA key pair and signs an intermediate CA
  certificate using the root CA key
* Generates an ECDSA key pair and SSH root certificate
* Creates a `ca.json` configuration file

These steps can be performed separately, and in fact, I created the
intermediate CA certificate and signed it with the (offline) *dch Root
CA* certificate.

When the service starts for the first time, because
`authority/enableAdmin` is `true` and `authority/provisioners` is empty,
a new "Admin JWK" provisioner will be created automatically. Its private
key will be encrypted with the same password used to encrypt the
intermediate CA certificate private key, and can be used to create other
provisioners.
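
The first-start condition can be modelled as a small predicate over the parsed `ca.json`. This is only a sketch of the behaviour described above, not step-ca's actual logic; the `will_create_admin_provisioner` helper and the sample provisioner entry are hypothetical.

```python
def will_create_admin_provisioner(ca_config: dict) -> bool:
    """Model the first-start condition described above (not step-ca's code)."""
    authority = ca_config.get("authority", {})
    admin_enabled = authority.get("enableAdmin", False) is True
    no_provisioners = not authority.get("provisioners")
    return admin_enabled and no_provisioners


fresh = {"authority": {"enableAdmin": True, "provisioners": []}}
# Hypothetical existing ACME provisioner entry.
configured = {
    "authority": {
        "enableAdmin": True,
        "provisioners": [{"type": "ACME", "name": "acme"}],
    }
}

print(will_create_admin_provisioner(fresh))       # True
print(will_create_admin_provisioner(configured))  # False
```
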

[Step CA]: https://smallstep.com/docs/step-ca/