Commit Graph

5 Commits (1db158c1505cd67538cf76700373529b7398f0ad)

Author SHA1 Message Date
Dustin 878ff7acb5 loki: Deploy Caddy in front of Loki
Grafana Loki explicitly eschews built-in authentication.  In fact, its
[documentation][0] states:

> Operators are expected to run an authenticating reverse proxy in front
> of your services.

While I don't really want to require authentication for agents sending
logs, I definitely want to restrict querying and viewing logs to trusted
users.

There are _many_ reverse proxy servers available, and normally I would
choose _nginx_.  In this case, though, I decided to try Caddy, mostly
because of its built-in ACME support.  I wasn't really happy with how
the `fetchcert` system turned out, particularly using the Kubernetes API
token for authentication.  Since the token will eventually expire, it
will require manual intervention to renew, thus mostly defeating the
purpose of having an auto-renewing certificate.  So instead of using
_cert-manager_ to issue the certificate and store it in Kubernetes, and
then having `fetchcert` download it via the Kubernetes API, I set up
_step-ca_ to handle issuing the certificate directly to the server. When
Caddy starts up, it contacts _step-ca_ via ACME and handles the
challenge verification automatically.  Further, it will automatically
renew the certificate as necessary, again using ACME.

I didn't spend a lot of time optimizing the Caddy configuration, so
there's some duplication there (i.e. the multiple `reverse_proxy`
statements), but the configuration works as desired.  Clients may
provide a certificate, which will be verified against the trusted issuer
CA.  If the certificate is valid, the client may access any Loki
resource.  Clients that do not provide a certificate can only access the
ingestion path, as well as the "ready" and "metrics" resources.

[0]: https://grafana.com/docs/loki/latest/operations/authentication/
2024-02-21 07:47:51 -06:00
Dustin 45c35c065a promtail: Deploy Loki Promtail Agent
[Promtail][0] is the log collection agent for Grafana Loki.  It reads
logs from various locations, including local files and the _systemd_
journal and sends them to Loki via HTTP.

Loki configuration is a highly-structured YAML document.  Thus, instead
of using Tera template syntax for loops, conditionals, etc., we can use
the full power of CUE to construct the configuration.  Using the
`Marshal` function from the built-in `encoding/yaml` package, we
serialize the final configuration structure as a string and write it
verbatim to the configuration file.

I have modeled most of the Promtail configuration schema in the
`du5t1n.me/cfg/app/promtail/schema` package.  Having the schema modeled
will ensure the generated configuration is valid during development
(i.e. `cue export` will fail if it is not), which will save time pushing
changes to machines and having Loki complain.

The `#promtail` "function" in `du5t1n.me/cfg/env/prod` makes it easy to
build our desired configuration.  It accepts an optional `#scrape`
field, which can be used to provide specific log scraping definitions.
If it is unspecified, the default configuration is to scrape the systemd
journal.  Hosts with additional needs can supply their own list,
probably including the `promtail.scrape.journal` object in it to get the
default journal scrape job.

[0]: https://grafana.com/docs/loki/latest/send-data/promtail/
2024-02-18 11:35:13 -06:00
Dustin 011058aec3 loki: Use fetchcert to manage server certificate
Before going into production with Grafana Loki, I want to set it up to
use TLS.  To that end, I have configured _cert-manager_ to issue it a
certificate, signed by _DCH CA_.  In order to use said certificate,
we need to configure `fetchcert` to run on the Loki server.
2024-02-18 11:35:13 -06:00
Dustin ffe450cd30 loki: Run Grafana Loki in a container
Deploying Loki is pretty straightforward.  It just needs a container
unit file and a basic YAML configuration file.
2024-02-13 19:54:48 -06:00
Dustin 45285b9c47 host: Add loki0.p.b
*loki0.pyrocufflink.blue* will host [Grafana Loki][0], a log aggregation
system.

[0]: https://grafana.com/oss/loki/
2024-02-13 16:55:05 -06:00