infra/kickstart/pipeline/head This commit looks goodDetails
* Install `system-upgrade` plugin for `dnf`, since we'll almost always
want this in order to be able to update hosts
* Do not install _sshca-cli-systemd_; this package has been deprecated
and removed in favor of setting up the systemd units from Ansible
* Install _python3-libdnf5_, as this is required by Ansible and will be
installed by it later, so we can save a bit of time by always having
it installed.
Now that we're using Jinja to render the kickstart scripts, we can
separate out scripts, systemd unit files, etc. into their own files and
`include` them. This makes editing them much easier, especially since
syntax highlighting will work correctly.
Every time the job runs, the _Publish_ stage changes the timestamps of
the files on the server, even if their contents haven't changed. This
is because each build runs from a fresh checkout, so every file appears
to have just been created. To avoid this, and leave files on the server
alone unless they've changed, we now set the modification timestamp of
every file from its last commit.
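As a rough sketch of the idea (not necessarily how the job does it),
the timestamps can be taken from `git log` and applied with
`os.utime`. The function names here are made up for illustration, and
this assumes the publish tool compares or preserves modification times.

```python
import os
import subprocess
from pathlib import Path


def last_commit_times(repo: Path) -> dict[str, int]:
    """Map each tracked file to the Unix time of the last commit touching it."""
    tracked = subprocess.run(
        ["git", "-C", str(repo), "ls-files", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout.split("\0")
    times = {}
    for path in filter(None, tracked):
        # %ct is the committer date as a Unix timestamp
        out = subprocess.run(
            ["git", "-C", str(repo), "log", "-1", "--format=%ct", "--", path],
            check=True, capture_output=True, text=True,
        ).stdout.strip()
        if out:
            times[path] = int(out)
    return times


def apply_mtimes(root: Path, times: dict[str, int]) -> None:
    """Set the access and modification time of each listed file."""
    for rel, ts in times.items():
        os.utime(root / rel, (ts, ts))
```

Calling `apply_mtimes(repo, last_commit_times(repo))` in the checkout
before publishing would give unchanged files the same timestamp on
every build. Running one `git log` per file is slow on large
repositories, but should be fine for a handful of kickstart files.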
Anaconda seems to want to install this by default now. This is a
useless package with a bunch of security vulnerabilities and a hard
dependency on Polkit.
The drawback to the native `%include` Kickstart directive is that it
requires a static, hard-coded, absolute path. This means that we
cannot, for example, host a copy of the kickstarts from a different
branch for testing, without modifying the URLs of all the included
files.
Switching to using Jinja templates introduces a build step, but the
result is that the artifacts are self-contained. This way, they can be
deployed anywhere. I'm not sure where I'll put them, though, and
they'll need a Jenkins job to run the build and publish them.
When the SSH daemon is already configured to use an SSH host
certificate but the specified certificate file does not exist, then the
server will not try to use it later once it is created. This
essentially means that the certificate obtained during first boot will
not be used until the SSH daemon is restarted.
Rather than try to set all of this up in the kickstart, it's probably
better to just let Ansible do it. Then, the SSH daemon can be restarted
as needed automatically (by the host provisioner).
To initiate the automatic host provisioning process, a new machine must
trigger the _POST /host/online_ webhook. Included in the request are
the hostname of the new machine and its SSH host public keys.
Optionally, the request can also contain the name of a branch in the
configuration policy repository. For virtual machines, this branch
name can be specified by a QEMU `fw_cfg` option. The `fw_cfg` values in
sysfs are only readable by root, so the service must run as root, but
it does not need any additional privileges, so we can use systemd
sandbox features to restrict it.
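A minimal sketch of what such a request could look like follows. The
`fw_cfg` item name, the JSON field names, and the function names are
all assumptions for illustration; the commit does not specify the
payload format. Only the sysfs layout
(`/sys/firmware/qemu_fw_cfg/by_name/<name>/raw`) comes from the kernel
driver.

```python
import json
import urllib.request
from pathlib import Path

# Hypothetical fw_cfg item name; user-defined items live under "opt/".
FW_CFG_BRANCH = Path(
    "/sys/firmware/qemu_fw_cfg/by_name/opt/example.provision/branch/raw"
)


def read_branch(path: Path = FW_CFG_BRANCH):
    """Return the branch name from fw_cfg, or None if the item is absent."""
    try:
        return path.read_text().strip() or None
    except FileNotFoundError:
        return None


def build_payload(hostname, key_dir, branch=None):
    """Collect the hostname, SSH host public keys, and optional branch."""
    keys = [
        p.read_text().strip()
        for p in sorted(Path(key_dir).glob("ssh_host_*_key.pub"))
    ]
    payload = {"hostname": hostname, "sshkeys": keys}  # assumed field names
    if branch:
        payload["branch"] = branch
    return payload


def notify(url, payload):
    """POST the payload as JSON and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status
```

On a real host this would be called with `socket.gethostname()`,
`/etc/ssh`, and the result of `read_branch()`. Reading the `raw` file
is the only step that needs root.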
This feature is enabled by default for virtual machines. I haven't
quite figured out how to do the branch selection for physical machines
yet, but I will enable it for them once I do.
Delaying the _ssh-host-cert-sign@.service_ units starting until after
the clock is synchronized ends up causing _sshd.service_ to start way
before the host certificates are available. This prevents the SSH
daemon from using the host certificates until it is explicitly reloaded,
so clients will not be able to verify the server's authenticity
automatically on first boot. To ensure that clients (read: Ansible)
will be able to connect to the server when it first boots without any
manual interaction, we need to delay the _sshd.service_ unit starting
until the certificate files are present.
I think this can actually happen to any server, not just a Raspberry Pi,
but it definitely always happens on Pis. I may eventually apply this
change to the `ssh-host-cert-sign@.service` template unit file in the
_sshca-cli-systemd_ package, if it turns out to be a more common
problem.
This will allow the `fedora-rpi-common.ks` kickstart fragment to be more
composable, making it usable for systems other than "servers" that may
need a different disk layout.
Machines that use eMMC/SD cards for OS storage need a slightly different
disk layout than those with NVMe drives. Notably, we do not want swap
or `/tmp` on the eMMC, as that will not really improve performance at
all and will be hard on the flash memory.
For NVMe, there are two options available, with and without a swap
volume.
On machines without an RTC, the clock will likely be very wrong on first
boot when the system tries to obtain the initial SSH host certificates.
This results in the SSHCA server rejecting the request because the
authorization token has expired. To avoid this, we need to ensure the
clock is set before attempting to have the certificates signed.
Apparently something is populating `/etc/machine-id` at install time
now, which prevents units scheduled to run on first boot (with
`ConditionFirstBoot=true`) from starting.