Because *vmhost1.pyrocufflink.blue* is usually sleeping, continuous
enforcement jobs always fail. By keeping it in a separate inventory
file, configuration policy can still be applied to it manually, but it
will be ignored by continuous enforcement.
*myala.pyrocufflink.jazz* no longer hosts any public-facing websites,
and is in fact shut down. To prevent HAproxy from failing to start
because it cannot resolve the name, this backend needs to be removed.
Usually, the *samba* role is deployed as a dependency of the *winbind*
role, which explicitly sets `samba_security` to `ads`. The new
*fileserver* role also depends on the *samba* role, but it does NOT sett
that variable. This can cause `smb.conf` to be rewritten with a
different value whenever one or the other role is applied.
Explicitly setting the `samba_security` variable at the group level
ensures that the value is consistent no matter how the *samba* role is
applied. Since all domain member machines need the same value,
regardless of what function they perform, this is safe.
The *fileserver* role configures Samba as a file sharing server. It uses
the *samba* role to handle cross-distribution installation of Samba
itself, and is focused primarily on configuring shared folders.
This commit adds an *after* ordering dependency on the network device
unit to the *wait-global-address@.service* template unit. Without this
dependency, the service will wait forever for a global address if the
device does not exist. With the dependency, though, if the device does
not appear within the default timeout, the wait service will never
start, causing all dependent services to fail, but allowing the boot
process to continue.
The *websites/darkchestofwonders.us* role prepares a machine to host
http://darkchestofwonders.us/. The website itself is published via rsync
by Jenkins.
The *websites/dustin.hatch.name* role configures a server to host
http://dustin.hatch.name/. The role only applies basic configuration;
the actual website application is published by Jenkins.
The Apache service needs to be reloaded after the *certbot* role
configures it to serve the `/.well-known/acme-challenge` path, so that
the changes can take effect before the `certbot` command is run to
request the certificate(s).
If another role that depends on the *apache* role accidentally creates
an invalid configuration, it will be impossible to correct it by
subsequent invocations of its playbook. This is because the *apache*
role always tries to start the service, which will fail if the
configuration is invalid, thus aborting the playbook. With this early
abort, there is no way for later tasks to correct the error.
Playbooks that include the *apache* role should have a task that is
executed after all the roles have been applied to ensure the service is
running.
The *dch-storage-net* role configures a machine to connect the storage
network and mount shared folders from the storage appliance.
The `wait-global-address.sh` script and corresponding
*wait-global-address@.service* systemd unit template are necessary to
ensure that the storage network is actually available before attempting
to mount the shared volumes. This is particularly important at boot,
since `dhcpcd` does not implement any kind of signaling that can be used
by *network-online.target*, so the network is considered "online" as
soon as the `dhcpcd` process has started. This typically results in
"network unreachable" errors.
The *net-ifaces* role manages a script that creates virtual network
interfaces, such as bridge, bond, and VLAN, that `dhcpcd`/`dhclient`
alone cannot. This provides a lightweight alternative to
*systemd-networkd* and *NetworkMangager*.
Though the default for the `fqdn` value is listed as `both` in
*dhcpcd.conf(5)*, the current behavior of `dhcpcd` suggests that it may
actually be `none`. Without explicitly setting `fqdn both`, the value of
the kernel node name is sent as-is in the *hostname* option (12). If the
node name is set to the FQDN, then dynamic DNS gets broken, since the
DHCP server always appends its domain name to the provided hostname.
Setting `fqdn both` causes `dhcpcd` to send the FQDN in the *FQDN*
option (81), which the DHCP server interprets correctly.
Using a list to specify the values for the `allowinterfaces` and
`denyinterfaces` parameters in `dhcpcd.conf` makes the configuration
policy cleaner and more type-safe.
Today I realized that `dhcpcd` has been logging several hundred thousand
of these messages every second:
libudev: received NULL device
This was causing both `dhcpcd` and `systemd-journald` to consume 100%
CPU.
I am not entirely sure what a "device management" module is in the
context of `dhcpcd`, but it does not seem to be required. Setting the
`nodev` option in `dhcpcd.conf` suppresses the messages, and seems to
have no effect on the operation of the daemon.
Traffic from the management network is not allowed except for specific
services. NTP is required of course, for time synchronization with the
pyrocufflink.blue domain controllers. RADIUS is necessary for WiFi
authentication, which is also handled by the DCs.
The UniFi controller has been moved to a Raspberry Pi on the Management
network. This machine needs a static address to use in the "inform URL"
it sends to managed devices.
The Management network (VLAN 10, 172.30.0.240/28) will be used for
communication with and configuration of network devices including
switches and access points. This keeps configuration separate from
normal traffic, and allows complete isolation of infrastructure devices.
The `ifconfig` global directive specifies the IP address added to the
tunnel interface device, not the network. The `push route` directives
need to include this address to correctly send route information to
clients.