70 Commits (c51589adffd0ac50be0ed70fcee8a3358762257f)

Author SHA1 Message Date
Dustin c51589adff gw1: Scrape BIND DNS server logs
The BIND server on the firewall is configured to write query logs and
RPZ rewrite logs to files under `/var/log/named`.  We can scrape these
logs with Promtail and use the messages for analytics on the DNS-based
firewall, etc.
2024-02-28 19:06:23 -06:00
Dustin b96164ce11 gw1: Allow rpm.grafana.com via proxy
In order to install Promtail on machines (e.g. *unifi1*) that do not
have direct access to the Internet.
2024-02-22 20:40:51 -06:00
Dustin 39400f3b2f hosts: Remove vars for zbx0.p.b
This machine is long dead.
2024-02-22 10:23:19 -06:00
Dustin 1bff9b2649 gw1: Enable pam_ssh_agent_auth for sudo
This machine is _not_ a member of the _pyrocufflink.blue_ AD domain, so
it does not inherit the settings from that group.  Also, Jenkins does
not manage it, so only my personal keys are authorized.
2024-01-28 12:16:35 -06:00
Dustin be63424fd8 hosts: Deploy Squid on gw1
Running Squid on the firewall makes sense; it's a sort of layer-7
firewall, after all.  There's not much storage on that machine, though,
so we don't really want to cache anything.  In fact, its only purpose
is to allow very limited web access for certain applications.  All
outbound traffic is blocked, with two exceptions:

* Fedora package repositories (for the UniFi controller server)
* Google Fonts (for Invoice Ninja)
2024-01-27 20:09:34 -06:00
Dustin 7b54bc4400 nut-monitor: Require both UPS to be online
Unfortunately, the automatic transfer switch does not seem to work
correctly.  When the standby source is a UPS running on battery, it does
*not* switch sources if the primary fails.  In other words, when the
power is out and both UPS are running on battery, when the first one
dies, it will NOT switch to the second one.  It has no trouble switching
when the second source is mains power, though, which is very strange.

I have tried messing with all the settings including nominal input
voltage, sensitivity, and frequency tolerance, but none seem to have any
effect.

Since it is more important for the machines to shut down safely than it
is to have an extra 10-15 minutes of runtime during an outage, the best
solution for now is to configure the hosts to shut down as soon as the
first UPS battery gets low.  This is largely a waste of the second UPS,
but at least it will help prevent data loss.
2024-01-25 21:22:04 -06:00
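The shutdown rule behind the commit above can be sketched as a small model. As far as NUT's documentation describes it, `upsmon` counts a UPS as critical once it is both on battery and reporting low battery, and it shuts the host down when the summed power value of the non-critical UPSes falls below `MINSUPPLIES`. The Python below only illustrates that rule; it is not the configuration from this repository. The `Ups` class, the `should_shut_down` function, and the UPS names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Ups:
    """Simplified view of one monitored UPS (all names here are hypothetical)."""
    name: str
    on_battery: bool
    low_battery: bool
    power_value: int = 1  # number of power supplies this UPS feeds on the host

    @property
    def critical(self) -> bool:
        # Simplification: real upsmon also treats forced shutdown and some
        # communication-loss cases as critical.
        return self.on_battery and self.low_battery


def should_shut_down(upses: Iterable[Ups], min_supplies: int) -> bool:
    """Return True when the usable power value drops below min_supplies."""
    available = sum(u.power_value for u in upses if not u.critical)
    return available < min_supplies


# Power is out, the first UPS is nearly empty, the second still has charge.
outage = [
    Ups("ups-a", on_battery=True, low_battery=True),
    Ups("ups-b", on_battery=True, low_battery=False),
]

# Requiring both supplies shuts down as soon as the first UPS goes low.
print(should_shut_down(outage, min_supplies=2))
# Requiring only one supply would keep running on the second UPS.
print(should_shut_down(outage, min_supplies=1))
```

With a threshold of two, the host stops relying on the transfer switch at all, which is the trade-off the commit message describes: less runtime in exchange for a clean shutdown.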
Dustin 764177daf3 vmhost0: Shut down when first UPS goes low battery
The automatic transfer switch does not seem to work reliably when both
UPS sources are running on battery.  This means all systems lose power
after the first UPS battery dies, even though the second UPS is still
online.  To minimize the risk of data loss, at least until I figure out
what's wrong, I want both VM hosts to shut down as soon as the first UPS
signals that its battery is low.
2024-01-22 08:46:32 -06:00
Dustin 423951bac1 {burp1, gw1}: Configure upsmon 2024-01-19 21:55:36 -06:00
Dustin d0b0f2ff38 hosts: gw1: Deploy BURP, collectd
Although *gw1* is not really managed by Ansible, it is much easier to
deploy collectd and BURP with the existing playbooks.
2024-01-19 20:52:48 -06:00
Dustin 525f2b2a04 nut-monitor: Configure upsmon
`upsmon` is the component of [NUT] that monitors (local or remote) UPS
devices and reacts to changes in their state.  Notably, it is
responsible for powering down the system when there is insufficient
power to the system.
2024-01-19 20:50:03 -06:00
Dustin 686817571e smtp-relay: Switch to Fastmail
AWS is going to begin charging extra for routable IPv4 addresses soon.
There's really no point in having a relay in the cloud anymore anyway,
since a) all outbound messages are sent via the local relay and b) no
messages are sent to anyone except me.
2023-10-24 17:27:21 -05:00
Dustin a3ea838cac burp-server: Deploy MinIO
We're going to run MinIO on the BURP server to provide a backup target
for the [Postgres Operator][0]/[WAL-E][1].  Although the Postgres
Operator also supports backups via [WAL-G][2], which supports more
backup targets like SFTP, the operator does not support restoring from
those targets.  As such, the best way to get fully-featured backups for
the Postgres Operator, including environment cloning, etc., is to use
S3.  Since I absolutely do not want to store my backups "in the cloud,"
using MinIO seems a decent alternative.  Running it on the BURP server
allows the backups to be stored and rotated along with regular system
backups.

[0]: https://github.com/zalando/postgres-operator/
[1]: https://github.com/wal-e/wal-e
[2]: https://github.com/wal-g/wal-g
2023-05-09 21:55:25 -05:00
Dustin 9921b2fd5e burp1.p.b: Set collectd SELinux domain permissive
Using the *md* plugin generates AVC denials like this:

	type=AVC msg=audit(1681259123.636:338441): avc:  denied  { read } for  pid=1438759 comm="collectd" name="md1" dev="devtmpfs" ino=646 scontext=system_u:system_r:collectd_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file permissive=0
2023-04-11 19:26:25 -05:00
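To read a denial like the one quoted in the commit above, it can help to split the audit record into fields and pull out the source domain, the target type, and the object class. The Python below is a rough sketch for this one record format; the `parse_avc` helper is hypothetical and is not a substitute for tools such as `ausearch` or `audit2allow`.

```python
import re

# The AVC record quoted in the commit message above.
AVC_LINE = (
    'type=AVC msg=audit(1681259123.636:338441): avc:  denied  { read } for  '
    'pid=1438759 comm="collectd" name="md1" dev="devtmpfs" ino=646 '
    'scontext=system_u:system_r:collectd_t:s0 '
    'tcontext=system_u:object_r:fixed_disk_device_t:s0 '
    'tclass=blk_file permissive=0'
)


def parse_avc(line: str) -> dict:
    """Split an AVC denial into key=value fields plus the denied permissions."""
    # key=value pairs; values are either double-quoted or run to the next space
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    # the permission list sits between braces after "denied"
    match = re.search(r"denied\s+\{([^}]*)\}", line)
    fields["perms"] = match.group(1).split() if match else []
    return fields


record = parse_avc(AVC_LINE)
source_domain = record["scontext"].split(":")[2]  # third part of user:role:type:level
target_type = record["tcontext"].split(":")[2]
print(source_domain, record["perms"], target_type, record["tclass"])
```

Here the source domain is `collectd_t`, the domain that this commit marks permissive.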
Dustin f16c2fae2f burp1.p.b: Enable md and thermal collectd plugins
The BURP storage volume is now backed by a Linux MD RAID array, so we
want to monitor its state.  Furthermore, since this machine is a
physical device, we should monitor its thermal characteristics as well.
2023-04-11 10:14:18 -05:00
Dustin 45148421b0 smtp1.p.b: Allow SMTP relay from Kubernetes network
Applications running on the Kubernetes cluster need to be able to send
e-mail via the relay.
2023-01-13 19:36:20 -06:00
Dustin 57702bb9c7 hosts: vmhost[01]: Update static DNS server address 2022-12-18 20:19:32 -06:00
Dustin e09e684fd8 hosts: Update mtrcs0 FQDN
I moved the metrics Pi from the red network to the blue network.  I
started to get uncomfortable with the firewall changes that were
required to host a service on the red network.  I think it makes the
most sense to define the red network as egress only.
2022-11-09 18:56:05 -06:00
Dustin 5a9b9a8d98 mtrcs0: Remove Ansible user/become settings
Jenkins still connects as *jenkins* and uses `sudo`, so we can't
hard-code the user to *root*.
2022-08-12 13:22:47 -05:00
Dustin 7ac5493b63 smtp1.p.b: Allow SMTP relay from pyrocufflink.red
AlertManager running on *mtrcs0.pyrocufflink.red* needs to be able to
send e-mail through the SMTP relay.
2022-08-11 21:43:48 -05:00
Dustin 4ddbc9f256 hosts: Add mtrcs0.p.r
*mtrcs0.pyrocufflink.red* is a Raspberry Pi CM4 on a Waveshare
CM4-IO-BASE-B carrier board with an NVMe SSD.  It runs a custom OS built
using Buildroot, and is not a member of the *pyrocufflink.blue* AD
domain.

*mtrcs0.p.r* hosts Victoria Metrics/`vmagent`, `vmalert`, AlertManager,
and Grafana.  I've created a unique group and playbook for it,
*metricspi*, to manage all these applications together.
2022-08-11 21:40:19 -05:00
Dustin c9dbaa32b9 collectd: Control SELinux domain permissiveness
It seems with each new release of Fedora, some feature or other of
*collectd* gets broken.  In Fedora 36, the *interfaces* plugin does not
seem to work reliably, and the *md* plugin logs a *lot* of errors.
While these issues are investigated upstream, we either need to manage
our own policy for collectd or mark the `collectd_t` domain permissive.
I chose the latter because I'm lazy and I don't consider collectd to be
that big of a threat to security.
2022-07-24 10:35:32 -05:00
Dustin 797cc2092f hosts: Add nvr1.p.b
*nvr1.pyrocufflink.blue* is the new video recording server.  It is a
1U rack-mounted physical machine based on the [Jetway
JBC150F596-3160-B][0] barebone system.  It replaces
*nvr0.pyrocufflink.blue* in this role.

[0]: https://www.jetwaycomputer.com/JBC150F596.html
2022-07-23 17:52:26 -05:00
Dustin 87e24aba3f hosts: hass2.p.b: Enable collectd thermal plugin
This plugin reads Raspberry Pi SoC temperature data.
2022-07-21 12:37:16 -05:00
Dustin 3f99708c48 cloud0: burp backup paths
Nextcloud data are no longer stored at `/var/www/html` since switching
to the Fedora-packaged distribution.
2021-12-17 20:22:42 -06:00
Dustin 6c705f54af hosts: vmhost1: Switch to systemd-networkd
Using *systemd-networkd* to configure network interfaces on *vmhost0* is
working really well.  It is decidedly more stable than *dhcpcd* was, and
certainly easier to work with than NetworkManager.  Let's go ahead and
switch *vmhost1* as well.
2021-10-31 01:12:25 -05:00
Dustin 881c8de625 Switch Prometheus/collectd to pull
Transitioning from push-based to pull-based monitoring with
Prometheus/collectd.  The *write_prometheus* plugin will be installed on
all hosts, and Prometheus will be configured to scrape them directly.
2021-10-30 16:41:17 -05:00
Dustin d8919f6424 hosts: dns0: Allow DDNS updates from gw1
Since the firewall is now the DHCP server, the DNS server needs to allow
it to send DDNS updates for *pyrocufflink.red*.
2021-10-17 14:12:19 -05:00
Dustin 3f49175c1d host: vmhost0: Set host-specific network config
*vmhost0.pyrocufflink.blue* no longer uses `dhcpcd` for network
configuration, but *systemd-networkd*.

The host-specific network settings for a VM host include the
configuration for the management interface, as well as the configuration
of the physical ports that make up the bonded interfaces.
2021-10-10 16:09:15 -05:00
Dustin b7ba6a59ab hosts: Add nvr0.p.b
*nvr0.pyrocufflink.blue* hosts Frigate.  It is deployed on a separate
subnet, for two reasons:

* To avoid streaming video from the cameras through the firewall
* To prevent any hosts on the LAN except Home Assistant from
  communicating with Frigate, since it does not have any kind of
  authentication or access control
2021-08-21 17:20:19 -05:00
Dustin bbfb66b49f Merge branch 'collectd-vmhost' 2021-07-24 18:39:06 -05:00
Dustin 207c9d6428 hosts: vmhost{0,1}: Configure collectd server
The VM hosts have multiple network interfaces with IPv6 addresses, so
collectd may not always choose the correct one to send metrics.  Thus we
have to explicitly tell it to use the management interface, to avoid it
sending data on the SAN interface.
2021-07-24 18:37:18 -05:00
Dustin 3998b08b10 homeassistant: Apply hass-dhcp role
Apply the *hass-dhcp* role to the Home Assistant server, making it the
authoritative DHCP and DNS server for the home automation network.
2021-07-24 18:34:50 -05:00
Dustin b826d8355e hosts: Add hass2.p.b
*hass2.pyrocufflink.blue* is a Raspberry Pi Compute Module 4-based
system, currently mounted in a Waveshare CM4 Mini Base Board (A).  With
an NVMe SSD for primary storage, it runs significantly faster than a
standard Raspberry Pi 4, and blows the old Raspberry Pi 3-based Home
Assistant deployment out of the water. It has a Zooz 700 series Z-Wave
Plus S2 USB stick and a ConBee II Zigbee USB stick attached to its USB
2.0 ports.  It runs a customized Fedora Minimal distribution.
2021-07-19 15:58:58 -05:00
Dustin 71f55ddfdf hosts: hass1: Set collectd network interface
Because *hass1.pyrocufflink.blue* has multiple interfaces, collectd does
not know which interface it should use to send multicast metrics
messages.  To force it to use the wired interface, which is connected to
the default internal ("blue") network, the `interface` property needs to
be set.
2020-12-23 20:57:01 -06:00
Dustin 84313601ef roles/named: Implement response policy zones
BIND response policy zones (RPZ) support provides a mechanism for
overriding the responses to DNS queries based on a wide range of
criteria.  In the simplest form, a response policy zone can be used to
provide different responses to different clients, or "block" some DNS
names.

For the Pyrocufflink and related networks, I plan to use an RPZ to
implement ad/tracker blocking.  The goal will be to generate an RPZ
definition from a collection of host lists (e.g. those used by uBlock
Origin) periodically.

This commit introduces basic support for RPZ configuration in the
*named* role.  It can be activated by providing a list of "response
policy" definitions (e.g. `zone "name"`) in the `named_response_policy`
variable, and defining the corresponding zones in `named_zones`.
2020-09-06 10:40:01 -05:00
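The plan in the commit above, generating an RPZ definition from host lists, could look roughly like the Python sketch below. It assumes the lists use hosts-file syntax (`<address> <name>`) and relies on the RPZ convention that a `CNAME .` record makes the resolver answer NXDOMAIN for that name. The `hosts_to_rpz` function and the sample names are hypothetical, and this is not the generator used in this repository. The output contains only the policy records; a real zone file also needs its `$TTL`, SOA, and NS records.

```python
def hosts_to_rpz(hosts_text: str) -> list[str]:
    """Convert hosts-file style text into RPZ records that return NXDOMAIN."""
    ignored = {"localhost", "localhost.localdomain", "broadcasthost"}
    names = set()
    for raw in hosts_text.splitlines():
        # drop comments (full-line or trailing) and surrounding whitespace
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        # "<address> <name> [<alias>...]"; a line holding a single token is
        # treated as a bare name
        candidates = parts[1:] if len(parts) > 1 else parts
        for name in candidates:
            name = name.lower().rstrip(".")
            if name in ignored or "." not in name:
                continue
            names.add(name)
    # Owner names are relative to the RPZ zone origin, so no trailing dot.
    # "CNAME ." is the RPZ action for NXDOMAIN; this blocks the exact name
    # only, not its subdomains.
    return [f"{name} CNAME ." for name in sorted(names)]


sample = """\
# example block list (made-up names)
0.0.0.0 ads.example.com
127.0.0.1 localhost
0.0.0.0 Tracker.Example.net  # trailing comment
"""

for record in hosts_to_rpz(sample):
    print(record)
```

Deduplicating and sorting the names keeps the generated zone stable between runs, so a periodic job would only need to reload the zone when the output actually changes.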
Dustin 44404950c1 Merge branch 'graylog' into master 2020-08-31 20:17:12 -05:00
Dustin 40c8df1b13 hosts: cloud0: Configure backups with BURP
Back up `/var/www/html`.
2020-08-29 14:22:17 -05:00
Dustin da3eb1aaf0 hosts: hass1: Configure backups with BURP
Back up `/var/lib/homeassistant`.
2020-08-29 14:22:17 -05:00
Dustin 9ef88da95f hosts: hassdb0: Add missing vars file 2020-08-29 14:01:50 -05:00
Dustin 4c661478b2 hosts: bw0: Use Lego cert 2020-03-17 08:45:34 -05:00
Dustin cd1cf38774 hosts: git0: Switch to Lego wildcard cert 2020-02-22 16:43:46 -06:00
Dustin e25b9a2e8e hosts: Add logs0.p.b
*logs0.pyrocufflink.blue* hosts Graylog.
2019-10-28 18:47:09 -05:00
Dustin fab662bd53 hosts: hass0: Add untracked host_vars file 2019-09-19 19:50:35 -05:00
Dustin b2cc467581 hosts: Add build0-amd64
*build0-amd64.securepassage.com* is a Jenkins agent that runs Docker,
allowing pipeline jobs to run inside containers.
2019-09-19 19:50:35 -05:00
Dustin e3e30eea1c hosts: dns0: Update DHCP server address
Now that the DHCP server has moved from *dns1* to *dns0*, the DNS server
needs to be updated to allow DDNS updates from the latter.
2019-09-19 19:27:30 -05:00
Dustin 9306252e75 hosts: Add bw0.p.b
*bw0.pyrocufflink.blue* runs Bitwarden_rs via Docker.
2019-09-19 19:27:30 -05:00
Dustin f002da86ef dns0: Update DHCP server IP address
DHCP is provided by *dns1.pyrocufflink.blue* now, not the gateway. To
allow dynamic DNS updates from it, the correct source address must be
listed in the zone configuration for *pyrocufflink.red*.
2019-02-19 13:20:19 -06:00
Dustin 284e3817e0 jenkins0: Bind Samba to real interface only
Because *jenkins0.pyrocufflink.blue* runs Docker, it has an extra
virtual interface and IP address, for container communication. By
default, Samba registers all IP addresses in DNS, and cannot
differentiate between the actual interface and the Docker bridge. This
can cause other hosts to attempt to contact *jenkins0.pyrocufflink.blue*
using the wrong address.

The `samba_interfaces` variable controls the value of the `interfaces`
global configuration option for Samba. One of the things this option
controls is which addresses to register in DNS. By setting it to the
network address of the *pyrocufflink.blue* network, we can prevent the
virtual address from being used at all.
2019-01-06 12:24:52 -06:00
Dustin 1745f268de smtp1: Allow relay from Management network 2018-10-13 11:50:31 -05:00
Dustin 07a23267c6 hosts: Add dns1.pyrocufflink.blue
To avoid having a single point of failure, a second recursive DNS server
is necessary. This will be useful in cases where the VM hosts must both
be taken offline, but Internet access is still required.

The new server, *dns1.pyrocufflink.blue*, has all the same zones defined
as the original. It forwards the *pyrocufflink.blue* zone and
corresponding reverse zones to the domain controllers, and acts as a
slave for the *pyrocufflink.red* zone.
2018-08-12 17:24:37 -05:00