Setup Keepalived

Introduction

Now that you’ve set up a Nomad cluster and deployed your first set of containers, it’s time to start thinking about the next component of resilient infrastructure: Virtual IP addresses (referred to as VIPs). Sometimes also called ‘floating IP addresses’, VIPs allow applications and services that are not DNS-aware to point to an IP address and know that the IP will always connect to a specified service. If that service moves from one cluster node to another, the VIP moves with it. On Linux, the service that provides this is ‘keepalived’. It might not seem necessary at first, but you will quickly see its value.

You may see references to the term ‘vrrp’, which refers to the Virtual Router Redundancy Protocol. VRRP is a computer networking protocol that provides for automatic assignment of IP addresses to participating hosts. This is the protocol that ‘keepalived’ leverages.

Environment

For the sake of this guide, assume there are three cluster nodes with addressing such as:

node1 - 10.0.3.111
node2 - 10.0.3.112
node3 - 10.0.3.113

Performing actions as root

The ideal security model dictates that interactively operating as root is incorrect, and that operations should run as a user, using ‘sudo’ to elevate permissions where necessary. Unfortunately, almost everything that needs to be done here will require ‘sudo’, so it will be faster to just become root and run everything as root:

sudo su -

Perform the following steps on each node until instructed otherwise

Installing ‘keepalived’

Install ‘keepalived’:

apt install keepalived -y

Perform the following steps on one node until instructed otherwise

Managing ‘keepalived’

Instead of manually managing the ‘keepalived’ configuration on each node, a slightly clever solution will be set up to make management easy and centralized. This is probably not the most elegant solution, but it works and it will save time. If you come up with a better solution, please share it.

Create the required directories:

mkdir -p /mnt/vagabond/services/nomad-vrrp
mkdir /mnt/vagabond/services/nomad-vrrp/nodes.d
mkdir /mnt/vagabond/services/nomad-vrrp/keepalived.d
cd /mnt/vagabond/services/nomad-vrrp

Create the systemd service that will be installed on each node to monitor for updates to the ‘keepalived’ configuration:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/nomad-vrrp.service

Create the script that will be run by the service to monitor for updates:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/nomad-vrrp.sh
chmod u+x /mnt/vagabond/services/nomad-vrrp/nomad-vrrp.sh
nano /mnt/vagabond/services/nomad-vrrp/nomad-vrrp.sh

Make sure to change the INTERFACE variable to match the name of the physical network interface on your cluster node. You can determine your physical network interface names with this command:

find /sys/class/net -type l -not -lname '*virtual*' -printf '%f\n'

Create the update script, which will be run by the service to pull the ‘keepalived’ configuration stored centrally on the Gluster filesystem:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/update-vrrp.sh
chmod u+x /mnt/vagabond/services/nomad-vrrp/update-vrrp.sh

Create the installer script:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/install.sh
chmod u+x /mnt/vagabond/services/nomad-vrrp/install.sh

Create the deploy script:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/deploy.sh
chmod u+x /mnt/vagabond/services/nomad-vrrp/deploy.sh
nano /mnt/vagabond/services/nomad-vrrp/deploy.sh

Make sure to change the node names and IP addresses in the arrNodes array to match the nodes in your cluster.

Create the base ‘keepalived’ configuration:

cd /mnt/vagabond/services/nomad-vrrp/keepalived.d
wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/keepalived.d/keepalived.conf

Perform the following steps on each node until instructed otherwise

cd /mnt/vagabond/services/nomad-vrrp
./install.sh

At this point, the ‘nomad-vrrp’ service should be installed on each node. The service monitors the ‘/mnt/vagabond/services/nomad-vrrp/nodes.d’ directory for the existence of a uniquely named file (sometimes referred to as a flag file). The file name is the hash of a combination of the hostname, IP address and a salt (a static string of text) for that cluster node. When the file is created (by running ‘deploy.sh’) and detected by the service, the ‘update-vrrp.sh’ script is executed. It updates the forwarding rules, updates the ‘keepalived’ configuration on the node and restarts the ‘keepalived’ service to load the new configuration.
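
The flag-file handshake can be sketched roughly as follows. This is not the code from the repository: the hash algorithm (SHA-256 here), the way hostname, IP address and salt are joined, the removal of the flag once it has been seen, and the function names ‘flag_name’, ‘raise_flag’ and ‘consume_flag’ are all assumptions made for illustration.

```shell
# Illustrative sketch only; the real nomad-vrrp.sh and deploy.sh may differ.

# Derive a flag-file name from hostname, IP address and a shared salt.
# Assumption: SHA-256 over "hostname|ip|salt".
flag_name() {
    local host="$1" ip="$2" salt="$3"
    printf '%s|%s|%s' "$host" "$ip" "$salt" | sha256sum | cut -d' ' -f1
}

# Deploy side: create the flag file for one node in the shared directory.
raise_flag() {
    local dir="$1" host="$2" ip="$3" salt="$4"
    touch "$dir/$(flag_name "$host" "$ip" "$salt")"
}

# Node side: succeed (and remove the flag) only if this node's flag exists.
consume_flag() {
    local dir="$1" host="$2" ip="$3" salt="$4"
    local flag
    flag="$dir/$(flag_name "$host" "$ip" "$salt")"
    [ -f "$flag" ] && rm -f "$flag"
}
```

Under these assumptions, a node's service loop would call ‘consume_flag’ every few seconds and run the update script whenever it succeeds. Because each node looks only for its own hash, one shared directory can carry flags for all nodes without them interfering with each other.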

Perform the following steps on one node until instructed otherwise

Create the flag files to trigger the updates on all nodes:

cd /mnt/vagabond/services/nomad-vrrp
./deploy.sh

Within ten seconds, each node will update.

Deploy a VIP for Adguard

You may ask yourself: “Why does Adguard need a VIP?” The answer is pretty simple: You may want to either configure your DHCP server to use your Adguard service for internal DNS or set up a query forwarding rule on your router that sends specific domains to your internal DNS. Either way, how can you ensure you always direct traffic to the IP address of the node hosting Adguard? Creating a VIP for Adguard means you can configure either situation with the VIP, knowing that the VIP address will always be mapped to the node with the Adguard service.

Create the Adguard ‘keepalived’ configuration:

cd /mnt/vagabond/services/nomad-vrrp/keepalived.d
wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/keepalived.d/adguard.vrrp

In this example, the VIP being configured is 10.0.3.201. Replace this with your own chosen VIP.

nano adguard.vrrp

Make sure to change the interface value to match the name of the physical network interface on your cluster node. Also make sure you change the virtual_ipaddress value to match what you want to use for a VIP for Adguard. The virtual_router_id value must also be unique for each VIP.
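
For orientation, a ‘keepalived’ VRRP instance that sets these three values generally has the shape printed by the sketch below. It is not the content of ‘adguard.vrrp’ from the repository: the function name ‘render_vrrp_instance’, the instance name ‘VI_ADGUARD’, the interface ‘eth0’, the router ID 51, the priority and the /24 prefix are all placeholders I chose for illustration. A real file would typically also reference the check script.

```shell
# Print a minimal vrrp_instance stanza (illustrative; not the repo's adguard.vrrp).
# Arguments: instance name, interface, virtual_router_id, priority, VIP in CIDR form.
render_vrrp_instance() {
    local name="$1" iface="$2" vrid="$3" prio="$4" vip="$5"
    # VRRP router IDs fit in one byte, so accept only whole numbers from 1 to 255.
    case "$vrid" in
        ''|*[!0-9]*) echo "virtual_router_id must be a number" >&2; return 1 ;;
    esac
    if [ "$vrid" -lt 1 ] || [ "$vrid" -gt 255 ]; then
        echo "virtual_router_id must be between 1 and 255" >&2
        return 1
    fi
    cat <<EOF
vrrp_instance $name {
    state BACKUP
    interface $iface
    virtual_router_id $vrid
    priority $prio
    advert_int 1
    virtual_ipaddress {
        $vip
    }
}
EOF
}

# Example values only: the VIP is the one used in this guide, the rest are placeholders.
render_vrrp_instance VI_ADGUARD eth0 51 100 10.0.3.201/24
```

Since every VIP needs its own virtual_router_id, a second VIP would use a different ID (for example 52) with the same interface.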

Create the Adguard ‘keepalived’ script:

wget https://raw.githubusercontent.com/digital-dann/nomad-cluster/main/nomad-vrrp/keepalived.d/adguard-check.sh
nano adguard-check.sh

Make sure to change the INTERFACE variable to match the name of the physical network interface on your cluster node. You can determine your physical network interface names with this command:

find /sys/class/net -type l -not -lname '*virtual*' -printf '%f\n'
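
If you want a feel for what a ‘keepalived’ check script does, here is a minimal sketch. It is not the logic of ‘adguard-check.sh’ from the repository, which I am not reproducing here. It only illustrates the general contract: ‘keepalived’ runs the script periodically and treats exit status 0 as healthy and anything else as failed. The function name ‘has_listener’ and the idea of testing for a non-loopback listener on port 53 are my own assumptions.

```shell
# Succeed if `ss -H -lun` output (read from stdin) shows a socket bound to
# the given port on a non-loopback address.
has_listener() {
    local port="$1"
    awk -v p=":$port" '
        {
            for (i = 1; i <= NF; i++) {
                f = $i
                # Skip loopback listeners such as the systemd-resolved stub on 127.0.0.53.
                if (f ~ /^127\./ || f ~ /^\[::1\]/) continue
                # Match only fields that end in ":<port>".
                if (substr(f, length(f) - length(p) + 1) == p) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    '
}

# Typical use on a cluster node:
#   ss -H -lun | has_listener 53
```

The exit status of the last command becomes the status of the script, so a wrapper that ends with the ‘has_listener’ pipeline reports healthy only on a node that is actually serving DNS.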

Create the flag files to trigger the updates on all nodes:

cd /mnt/vagabond/services/nomad-vrrp
./deploy.sh

Within ten seconds, each node will update. You can verify by checking connectivity to the Adguard service with this command:

dig adguard.home.digitaldann.net @10.0.3.201
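
To see which node currently holds the VIP, you can look for it in the address list on each node. The helper below is a small sketch; the function name ‘holds_vip’ is made up for this example, and 10.0.3.201 is just the example VIP from this guide.

```shell
# Succeed if the given VIP appears in `ip -o -4 addr show` output read from stdin.
holds_vip() {
    local vip="$1"
    # -F matches the text literally, so the dots in the address are not wildcards;
    # the trailing slash stops 10.0.3.20 from matching 10.0.3.201.
    grep -qF "inet $vip/"
}

# Typical use on a cluster node:
#   ip -o -4 addr show | holds_vip 10.0.3.201 && echo "VIP is on $(hostname)"
```

Only one node should report the VIP at any given time.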

How do I utilize this new DNS service?

There are two ways that I would suggest using this new highly available DNS and DNS VIP:

  1. Route all DNS traffic (53/udp) to the Adguard VIP on port 53/udp. Block all outbound DNS traffic at the router with a destination of port 53/udp. Adguard will use DNS over HTTPS (DoH) to resolve all external domain queries and use DNS Rewrite rules with Consul DNS to resolve all internal domain queries. Set DHCP to use the Adguard VIP for DNS and ensure all static network configurations use the Adguard VIP for DNS. If possible, create a redirect rule at the router for outbound 53/udp traffic and send it to the Adguard VIP. This is my preferred solution.
  2. Configure the router to filter DNS queries and redirect all queries for internal domains to the Adguard VIP. All other DNS traffic will go out to the DNS provider configured on the router. All outbound DNS traffic not sourced from the router should be blocked. This gives you internal DNS resolution, but does not provide ad blocking.

Summary

What just happened?

  • A systemd service was installed on each node to monitor for the existence of a flag file on the Gluster filesystem
    • ‘nomad-vrrp.service’ defines the service
    • ‘nomad-vrrp.sh’ is the script run by the service
  • A script was created to generate the flag files (in the /nodes.d sub-directory) for all nodes to trigger the service
    • ‘deploy.sh’ is the script that creates the flag files for all nodes defined in the script
  • When the flag file is detected by the service running on a node, an update script is run to deploy the ‘keepalived’ configuration
    • The ‘keepalived’ configs are copied from the /keepalived.d sub-directory
    • The ‘keepalived’ service is restarted to reload the configuration
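
The last step in the list above (copy the configs, then restart the service) can be sketched like this. It is only an approximation of what ‘update-vrrp.sh’ does: the function name ‘sync_keepalived’, the flat directory layout and ‘/etc/keepalived’ as the destination are my assumptions, and the forwarding-rule handling is left out entirely.

```shell
# Copy every file from the shared config directory into the local keepalived
# directory, then run the supplied reload command.
sync_keepalived() {
    local src="$1" dest="$2"
    shift 2
    mkdir -p "$dest" || return 1
    cp "$src"/* "$dest"/ || return 1
    # Remaining arguments are the reload command; its exit status is returned.
    "$@"
}

# Example (run as root on a node; destination path is an assumption):
#   sync_keepalived /mnt/vagabond/services/nomad-vrrp/keepalived.d /etc/keepalived \
#       systemctl restart keepalived
```

Passing the reload command as arguments keeps the copy step easy to try out against a scratch directory before pointing it at the live configuration.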

What are the steps to update VIPs or forwarding rules?

  • Add or update a ‘keepalived’ configuration (vrrp+script) in the /keepalived.d sub-directory. Examples can be found in the GitHub repo.
  • Run ‘./deploy.sh’