Initializing a k3s cluster with Ansible
In a previous post, I described how I used Terraform and Ansible to provision Debian VMs on Proxmox. In my example, the Terraform project provisioned 3 master nodes and 3 worker nodes. In this post, I will share the playbook for installing and initializing a k3s cluster using a floating IP with keepalived to ensure the Kubernetes API can be reached if any of the master nodes are unavailable.
Directory Structure
In the Ansible repository, I have a k3s_cluster role which is self-contained for everything related to the k3s cluster. That includes the cluster itself as well as all of the applications as sub-roles.
- roles/k3s_cluster/k3s/master initializes the master nodes
- roles/k3s_cluster/k3s/node initializes the worker nodes
- roles/k3s_cluster/download downloads the k3s installer
- roles/k3s_cluster/prereq takes care of any pre-install configuration tasks
- inventory/group_vars/k3s_cluster contains variables which apply to the entire cluster
There are many other sub-roles, one for each application deployed onto the cluster. I will cover each one in future posts.
Pre-requisites Role
The prereq role can be expanded to include tasks specific to the Linux distribution in use. I've only included the Debian tasks because that is what I used.
The cluster uses Longhorn for storage and I chose to create a separate virtual disk, mounted as its own filesystem, for the Longhorn data. The role initializes and mounts this filesystem.
roles/k3s_cluster/prereq/tasks/main.yml:
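(The original task file isn't reproduced here; a minimal sketch of what it might contain, assuming the distribution-specific tasks live in files named after `ansible_distribution`:)

```yaml
---
# Dispatch to distribution-specific setup (Debian.yml on Debian hosts)
- name: Run distribution-specific tasks
  ansible.builtin.include_tasks: "{{ ansible_distribution }}.yml"

# Prepare the dedicated Longhorn data disk
- name: Configure cluster storage
  ansible.builtin.include_tasks: storage.yml
```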
roles/k3s_cluster/prereq/tasks/Debian.yml:
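(Again a hedged sketch rather than the original file; the package list is an assumption based on Longhorn's documented prerequisites, which include open-iscsi and an NFS client:)

```yaml
---
# Packages Longhorn and k3s commonly need on Debian (assumed set)
- name: Install prerequisite packages
  ansible.builtin.apt:
    name:
      - open-iscsi
      - nfs-common
      - curl
    state: present
    update_cache: true

# Longhorn attaches volumes over iSCSI, so iscsid must be running
- name: Ensure iscsid is running
  ansible.builtin.service:
    name: iscsid
    state: started
    enabled: true
```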
The storage tasks configure the data disk that was provisioned with Terraform, mount it, and prepare it to be used by Longhorn.
roles/k3s_cluster/prereq/tasks/storage.yml:
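(A sketch of what the storage tasks might look like; the device name `/dev/sdb`, the filesystem type, and the `/var/lib/longhorn` mount point are assumptions, the last being Longhorn's default data path:)

```yaml
---
# Assumes the Terraform-provisioned data disk appears as /dev/sdb
- name: Create a filesystem on the Longhorn data disk
  community.general.filesystem:
    fstype: ext4
    dev: /dev/sdb

# Mount it where Longhorn stores volume data by default
- name: Mount the Longhorn data filesystem
  ansible.posix.mount:
    path: /var/lib/longhorn
    src: /dev/sdb
    fstype: ext4
    state: mounted
```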
roles/k3s_cluster/download/tasks/main.yml:
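(The download tasks aren't shown; since the unit files below invoke /usr/local/bin/k3s directly, the role presumably fetches the k3s binary from the GitHub releases page. A sketch, assuming a `k3s_version` variable defined in group_vars:)

```yaml
---
# Fetch a pinned k3s release binary; k3s_version is an assumed variable
- name: Download the k3s binary
  ansible.builtin.get_url:
    url: "https://github.com/k3s-io/k3s/releases/download/{{ k3s_version }}/k3s"
    dest: /usr/local/bin/k3s
    mode: "0755"
```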
inventory/group_vars/k3s_cluster:
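(The group_vars file isn't reproduced; the variable names below are inferred from the templates later in this post, and every value is a placeholder:)

```yaml
---
# Placeholder values -- adjust for your environment
k3s_version: v1.28.5+k3s1    # assumed: consumed by the download role
k3s_token: "change-me"       # shared join token for all nodes
master_ip: 192.168.1.51      # first master; follower masters join through it
master_vip: 192.168.1.50     # keepalived floating IP for the Kubernetes API
extra_server_args: ""
extra_agent_args: ""
```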
k3s Master Role
The k3s master role differs from the worker node role in that it configures keepalived with a floating IP, providing high availability for the Kubernetes API backed by the embedded etcd. On its first start, the k3s service on the first node must initialize the cluster; any additional master nodes must start, for the first time, pointing at that first node. After the first start, the service is reconfigured to start without the --cluster-init parameter. All master nodes receive updates to the cluster configuration, and whichever node keepalived elects to hold the floating IP handles Kubernetes API calls.
In order to differentiate between a first-time run of the role and subsequent runs, the k3s-init-cluster.yml playbook uses set_fact to set k3s_bootstrap_cluster to true.
roles/k3s_cluster/k3s/master/tasks/main.yml:
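(The task file itself isn't shown. Based on the templates and the bootstrap flow described above, a sketch might look like the following; the `master` inventory group name and the template destinations are assumptions:)

```yaml
---
- name: Install keepalived
  ansible.builtin.apt:
    name: keepalived
    state: present

- name: Deploy the keepalived configuration
  ansible.builtin.template:
    src: keepalive-conf.j2
    dest: /etc/keepalived/keepalived.conf
  notify: Restart keepalived

- name: Deploy the floating IP interface definition
  ansible.builtin.template:
    src: k3s-vip.j2
    dest: /etc/network/interfaces.d/k3s-vip

- name: Deploy the service environment file
  ansible.builtin.template:
    src: k3s.service.env.j2
    dest: /etc/systemd/system/k3s.service.env

# On first run only: the first master gets --cluster-init ...
- name: Deploy the cluster-init unit on the first master
  ansible.builtin.template:
    src: k3s-bootstrap-first.service.j2
    dest: /etc/systemd/system/k3s.service
  when:
    - k3s_bootstrap_cluster | default(false)
    - inventory_hostname == groups['master'][0]

# ... and the remaining masters join through it
- name: Deploy the follower unit on the other masters
  ansible.builtin.template:
    src: k3s-bootstrap-followers.service.j2
    dest: /etc/systemd/system/k3s.service
  when:
    - k3s_bootstrap_cluster | default(false)
    - inventory_hostname != groups['master'][0]

- name: Start and enable k3s
  ansible.builtin.systemd:
    name: k3s
    state: started
    enabled: true
    daemon_reload: true

# After bootstrap, future boots use the plain unit without --cluster-init
- name: Reinstall the standard unit for subsequent boots
  ansible.builtin.template:
    src: k3s.service.j2
    dest: /etc/systemd/system/k3s.service
  when: k3s_bootstrap_cluster | default(false)
```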
roles/k3s_cluster/k3s/master/templates/k3s-bootstrap-first.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --cluster-init --token {{ k3s_token }} {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/k3s-bootstrap-followers.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server --server https://{{ master_ip }}:6443 --token {{ k3s_token }} {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/k3s-vip.j2:
auto eth0:1
iface eth0:1 inet static
address {{ master_vip }}/32
roles/k3s_cluster/k3s/master/templates/k3s.service.env.j2 is currently empty, but it is deployed so that additional environment variables can be set for the service later.
roles/k3s_cluster/k3s/master/templates/k3s.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server {{ extra_server_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
roles/k3s_cluster/k3s/master/templates/keepalive-conf.j2:
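(The keepalived template isn't shown. A minimal sketch of what it might contain; the interface name, virtual_router_id, priority, and the `keepalived_password` variable are all assumptions:)

```
vrrp_instance k3s_vip {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass {{ keepalived_password }}
    }
    virtual_ipaddress {
        {{ master_vip }}
    }
}
```

With all masters declared as BACKUP at the same priority, keepalived elects a holder for the VIP and fails it over automatically when that node goes down.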
roles/k3s_cluster/k3s/master/handlers/main.yml:
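(The handlers file isn't shown; at minimum it presumably restarts keepalived when its configuration changes, as sketched below:)

```yaml
---
- name: Restart keepalived
  ansible.builtin.service:
    name: keepalived
    state: restarted
```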
k3s Worker Node Role
The worker node role is similar to the master role but skips the extra initialization steps: it simply starts the k3s agent, pointing it at the floating IP of the cluster masters.
roles/k3s_cluster/k3s/node/tasks/main.yml:
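(A sketch of what the worker task file might look like, inferred from the unit template below; file destinations are assumptions:)

```yaml
---
- name: Deploy the k3s agent unit
  ansible.builtin.template:
    src: k3s-node.service.j2
    dest: /etc/systemd/system/k3s-node.service

# The unit references this environment file, even though it is empty today
- name: Deploy the service environment file
  ansible.builtin.template:
    src: k3s-node.service.env.j2
    dest: /etc/systemd/system/k3s.service.env

- name: Start and enable the k3s agent
  ansible.builtin.systemd:
    name: k3s-node
    state: started
    enabled: true
    daemon_reload: true
```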
roles/k3s_cluster/k3s/node/templates/k3s-node.service.j2:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s agent --server https://{{ master_vip }}:6443 --token {{ k3s_token }} {{ extra_agent_args | default("") }}
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
Just like the master role, roles/k3s_cluster/k3s/node/templates/k3s-node.service.env.j2 is currently empty.
Playbooks
There are two playbooks used with this role. The k3s-init-cluster.yml playbook is run once, right after the VMs are provisioned, to set up the initial cluster. The k3s-cluster.yml playbook is identical except that k3s_bootstrap_cluster is set to false; it can be run for any subsequent changes needed on the cluster nodes. Both playbooks create a .kube directory in the Ansible repository to allow command-line access (kubectl and helm) from the Ansible host.
k3s-init-cluster.yml:
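(The playbook isn't reproduced; a sketch of its likely shape, assuming `master` and `node` inventory group names and the role paths listed earlier:)

```yaml
---
- hosts: master
  become: true
  pre_tasks:
    # Marks this run as the one-time bootstrap, as described above
    - name: Flag the initial cluster bootstrap
      ansible.builtin.set_fact:
        k3s_bootstrap_cluster: true
  roles:
    - k3s_cluster/prereq
    - k3s_cluster/download
    - k3s_cluster/k3s/master

- hosts: node
  become: true
  roles:
    - k3s_cluster/prereq
    - k3s_cluster/download
    - k3s_cluster/k3s/node

- hosts: master
  become: true
  tasks:
    # Copy the kubeconfig into the repository's .kube directory
    - name: Fetch the kubeconfig for local kubectl/helm access
      run_once: true
      ansible.builtin.fetch:
        src: /etc/rancher/k3s/k3s.yaml
        dest: .kube/config
        flat: true
  roles:
    - k3s_cluster/metallb
    - k3s_cluster/longhorn
    - k3s_cluster/traefik
```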
Next steps
The last task in the playbook executes the metallb, longhorn, and traefik roles. These roles deploy and/or configure the cluster load balancer, cluster storage, and cluster ingress controller. I will cover those in detail in my next post.