Infrastructure Deployment
Step-by-Step Infrastructure Setup
This guide walks through deploying the bare metal infrastructure, from Raspberry Pi OS installation to a running K3s cluster.
Prerequisites
Hardware Setup
- 5× Raspberry Pi CM5 blades (8GB RAM) installed in chassis
- 5× NVMe SSDs (M.2 2242) installed in blades
- MikroTik router connected to internet
- Network switch connected to MikroTik
- All blades connected to switch (or direct to MikroTik)
Management Workstation
- Ansible 2.9+ installed
- kubectl 1.28+ installed
- SSH client
- Network access to Homelab network (192.168.77.0/24)
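The version floors above can be checked with a small helper that compares dotted version strings using coreutils sort -V (a sketch; the `version_ge` function name is introduced here, not part of any tool):

```shell
# version_ge A B — succeeds when dotted version A is at least version B
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: gate a reported Ansible version against the 2.9 floor
if version_ge "2.16.3" "2.9"; then
  echo "Ansible version OK"
fi
```

On the workstation, feed it the real version string reported by `ansible --version` or `kubectl version --client`.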
Step 1: Raspberry Pi OS Bootstrap
1.1 Download Ubuntu Server 24.04 LTS ARM64
# Download official Ubuntu Server image
wget https://cdimage.ubuntu.com/releases/24.04/release/ubuntu-24.04-preinstalled-server-arm64+raspi.img.xz
# Verify checksum (optional)
wget https://cdimage.ubuntu.com/releases/24.04/release/SHA256SUMS
sha256sum -c SHA256SUMS --ignore-missing
1.2 Flash SD Card (for initial boot)
# Extract image
xz -d ubuntu-24.04-preinstalled-server-arm64+raspi.img.xz
# Flash to SD card (replace /dev/sdX with your SD card device)
sudo dd if=ubuntu-24.04-preinstalled-server-arm64+raspi.img of=/dev/sdX bs=4M status=progress conv=fsync
# Mount and configure cloud-init (optional: set hostname, SSH keys)
# Mount the writable partition
sudo mount /dev/sdX2 /mnt
sudo nano /mnt/etc/cloud/cloud.cfg.d/99-custom.cfg
Custom cloud-init config (99-custom.cfg):
#cloud-config
hostname: blade001
fqdn: blade001.zengarden.space
manage_etc_hosts: true
users:
  - name: ubuntu
    ssh_authorized_keys:
      - ssh-rsa AAAAB3... your-public-key
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash

# Disable password authentication
ssh_pwauth: false
1.3 First Boot and Initial Setup
Repeat for each blade (blade001-blade005):
- Insert SD card into CM5
- Power on blade
- Wait ~2-3 minutes for boot
- SSH into blade using default credentials or SSH key
# Default credentials (if no cloud-init):
# username: ubuntu
# password: ubuntu (change on first login)
ssh ubuntu@<blade-ip>
# Change hostname (if not set via cloud-init)
sudo hostnamectl set-hostname blade001
# Update system
sudo apt update && sudo apt upgrade -y
# Install required packages
sudo apt install -y python3 python3-pip curl wget vim
# Set static IP (edit netplan)
sudo nano /etc/netplan/50-cloud-init.yaml
Netplan configuration (50-cloud-init.yaml):
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 192.168.77.170/24   # blade001
      routes:
        - to: default
          via: 192.168.77.1   # the gateway4 key is deprecated in current netplan
      nameservers:
        addresses:
          - 192.168.77.1
          - 8.8.8.8
IP Address Assignment:
- blade001: 192.168.77.170
- blade002: 192.168.77.171
- blade003: 192.168.77.172
- blade004: 192.168.77.173
- blade005: 192.168.77.174
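Since the five blades differ only in hostname and address, the netplan files can be stamped out in one pass from the table above. A sketch (the local output filenames are an assumption; copy each file to the matching blade's /etc/netplan/50-cloud-init.yaml):

```shell
# Generate one netplan file per blade from the IP assignment table above
i=170
for host in blade001 blade002 blade003 blade004 blade005; do
  cat > "${host}-50-cloud-init.yaml" <<EOF
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 192.168.77.${i}/24   # ${host}
      routes:
        - to: default
          via: 192.168.77.1
      nameservers:
        addresses:
          - 192.168.77.1
          - 8.8.8.8
EOF
  i=$((i + 1))
done
```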
# Apply netplan changes
sudo netplan apply
# Reboot to ensure all changes take effect
sudo reboot
Step 2: MikroTik Router Configuration
2.1 Connect to MikroTik WebUI or CLI
# SSH to MikroTik
ssh admin@<router-ip>
# Default credentials: user admin with an empty password (or the password printed on the device label)
2.2 Configure Bridges
Create bridge interfaces for network isolation:
# Create bridges (if not already present)
/interface bridge
add name=bridge-home comment="Home Network - 192.168.88.0/24"
add name=bridge-homelab comment="Homelab K3s Cluster - 192.168.77.0/24"
# Add physical interfaces to bridges
/interface bridge port
add interface=ether2 bridge=bridge-homelab hw=yes comment="Zyxel switch → blade001-004"
add interface=ether3 bridge=bridge-homelab hw=yes comment="Direct to blade005"
add interface=ether4 bridge=bridge-home hw=yes
add interface=ether5 bridge=bridge-home hw=yes comment="2.5Gbps port"
# WiFi interfaces (if using MikroTik WiFi)
# /interface bridge port
# add interface=wlan1 bridge=bridge-home comment="2.4GHz WiFi"
# add interface=wlan2 bridge=bridge-home comment="5GHz WiFi"
Note: Hardware offload (hw=yes) is enabled on all bridge ports for performance.
2.3 Configure IP Addresses
/ip address
add address=192.168.77.1/24 interface=bridge-homelab comment="Homelab gateway"
add address=192.168.88.1/24 interface=bridge-home comment="Home gateway"
2.4 Configure DHCP for Homelab
# DHCP pool for K3s cluster blades
/ip pool
add name=pool-homelab ranges=192.168.77.170-192.168.77.199
# DHCP server for Homelab network
/ip dhcp-server
add name=dhcp-homelab interface=bridge-homelab address-pool=pool-homelab lease-time=30m
# DHCP network configuration
/ip dhcp-server network
add address=192.168.77.0/24 gateway=192.168.77.1 dns-server=192.168.77.1 domain=homelab.int.zengarden.space
# DHCP server for Home network (if needed)
/ip dhcp-server
add name=defconf interface=bridge-home lease-time=30m
/ip dhcp-server network
add address=192.168.88.0/24 gateway=192.168.88.1 dns-server=192.168.88.1 domain=zengarden.space
Expected DHCP leases (Homelab):
- blade001: 192.168.77.170
- blade002: 192.168.77.171
- blade003: 192.168.77.172
- blade004: 192.168.77.173
- blade005: 192.168.77.174
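If you want these leases pinned regardless of boot order, they can be converted to static reservations once each blade has registered (a sketch; replace the placeholder MAC addresses with the values shown by /ip dhcp-server lease print):

```
/ip dhcp-server lease
add address=192.168.77.170 mac-address=<blade001-mac> server=dhcp-homelab comment="blade001"
add address=192.168.77.171 mac-address=<blade002-mac> server=dhcp-homelab comment="blade002"
add address=192.168.77.172 mac-address=<blade003-mac> server=dhcp-homelab comment="blade003"
add address=192.168.77.173 mac-address=<blade004-mac> server=dhcp-homelab comment="blade004"
add address=192.168.77.174 mac-address=<blade005-mac> server=dhcp-homelab comment="blade005"
```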
2.5 Configure DNS
/ip dns
set servers=8.8.8.8,1.1.1.1
set allow-remote-requests=yes
# Add static DNS entries for blades
/ip dns static
add name=blade001.zengarden.space address=192.168.77.170
add name=blade002.zengarden.space address=192.168.77.171
add name=blade003.zengarden.space address=192.168.77.172
add name=blade004.zengarden.space address=192.168.77.173
add name=blade005.zengarden.space address=192.168.77.174
2.6 Configure Interface Lists
Create interface lists for firewall rules:
/interface list
add name=LAN comment="Local networks"
add name=WAN comment="Internet-facing interfaces"
/interface list member
add list=LAN interface=bridge-home
add list=LAN interface=bridge-homelab
add list=WAN interface=pppoe-out1
2.7 Configure Firewall
Refer to Security documentation for complete firewall rules.
Network isolation firewall rules:
/ip firewall filter
# Fasttrack established/related connections (hardware offload)
add action=fasttrack-connection chain=forward connection-state=established,related hw-offload=yes comment="Fasttrack established/related"
add action=accept chain=forward connection-state=established,related,untracked comment="Accept established/related"
add action=drop chain=forward connection-state=invalid comment="Drop invalid"
# Drop WAN → LAN if not DSTNATed
add action=drop chain=forward connection-state=new connection-nat-state=!dstnat in-interface-list=WAN comment="Drop WAN → LAN not DSTNATed"
# Network segmentation rules
add action=accept chain=forward src-address=192.168.88.0/24 dst-address=192.168.77.0/24 protocol=tcp dst-port=22,80,443 comment="Allow Home → Homelab (SSH,HTTP,HTTPS)"
add action=drop chain=forward dst-address=192.168.77.0/24 src-address=192.168.88.0/24 comment="Block Home → Homelab (other)"
add action=drop chain=forward src-address=192.168.77.0/24 dst-address=192.168.88.0/24 comment="Block Homelab → Home"
# Allow LAN → Internet
add action=accept chain=forward src-address=192.168.77.0/24 out-interface-list=WAN comment="Allow Homelab → Internet"
add action=accept chain=forward src-address=192.168.88.0/24 out-interface-list=WAN comment="Allow Home → Internet"
# Default drop
add action=drop chain=forward comment="Drop all remaining"
Input chain protection:
/ip firewall filter
add action=accept chain=input connection-state=established,related,untracked comment="Accept established"
add action=drop chain=input connection-state=invalid comment="Drop invalid"
add action=accept chain=input protocol=icmp comment="Accept ICMP"
add action=accept chain=input in-interface-list=LAN comment="Accept from LAN"
add action=drop chain=input comment="Drop all else"
2.8 Configure NAT
/ip firewall nat
add action=masquerade chain=srcnat out-interface-list=WAN comment="Masquerade to Internet"
Step 3: Google OAuth Credentials
3.1 Create Google Cloud Project
- Navigate to Google Cloud Console
- Create new project: “homelab-k3s”
- Configure the OAuth consent screen (the Google+ API is shut down and is not required for OIDC)
3.2 Create OAuth 2.0 Client ID
- Navigate to APIs & Services > Credentials
- Click Create Credentials > OAuth 2.0 Client ID
- Application type: Web application
- Name: “K3s OIDC”
- Authorized redirect URIs:
  - http://localhost:8000
  - http://localhost:18000
- Click Create
- Save Client ID and Client Secret
3.3 Store Credentials
Create .env file for Ansible:
cd ../ansible/install-k3s
cp .env.template .env
nano .env
.env file:
# Google OIDC Credentials
GOOGLE_CLIENT_ID="<your-client-id>.apps.googleusercontent.com"
GOOGLE_CLIENT_SECRET="<your-client-secret>"
GOOGLE_ADMIN_EMAIL="<your-admin-email>"
Step 4: SSH Key Distribution
4.1 Generate SSH Key (if not exists)
ssh-keygen -t ed25519 -C "homelab-admin"
# Save to: ~/.ssh/id_ed25519_homelab
4.2 Copy SSH Key to All Blades
# Copy public key to each blade
for i in {170..174}; do
  ssh-copy-id -i ~/.ssh/id_ed25519_homelab.pub ubuntu@192.168.77.$i
done
# Test SSH access (should not prompt for password)
ssh -i ~/.ssh/id_ed25519_homelab ubuntu@192.168.77.170
4.3 Run Ansible Playbook (Optional)
Alternatively, use Ansible playbook:
cd ../ansible/setup-ssh-keys
ansible-playbook -i ../install-k3s/hosts.yaml setup.yaml
Step 5: NVMe Drive Setup
5.1 Verify NVMe Drives
SSH to each blade and check NVMe:
ssh ubuntu@192.168.77.170
lsblk
Expected output:
NAME      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1   259:0    0 238.5G  0 disk
5.2 Format and Mount NVMe
The NVMe drive should be formatted and mounted at /var/lib/rancher/k3s:
# Format NVMe drive (if not already formatted)
sudo mkfs.ext4 /dev/nvme0n1
# Create mount point
sudo mkdir -p /var/lib/rancher/k3s
# Add to /etc/fstab for persistent mounting
echo "/dev/nvme0n1 /var/lib/rancher/k3s ext4 defaults 0 0" | sudo tee -a /etc/fstab
# Mount
sudo mount -a
# Verify
df -h | grep nvme
Expected output:
/dev/nvme0n1   234G   15G  208G   7% /var/lib/rancher/k3s
Note: K3s will use this mount for:
- etcd data (/var/lib/rancher/k3s/server/db)
- Containerd data (/var/lib/rancher/k3s/agent/containerd)
- local-path PVCs (/var/lib/rancher/k3s/storage)
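One caveat: /dev/nvme0n1 names can shift if device enumeration ever changes, so the fstab entry is more robust when keyed on the filesystem UUID. A sketch (the `make_fstab_line` helper is introduced here for illustration):

```shell
# Build a UUID-keyed fstab line for the K3s data mount
make_fstab_line() {
  printf 'UUID=%s /var/lib/rancher/k3s ext4 defaults 0 0\n' "$1"
}

# On the blade:
#   UUID=$(sudo blkid -s UUID -o value /dev/nvme0n1)
#   make_fstab_line "$UUID" | sudo tee -a /etc/fstab
make_fstab_line "0c5a6b7e-demo-uuid"
```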
Step 6: K3s Cluster Installation
6.1 Configure Ansible Inventory
Edit hosts.yaml:
cd ../ansible/install-k3s
nano hosts.yamlhosts.yaml:
all:
  children:
    k3s_cluster:
      children:
        masters:
          hosts:
            blade001:
              ansible_host: 192.168.77.170
            blade002:
              ansible_host: 192.168.77.171
            blade003:
              ansible_host: 192.168.77.172
        workers:
          hosts:
            blade004:
              ansible_host: 192.168.77.173
            blade005:
              ansible_host: 192.168.77.174
      vars:
        ansible_user: ubuntu
        ansible_ssh_private_key_file: ~/.ssh/id_ed25519_homelab
        k3s_version: v1.32.4+k3s1
6.2 Review Ansible Playbook
Key configuration in install.yaml:
- K3s version: v1.32.4+k3s1
- CNI: Cilium (flannel disabled)
- Pod Security Admission: Restricted
- Secrets encryption: Enabled
- OIDC: Google integration
- Audit logging: Enabled
6.3 Run K3s Installation Playbook
# Dry run (check what will be done)
ansible-playbook -i hosts.yaml install.yaml --check
# Actual installation
ansible-playbook -i hosts.yaml install.yaml
Installation steps:
- Install K3s on blade001 (master, bootstrap)
- Retrieve K3s token from blade001
- Install K3s on blade002-003 (masters, join cluster)
- Install K3s on blade004-005 (workers, join cluster)
- Install Cilium CNI
- Configure OIDC for kubectl
Duration: ~20-30 minutes
6.4 Verify K3s Installation
# Copy kubeconfig to management workstation
scp ubuntu@192.168.77.170:~/.kube/config ~/.kube/config-homelab
# Point the kubeconfig at the node if it still references 127.0.0.1
sed -i 's/127.0.0.1/192.168.77.170/' ~/.kube/config-homelab
# Set KUBECONFIG
export KUBECONFIG=~/.kube/config-homelab
# Verify nodes
kubectl get nodes
Expected output:
NAME       STATUS   ROLES                       AGE   VERSION
blade001   Ready    control-plane,etcd,master   5m    v1.32.4+k3s1
blade002   Ready    control-plane,etcd,master   4m    v1.32.4+k3s1
blade003   Ready    control-plane,etcd,master   4m    v1.32.4+k3s1
blade004   Ready    <none>                      3m    v1.32.4+k3s1
blade005   Ready    <none>                      3m    v1.32.4+k3s1
Verify Cilium:
kubectl -n kube-system get pods -l k8s-app=cilium
Expected output:
NAME READY STATUS RESTARTS AGE
cilium-xxxxx 1/1 Running 0 5m
cilium-yyyyy 1/1 Running 0 5m
cilium-zzzzz 1/1 Running 0 5m
cilium-aaaaa 1/1 Running 0 5m
cilium-bbbbb 1/1 Running 0 5m
6.5 Install kubectl-oidc-login Plugin
For OIDC authentication:
# Install krew (kubectl plugin manager)
(
set -x; cd "$(mktemp -d)" &&
OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
KREW="krew-${OS}_${ARCH}" &&
curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
tar zxvf "${KREW}.tar.gz" &&
./"${KREW}" install krew
)
# Add krew to PATH
export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"
# Install oidc-login plugin
kubectl krew install oidc-login
Test OIDC authentication:
# This will open browser for Google login
kubectl get pods --all-namespaces
# First time: authenticate via browser
# Subsequent: uses cached tokens
Step 7: Restrictive HTTP Proxy Deployment
7.1 Configure Proxy Settings
cd ../ansible/install-restrictive-proxy
cp .env.template .env
nano .env
.env file:
# MikroTik Admin Credentials
MIKROTIK_HOST="192.168.77.1"
MIKROTIK_USERNAME="admin"
MIKROTIK_PASSWORD="<your-mikrotik-password>"
# Proxy Settings
PROXY_HOST="0.0.0.0"
PROXY_PORT="555"
PROXY_MODE="RESTRICT"  # or "WATCH" for testing
7.2 Deploy Proxy via Ansible
ansible-playbook -i ../install-k3s/hosts.yaml install.yaml
What this does:
- Installs Node.js on blade001-002
- Copies server.js and config
- Creates systemd service
- Starts and enables the service
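For reference, a unit for such a proxy might look roughly like the following. This is a hypothetical sketch only: the real unit name, install paths, and environment handling come from the Ansible playbook, not from this guide.

```ini
[Unit]
Description=Restrictive MikroTik REST proxy
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/node /opt/restrictive-proxy/server.js
EnvironmentFile=/etc/restrictive-proxy/.env
Restart=on-failure

[Install]
WantedBy=multi-user.target
```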
7.3 Verify Proxy
# SSH to blade001
ssh ubuntu@192.168.77.170
# Check service status
sudo systemctl status restrictive-proxy
# Test proxy (should allow DNS queries)
curl -X GET http://localhost:555/rest/ip/dns/static \
-H "Authorization: Basic <base64-encoded-credentials>"
# Test denied path (should return 403)
curl -X GET http://localhost:555/rest/system/user \
-H "Authorization: Basic <base64-encoded-credentials>"Post-Installation Verification
Verify Cluster Health
# Check node status
kubectl get nodes -o wide
# Check system pods
kubectl get pods --all-namespaces
# Check etcd health (K3s runs embedded etcd, so there is no etcd pod; query the API server instead)
kubectl get --raw /healthz/etcd
# Check Cilium health
kubectl -n kube-system exec -it $(kubectl -n kube-system get pod -l k8s-app=cilium -o jsonpath='{.items[0].metadata.name}') -- cilium status
Verify Storage
# Check NVMe mounts
ssh ubuntu@192.168.77.170 'df -h | grep nvme'
Verify Network
# Ping between nodes
ssh ubuntu@192.168.77.170 ping -c 3 192.168.77.171
# Ping gateway
ssh ubuntu@192.168.77.170 ping -c 3 192.168.77.1
# Ping internet
ssh ubuntu@192.168.77.170 ping -c 3 8.8.8.8
Troubleshooting
K3s Installation Fails
Error: Port 6443 already in use
# Check what's using port 6443
sudo ss -tulpn | grep 6443
# Kill process or remove old K3s installation
sudo /usr/local/bin/k3s-killall.sh
sudo /usr/local/bin/k3s-uninstall.sh
Error: Insufficient disk space
# Check disk space
df -h
# Clean up if needed
sudo apt clean
sudo apt autoremove
Cilium Pods Not Running
Check Cilium installation:
kubectl -n kube-system get pods -l k8s-app=cilium
# Check logs
kubectl -n kube-system logs -l k8s-app=cilium --tail=50
# Reinstall Cilium (if needed)
# Refer to Ansible playbook for correct Cilium helm values
OIDC Authentication Fails
Error: OIDC discovery failed
# Verify Google OAuth credentials
cat ../ansible/install-k3s/.env
# Check K3s API server logs
sudo journalctl -u k3s -f | grep oidc
# Verify redirect URIs in Google Cloud Console match:
# http://localhost:8000
# http://localhost:18000
Next Steps
Infrastructure deployed! Continue to:
- Core Platform Deployment - Deploy Helmfile-based platform services
- Applications Deployment - Set up GitOps workflow
Infrastructure deployment complete. Your K3s cluster is now ready for platform services!