diff --git a/README.md b/README.md
index 6432cac..c8828f4 100644
--- a/README.md
+++ b/README.md
@@ -2,101 +2,121 @@

 Infrastructure-as-Code for a 3-machine homelab running K3s.

-## Status
+## Architecture
+
+| Machine | IP | Role |
+|---|---|---|
+| Minisforum UM780 XTX | `192.168.7.77` | K3s control-plane |
+| nik-debian (HP ProDesk) | `192.168.7.183` | K3s storage agent |
+| Mac Mini M2 | `192.168.7.96` | Standalone Docker (outside cluster) |
+
+## Roadmap

 | Phase | Description | Status |
 |---|---|---|
-| 0 | Backup configs, init repo | ✅ Done |
-| 1 | Bootstrap Minisforum — K3s server + Traefik | ✅ Done |
-| 2 | Join Debian as K3s agent, SMB setup | ⏳ Next |
-| 3 | Deploy core infra — Gitea, Pi-hole, DDNS | 🔧 In progress |
-| 4 | Deploy app services — Jellyfin, qBittorrent, JDownloader, Dashy, Glances | 🔜 Planned |
-| 5 | Networking cutover — router, Traefik ingress, DNS | 🔜 Planned |
-| 6 | Cleanup legacy Debian services | 🔜 Planned |
+| P0 | CA web installer (`ca.home.arpa`) | ✅ Done |
+| P1 | Prometheus + Grafana + Loki | ✅ Done |
+| P2 | Authentik SSO | ✅ Done |
+| P3 | Photoview | ✅ Done |
+| P4 | Home Assistant | 🔜 Planned |
+| P5 | Portfolio site (`nik4nao.com`) | ✅ Done |
+| P6 | WireGuard split tunnel | 🔜 Planned |

-## Architecture
+## Live Services

-| Machine | IP | SSH Port | Role | Status |
-|---|---|---|---|---|
-| Minisforum UM780 XTX | 192.168.7.77 | 430 | K3s server, main gateway | ✅ Running — K3s + Traefik |
-| Debian Server (HP ProDesk) | TBD | — | K3s agent, SMB storage | ⏳ Phase 2 |
-| Mac Mini M2 | TBD | — | Standalone (outside cluster) | ⏳ Phase 3+ |
+| Service | URL | TLS | Notes |
+|---|---|---|---|
+| Traefik dashboard | `https://traefik.home.arpa` | Internal CA | Protected by Authentik |
+| Authentik | `https://auth.home.arpa` / `https://auth.nik4nao.com` | Internal CA / Let's Encrypt | SSO for all services |
+| Gitea | `https://gitea.nik4nao.com` | Let's Encrypt | Git + Docker registry |
+| Pi-hole (primary) | `https://pihole.home.arpa` | Internal CA | DNS, runs on Minisforum |
+| Pi-hole (secondary) | — | — | externalIPs on Debian (`192.168.7.183`) |
+| Grafana | `https://grafana.nik4nao.com` | Let's Encrypt | Protected by Authentik OIDC |
+| Prometheus | internal | — | kube-prometheus-stack |
+| Loki + Promtail | internal | — | Log aggregation |
+| Jellyfin | `https://jellyfin.home.arpa` | Internal CA | Media server, NFS storage |
+| qBittorrent | `https://qbittorrent.home.arpa` | Internal CA | `/mnt/storage/torrents` |
+| JDownloader | `https://jdownloader.home.arpa` | Internal CA | `/mnt/storage/dl` |
+| Dashy | `https://dashy.home.arpa` | Internal CA | Config via ConfigMap |
+| Glances | `https://glances.home.arpa` | Internal CA | DaemonSet + Debian Docker |
+| Photoview | `https://photoview.home.arpa` | Internal CA | NFS photo gallery |
+| Watch Party | `https://watch-party.nik4nao.com` | Let's Encrypt | Mac Mini, CI/CD deployed |
+| Portfolio | `https://nik4nao.com` | Let's Encrypt | Hugo + terminal theme |
+| CA installer | `http://ca.home.arpa` | — | Internal CA cert download page |

-## Internal Services (Minisforum)
+## Auth

-| Service | URL | Notes |
-|---|---|---|
-| Traefik | — | Ingress controller, Let's Encrypt |
-| Authentik | `https://authentik.home.arpa` | SSO/identity provider |
-| Gitea | `https://gitea.home.arpa` | Git + Docker registry, SSH on port 2222 |
-| Pi-hole | `https://pihole.home.arpa/admin` | Primary DNS, resolves `*.home.arpa` → 192.168.7.77 |
-| Grafana | `https://grafana.home.arpa` | Monitoring dashboards (kube-prometheus-stack) |
-| Jellyfin | `https://jellyfin.home.arpa` | Media server |
-| qBittorrent | `https://qbittorrent.home.arpa` | Torrent client |
-| JDownloader | `https://jdownloader.home.arpa` | Download manager |
-| Dashy | `https://dashy.home.arpa` | Dashboard |
-| Glances | `https://glances.home.arpa` | System monitoring |
-| Photoview | `https://photoview.home.arpa` | Photo gallery |
-| Portfolio | `https://nik4nao.com` | Public portfolio site |
+- **SSO:** Authentik at `auth.home.arpa` (internal) / `auth.nik4nao.com` (public)
+- **Protected services:** Traefik dashboard, Grafana (OIDC), Gitea (OIDC)
+- **MFA:** TOTP enforced, 8hr sessions
+- **Users:** `nik` (admin), `akadmin` (break-glass)
+- **Gitea:** local login disabled, OIDC only
+
+## TLS
+
+- **`*.home.arpa`** — internal self-signed CA (cert-manager). Install CA cert via `http://ca.home.arpa`
+- **`*.nik4nao.com`** — Let's Encrypt via HTTP-01 (Traefik)

 ## Repo Structure

 ```
 ansible/
-  inventory.yaml  # host definitions
+  inventory.yaml
   playbooks/
     bootstrap-minisforum.yaml  # OS hardening, packages, UFW, /data dirs
-    deploy-watch-party.yaml  # deploy watch-party app
+    deploy-watch-party.yaml  # deploy watch-party app on Mac Mini
     join-debian-agent.yaml  # join Debian as K3s agent
-    setup-gitea-runner.yaml  # set up Gitea Actions runner
-    setup-glances-debian.yaml  # deploy Glances on Debian host
+    setup-gitea-runner.yaml  # Gitea Actions runner (act_runner systemd)
+    setup-glances-debian.yaml  # Glances on Debian host
     setup-k3s.yaml  # K3s server install, Helm, kubeconfig
-    setup-monitoring.yaml  # deploy monitoring stack
-    setup-nfs-debian.yaml  # configure NFS server on Debian
+    setup-monitoring.yaml  # Prometheus + Grafana + Loki stack
+    setup-nfs-debian.yaml  # NFS server on Debian
   roles/
     common/  # user, SSH hardening, UFW, base packages
-    gitea-runner/  # Gitea Actions runner setup
+    gitea-runner/  # act_runner v0.2.11 systemd service
     glances/  # Glances system monitor
     k3s-agent/  # K3s agent node join
     k3s-server/  # K3s server install + Helm
-    monitoring/  # Prometheus/Grafana monitoring
+    monitoring/  # Prometheus/Grafana stack
     nfs-server/  # NFS server configuration
-    watch-party/  # Watch-party app deployment
+    watch-party/  # Watch Party Docker Compose on Mac Mini
 config/
-  dashy/conf.yaml  # Dashy dashboard config
+  dashy/conf.yaml  # Dashy dashboard config (applied via ConfigMap)
 manifests/
-  authentik/  # Authentik ingress, middleware, proxy outpost, secrets
-  cert-manager/  # ClusterIssuers and porkbun-secret.sh
-  core/  # Dashy, Glances, CA installer, CoreDNS config, apply-dashy-config.sh
-  gitea/  # Gitea PV, runner, backup, public ingress, runner secret
+  authentik/  # Ingress, middleware, proxy outpost, secrets
+  cert-manager/  # ClusterIssuers, porkbun-secret.sh
+  core/  # Dashy, Glances, CA installer, CoreDNS config
+  gitea/  # PV, runner, backup CronJob, public ingress
   media/  # Jellyfin, qBittorrent, JDownloader, Photoview
-  monitoring/  # Grafana/Loki datasource, PVs, grafana-secret.sh
-  network/  # DDNS, Traefik dashboard, ingress routes, pihole patch
+  monitoring/  # Grafana/Loki datasource ConfigMap, PVs, grafana-secret.sh
+  network/  # DDNS CronJob, Traefik dashboard, pihole-debian-patch.sh
   portfolio/  # Portfolio deployment, registry pull secret
 values/
-  authentik.yaml  # Authentik SSO
-  cert-manager.yaml  # cert-manager
-  gitea.yaml  # Gitea
-  kube-prometheus-stack.yaml  # Prometheus + Grafana
-  loki-stack.yaml  # Loki log aggregation
-  pihole.yaml  # Pi-hole (Minisforum)
-  pihole-debian.yaml  # Pi-hole (Debian)
-  traefik.yaml  # Traefik ingress controller
+  authentik.yaml
+  cert-manager.yaml
+  gitea.yaml
+  kube-prometheus-stack.yaml
+  loki-stack.yaml
+  pihole.yaml
+  pihole-debian.yaml
+  traefik.yaml
 ```

 ## Prerequisites

-- Ansible installed on your workstation: `pip install ansible`
+- Ansible on workstation: `pip install ansible`
 - Ansible collections: `ansible-galaxy collection install community.general ansible.posix`
-- SSH key at `~/.ssh/id_ed25519-nik-macbookair`
+- SSH key: `~/.ssh/id_ed25519-nik-macbookair`
+- kubectl + helm installed

 ## Connecting

 ```bash
 # SSH
-ssh minisforum # port 430, configured via ~/.ssh/config
+ssh minisforum # port 430, via ~/.ssh/config
+ssh nik-debian # port 22

-# Kubectl (after fetching kubeconfig)
+# Kubectl
 export KUBECONFIG=/tmp/k3s-minisforum.yaml
 kubectl get nodes
 kubectl get pods -A
@@ -104,53 +124,98 @@ kubectl get pods -A

 ## Deploying / Re-deploying

+### Ansible (host-level)
+
 ```bash
-# Re-run bootstrap (idempotent)
+# Bootstrap Minisforum
 ansible-playbook -i ansible/inventory.yaml ansible/playbooks/bootstrap-minisforum.yaml

-# Re-run K3s setup (idempotent)
+# K3s server
 ansible-playbook -i ansible/inventory.yaml ansible/playbooks/setup-k3s.yaml

+# Join Debian as agent
+ansible-playbook -i ansible/inventory.yaml ansible/playbooks/join-debian-agent.yaml
+
+# NFS on Debian
+ansible-playbook -i ansible/inventory.yaml ansible/playbooks/setup-nfs-debian.yaml
+
+# Gitea Actions runner
+ansible-playbook -i ansible/inventory.yaml ansible/playbooks/setup-gitea-runner.yaml
+
+# Glances on Debian
+ansible-playbook -i ansible/inventory.yaml ansible/playbooks/setup-glances-debian.yaml
+
+# Watch Party on Mac Mini
+ansible-playbook -i ansible/inventory.yaml ansible/playbooks/deploy-watch-party.yaml
+```
+
+### Helm (cluster services)
+
+```bash
 # Traefik
 helm repo add traefik https://helm.traefik.io/traefik && helm repo update
 helm upgrade --install traefik traefik/traefik \
-  --namespace traefik --create-namespace \
-  -f values/traefik.yaml
-
-# Gitea
-helm repo add gitea-charts https://dl.gitea.com/charts/ && helm repo update
-helm upgrade --install gitea gitea-charts/gitea \
-  --namespace gitea --create-namespace \
-  -f values/gitea.yaml
-
-# Pi-hole
-helm repo add mojo2600 https://mojo2600.github.io/pihole-kubernetes/ && helm repo update
-helm upgrade --install pihole mojo2600/pihole \
-  --namespace pihole --create-namespace \
-  -f values/pihole.yaml
+  --namespace traefik --create-namespace -f values/traefik.yaml

 # cert-manager
 helm repo add jetstack https://charts.jetstack.io && helm repo update
 helm upgrade --install cert-manager jetstack/cert-manager \
-  --namespace cert-manager --create-namespace \
-  -f values/cert-manager.yaml
+  --namespace cert-manager --create-namespace -f values/cert-manager.yaml
+
+# Gitea
+helm repo add gitea-charts https://dl.gitea.com/charts/ && helm repo update
+helm upgrade --install gitea gitea-charts/gitea \
+  --namespace gitea --create-namespace -f values/gitea.yaml
+
+# Pi-hole (Minisforum)
+helm repo add mojo2600 https://mojo2600.github.io/pihole-kubernetes/ && helm repo update
+helm upgrade --install pihole mojo2600/pihole \
+  --namespace pihole --create-namespace -f values/pihole.yaml
+# Note: re-run manifests/network/pihole-debian-patch.sh after every Pi-hole upgrade
+# (externalIPs for Debian secondary are lost on upgrade)

 # Authentik
 helm repo add authentik https://charts.goauthentik.io && helm repo update
 helm upgrade --install authentik authentik/authentik \
-  --namespace authentik --create-namespace \
-  -f values/authentik.yaml
+  --namespace authentik --create-namespace -f values/authentik.yaml

 # kube-prometheus-stack
 helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && helm repo update
 helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
-  --namespace monitoring --create-namespace \
-  -f values/kube-prometheus-stack.yaml
+  --namespace monitoring --create-namespace -f values/kube-prometheus-stack.yaml

 # Loki
 helm repo add grafana https://grafana.github.io/helm-charts && helm repo update
 helm upgrade --install loki grafana/loki-stack \
-  --namespace monitoring --create-namespace \
-  -f values/loki-stack.yaml
+  --namespace monitoring --create-namespace -f values/loki-stack.yaml
 ```
+### Secrets (create before applying manifests)
+
+```bash
+# Porkbun API (cert-manager DNS-01)
+bash manifests/cert-manager/porkbun-secret.sh
+
+# Grafana admin password
+bash manifests/monitoring/grafana-secret.sh
+
+# DDNS credentials
+bash manifests/network/ddns-secret.sh
+
+# Gitea Actions runner token
+bash manifests/gitea/runner-secret.sh
+```
+
+## Known Gotchas
+
+- **Gitea ROOT_URL:** changing `ROOT_URL` in `values/gitea.yaml` is not enough — must also delete the `gitea-inline-config` secret and re-run `helm upgrade`. Disable the built-in ingress (`ingress.enabled=false`) and use the manual IngressRoute in `manifests/gitea/`.
+- **Pihole secondary externalIPs:** lost on every Helm upgrade — re-run `manifests/network/pihole-debian-patch.sh` after each upgrade.
+- **Prometheus hostPath:** `/data/prometheus` requires `chmod -R 777` (owned by UID 65534).
+- **Grafana PVC:** use `local-path` dynamic provisioning — do not use a static hostPath PV, K3s overrides `storageClassName: ""`.
+- **Loki datasource:** provisioned via labeled ConfigMap (`grafana_datasource`), not the Grafana UI — the bundled plugin in loki-stack v2.9.3 is incompatible with Grafana 12.
+- **Authentik forwardAuth:** `Cookie` header must be in `authRequestHeaders` in the Traefik middleware or you get an infinite redirect loop after login.
+- **Traefik v3 `api@internal`:** requires both `traefik` and `websecure` entrypoints in the IngressRoute, otherwise 404.
+- **CoreDNS custom config:** use `.server` suffix for zone blocks. `.override` suffix cannot contain `zone {}` syntax — crashes CoreDNS.
+- **Photoview video:** `PHOTOVIEW_DISABLE_VIDEO=true` only takes effect on a fresh scan — delete the SQLite DB and restart before rescanning.
+- **CI/CD Buildkit CA:** internal CA must be injected into the `buildx_buildkit_multiarch0` container on every CI run (does not persist across restarts).
+- **Pihole DNS:** no wildcard support — every new `home.arpa` subdomain needs an explicit entry in `values/pihole.yaml`.
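
Review note on the "Authentik forwardAuth" and "Traefik v3 `api@internal`" gotchas: the two requirements might look roughly like the pair of manifests below. This is a sketch under stated assumptions, not the actual contents of `manifests/authentik/` or `manifests/network/`. The resource names, the `traefik` namespace, the outpost Service address, the forwarded `X-authentik-*` response headers, and the TLS secret name are all hypothetical and chosen only for illustration.

```yaml
# Sketch only: names, namespace, outpost address and secret name are assumptions.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: authentik-forward-auth        # hypothetical name
  namespace: traefik                  # hypothetical namespace
spec:
  forwardAuth:
    # Hypothetical in-cluster address of the Authentik proxy outpost.
    address: http://ak-outpost.authentik.svc.cluster.local:9000/outpost.goauthentik.io/auth/traefik
    trustForwardHeader: true
    # Once this list is set, only the listed request headers are copied to the
    # auth server, so Cookie has to be named explicitly.
    authRequestHeaders:
      - Cookie
    # Illustrative subset of headers passed back to the protected service.
    authResponseHeaders:
      - X-authentik-username
      - X-authentik-groups
      - X-authentik-email
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard             # hypothetical name
  namespace: traefik                  # hypothetical namespace
spec:
  # Both entrypoints listed, as the api@internal gotcha requires.
  entryPoints:
    - traefik
    - websecure
  routes:
    - match: Host(`traefik.home.arpa`)
      kind: Rule
      middlewares:
        - name: authentik-forward-auth
      services:
        - name: api@internal
          kind: TraefikService
  tls:
    secretName: traefik-dashboard-tls # hypothetical secret issued by the internal CA
```

Traefik forwards all request headers to the auth server when `authRequestHeaders` is unset or empty. As soon as the list is present, only the named headers are copied, which is consistent with the redirect loop described above when `Cookie` is left out.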