Cluster Bootstrap¶
This page documents the full process of bootstrapping the cluster from scratch. All steps use just recipes defined in pitower/talos/justfile.
Prerequisites¶
Before starting, ensure the following are available:
- talosctl installed (matching Talos v1.12.4)
- kubectl installed
- just command runner installed
- sops installed and configured with the correct age key or GPG key
- kustomize installed (for building CNI addons)
- All nodes are powered on and booted into Talos maintenance mode
- Network connectivity between your workstation and all node IPs
-
secrets.sops.yamlpresent inpitower/talos/
SOPS Key Required
The secrets.sops.yaml file contains the cluster secrets (CA certs, tokens, encryption keys). You must have the corresponding SOPS decryption key (age or GPG) available in your environment.
Bootstrap Steps¶
Overview¶
flowchart TD
A[1. Decrypt Secrets] --> B[2. Generate Configs]
B --> C[3. Patch Per-Node Configs]
C --> D[4. Apply Control Plane Configs]
D --> E[5. Apply Worker Configs]
E --> F[6. Bootstrap Cluster]
F --> G[7. Get Kubeconfig]
G --> H[8. Apply CNI Addons]
H --> I[9. Verify Cluster Health] Step 1: Generate Base Machine Configs¶
This recipe:
- Decrypts
secrets.sops.yamltosecrets.yamlusing SOPS - Runs
talosctl gen configwith:- Cluster name:
home-ops - Endpoint:
https://192.168.0.200:6443 - Global patch:
patches/general.patch - Control plane patch:
patches/controlplane.patch
- Cluster name:
- Outputs
controlplane.yamlandworker.yamlintoclusterconfig/
The decrypted secrets.yaml is written to disk temporarily. It is not committed to Git.
Step 2: Apply Per-Node Patches¶
This applies node-specific patches from patches/nodes/ to create individual config files:
| Base Config | Node Patches Applied |
|---|---|
controlplane.yaml | worker-01.patch, worker-02.patch, worker-03.patch |
worker.yaml | worker-04.patch, worker-05.patch, worker-06.patch |
worker.yaml | worker-pi-01.patch, worker-pi-02.patch, worker-pi-03.patch |
Output files are written to clusterconfig/ (e.g., worker-01.yaml, worker-pi-02.yaml).
Step 3: Apply Configs to Control Plane Nodes¶
Applies the generated configs to the three control plane nodes:
| Node | IP | Config File |
|---|---|---|
| worker-01 | 192.168.0.201 | worker-01.yaml |
| worker-02 | 192.168.0.202 | worker-02.yaml |
| worker-03 | 192.168.0.203 | worker-03.yaml |
Apply Order
Control plane configs must be applied before workers. The control plane nodes form the etcd cluster that workers will join.
Step 4: Apply Configs to Worker Nodes¶
Applies configs to the worker nodes:
| Node | IP | Config File |
|---|---|---|
| worker-04 | 192.168.0.204 | worker-04.yaml |
| worker-pi-01 | 192.168.0.211 | worker-pi-01.yaml |
| worker-pi-02 | 192.168.0.212 | worker-pi-02.yaml |
| worker-pi-03 | 192.168.0.213 | worker-pi-03.yaml |
Step 5: Bootstrap the Cluster¶
This recipe:
- Bootstraps etcd on the first control plane node (
192.168.0.201) - Retrieves kubeconfig from the VIP (
192.168.0.200)
Bootstrap is a one-time operation
The bootstrap command should only be run once during initial cluster creation. Running it again on an existing cluster can corrupt etcd.
Step 6: Verify Talos Node Health¶
Before applying addons, confirm all nodes are up:
Expected output should show all nodes with their IPs and roles.
Post-Bootstrap¶
Apply CNI and Addons¶
The cluster starts with no CNI (cni.name: none) and no kube-proxy (proxy.disabled: true). Cilium and the kubelet CSR approver must be installed as the first addons:
This runs:
The addons kustomization includes:
| Addon | Namespace | Purpose |
|---|---|---|
| Cilium (v1.18.7) | kube-system | CNI, kube-proxy replacement, network policies |
| kubelet-csr-approver (v1.2.13) | system-controllers | Auto-approves kubelet certificate signing requests |
Cilium Must Be Applied First
Pods will remain in Pending state until Cilium is installed because there is no CNI to assign pod IPs. The just addons step is critical and must be run immediately after bootstrap.
Verify Cluster Health¶
After addons are applied, verify the cluster is healthy:
All Cilium agent pods should be Running.
Core system pods (kube-apiserver, kube-controller-manager, kube-scheduler, etcd) should be Running on control plane nodes.
Troubleshooting¶
Nodes not appearing after bootstrap
- Verify network connectivity:
talosctl get addresses -n <node-ip> - Check if the node received its config:
talosctl get machineconfig -n <node-ip> - Ensure the bootstrap node (192.168.0.201) completed etcd initialization
Pods stuck in Pending after bootstrap
This is expected until Cilium is installed. Run just addons to install the CNI.
Certificate errors after bootstrap
The kubelet CSR approver handles automatic certificate approval. If nodes show certificate errors:
Once the kubelet-csr-approver is running, future CSRs are approved automatically.
SOPS decryption fails
Ensure your SOPS key is available:
- age: Set
SOPS_AGE_KEY_FILEenvironment variable - GPG: Ensure the GPG key is in your keyring