Kubernetes
Cluster Operations
Managed K8s (EKS/GKE), Upgrades, and Node Pools.
Kubernetes Clusters
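Kubernetes is often called the operating system of the cloud. Running it involves two parts: the Control Plane (the API server, etcd, scheduler, and controllers, historically called "the master") and the Data Plane (the worker nodes that run your pods).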
Managed vs. Self-Hosted
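- Hard Mode (Kops/Kubeadm): You run the Control Plane (etcd, api-server) yourself on your own machines, for example EC2 instances. That makes backups, upgrades, and high availability your problem. Don't do this unless you are a bank or a massive tech corp.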
- Easy Mode (EKS/GKE/AKS): AWS, Google, or Microsoft manages the Control Plane. You just manage the Worker Nodes.
Node Pools (Data Plane)
Organize your worker nodes into groups based on hardware.
- General Purpose: M5/T3 instances. For web apps.
- Compute Optimized: C5. For batch processing.
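- Spot Instances: Cheap instances that the provider can reclaim at short notice (AWS gives a two-minute warning). Great for stateless workers, but you must handle interruptions with a Graceful Shutdown: catch SIGTERM, finish or hand back in-flight work, then exit.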
Upgrade Strategy
Upgrading Kubernetes is terrifying. Upgrading the worker nodes means replacing them, so every pod in your cluster gets evicted and restarted on a new node at some point.
The Blue/Green Node Strategy:
- You are on Version 1.26 (Node Group A).
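- Upgrade the Control Plane to 1.27 first (nodes must never run a newer version than the API server), then spin up a new Node Group B on Version 1.27.
- Cordon (or taint) every node in Group A so no new pods land there.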
- Drain Group A. This evicts pods, forcing them to reschedule onto Group B.
- Once empty, delete Group A.
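The cordon-and-drain part of the steps above can be sketched as a small planner that only builds the kubectl commands and does not run them. Treat it as an illustration under assumptions: the node names are hypothetical, the 300-second timeout is an arbitrary choice, and `kubectl cordon` is used here in place of a hand-written taint because it is the standard way to mark a node unschedulable.

```python
def drain_plan(old_nodes, timeout="300s"):
    """Return the kubectl commands (as argv lists) that retire a node group."""
    commands = []
    # Cordon every old node up front, so pods evicted from one old node
    # cannot be rescheduled onto another old node.
    for node in old_nodes:
        commands.append(["kubectl", "cordon", node])
    # Then drain the nodes one at a time. DaemonSet pods are skipped because
    # their controller would recreate them on the same node anyway.
    for node in old_nodes:
        commands.append([
            "kubectl", "drain", node,
            "--ignore-daemonsets",
            "--delete-emptydir-data",
            f"--timeout={timeout}",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical node names standing in for Node Group A.
    for argv in drain_plan(["group-a-node-1", "group-a-node-2"]):
        print(" ".join(argv))
```

Printing the plan first gives you something to review before anything is evicted. Each argv list can then be run by hand or passed to `subprocess.run`.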
Pod Disruption Budgets (PDB)
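When draining nodes, K8s evicts every pod on the node without checking how many replicas remain elsewhere. If all your API replicas sit on the nodes being drained, they can all go down at once.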
Define a PDB (minAvailable: 1) to tell K8s: "You can move me, but ensure at least 1 replica is always running."
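As a rough sketch, the PDB described above could be generated like this. The resource name `api-pdb` and the `app: api` label are assumptions for illustration, and `eviction_allowed` is a deliberately simplified model of the check Kubernetes performs, not its actual implementation.

```python
import json


def pod_disruption_budget(name, app_label, min_available=1):
    """Build a PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            # Must match the labels on the pods the budget protects.
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def eviction_allowed(healthy_replicas, min_available):
    """Simplified model: evicting one pod must leave min_available healthy."""
    return healthy_replicas - 1 >= min_available


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod_disruption_budget("api-pdb", "api"), indent=2))
```

Note the edge case the model exposes: with a single replica and minAvailable: 1, `eviction_allowed(1, 1)` is false, so a drain blocks until you scale up or relax the budget. A PDB only helps when there are at least two replicas.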