Module Review
You’ve mastered the art of Kubernetes Scaling. From pod-level resizing to cluster-wide expansion, let’s solidify these concepts.
Key Takeaways
- HPA Scales Out: It adds replicas based on `utilization = current usage / request`. It needs `resources.requests` to be set.
- VPA Scales Up: It changes pod `requests` based on historical usage. It requires a pod restart (unless `updateMode: Off`, which only publishes recommendations).
- CA Scales Nodes: It reacts to Pending Pods, not high CPU. It adds nodes when pods can’t schedule.
- Metrics Server is Critical: It’s the source of truth for HPA/VPA. It holds no history (only the most recent sample, roughly the last 60s).
- DaemonSets are Unique: They have no replica count. The DaemonSet controller runs one pod on every eligible node instead.
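To make the HPA takeaway concrete, here is a minimal Python sketch of the utilization ratio and the replica calculation described in the Kubernetes HPA documentation (`desired = ceil(current replicas * current / target)`, with a default tolerance of 10%). The function names are hypothetical, and the sketch deliberately ignores min/max replica bounds, stabilization windows, and unready pods.

```python
import math


def cpu_utilization(usage_millicores: float, request_millicores: float) -> float:
    """Return usage as a fraction of the request (0.75 means 75%)."""
    if request_millicores <= 0:
        # Without resources.requests there is no denominator to divide by.
        raise ValueError("resources.requests must be set to compute utilization")
    return usage_millicores / request_millicores


def desired_replicas(current_replicas: int, utilization: float,
                     target_utilization: float, tolerance: float = 0.1) -> int:
    """Simplified HPA rule: scale in proportion to utilization / target."""
    ratio = utilization / target_utilization
    # Inside the tolerance band the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 replicas averaging 375m against a 500m request, with a 50% target.
utilization = cpu_utilization(375, 500)        # 0.75
print(desired_replicas(4, utilization, 0.5))   # 6
```

Note how the missing-request case fails before any scaling decision is made, which is why HPA cannot act on CPU utilization when `resources.requests` is unset.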
Module Review: Scaling & Operations
1. Flashcards
Test your recall.
What triggers the Cluster Autoscaler?
Pending Pods
CA only scales up when a pod cannot be scheduled due to insufficient resources. It does NOT scale on high CPU usage alone.
Why is HPA + VPA on CPU dangerous?
Feedback Loop
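Both react to the same signal: HPA adds pods on high CPU, while VPA raises requests on high CPU. Because HPA measures utilization against the request, every VPA change also moves HPA's baseline, so the two fight each other. The result can be larger pods AND more replicas, wasting resources.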
What happens if `resources.requests` is missing?
HPA Fails
HPA cannot calculate utilization percentage without a request value. CPU scaling will not work.
Does Metrics Server store history?
No
It only stores the latest scrape (window of ~60s). For history, you need Prometheus.
How do you run a pod on the Master node?
Tolerations
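You must add a `toleration` for the `node-role.kubernetes.io/control-plane:NoSchedule` taint (older clusters use `node-role.kubernetes.io/master:NoSchedule`). A toleration only permits scheduling there; it does not force the pod onto that node.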
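The toleration flashcard can be sketched in code. The snippet below is a simplified Python model of how a toleration is matched against a taint (effect, then key, then operator), not the scheduler's actual implementation. The class and function names are hypothetical, and the control-plane taint key is used as the example (older clusters use the `master` key instead).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key + "Exists" tolerates every key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect tolerates every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches this taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.operator == "Equal" and tol.value == taint.value


def can_schedule(tolerations: list, taints: list) -> bool:
    """Every hard taint on the node must be tolerated; soft taints never block."""
    hard = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


control_plane = Taint("node-role.kubernetes.io/control-plane", "NoSchedule")
toleration = Toleration(key="node-role.kubernetes.io/control-plane",
                        operator="Exists", effect="NoSchedule")

print(can_schedule([], [control_plane]))            # False
print(can_schedule([toleration], [control_plane]))  # True
```

This only answers whether the node is allowed. Pinning the pod to that node would additionally need a node selector or node affinity.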
2. Cheat Sheet: The Scaling Triad
| Feature | HPA | VPA | Cluster Autoscaler |
|---|---|---|---|
| Direction | Horizontal (More Replicas) | Vertical (Larger Replicas) | Infrastructure (More Nodes) |
| Trigger | CPU/Mem Utilization > Target | Historical Usage > Request | Pending Pods |
| Action | Updates `replicas` in Deployment | Updates `requests` in Pod Spec | Calls Cloud Provider API |
| Downtime | None (Zero downtime) | Yes (Pod Restart) | None (for existing pods) |
| Best For | Stateless Microservices | Java Apps / Databases / Monoliths | Any Cluster |
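The "Pending Pods" trigger in the Cluster Autoscaler column can be illustrated with a toy model. The Python sketch below is an assumption-heavy simplification: it considers CPU requests only and uses first-fit packing, whereas the real Cluster Autoscaler simulates the scheduler with all of its constraints. The function name and parameters are hypothetical.

```python
def nodes_to_add(pending_cpu_requests: list, free_cpu_per_node: list,
                 new_node_cpu: int) -> int:
    """Estimate how many new nodes the pending pods would need (millicores)."""
    free = list(free_cpu_per_node)
    unschedulable = []

    # Step 1: try to place each pending pod on existing free capacity.
    for request in sorted(pending_cpu_requests, reverse=True):
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] -= request
                break
        else:
            unschedulable.append(request)

    # Step 2: pack whatever is left onto hypothetical new nodes.
    new_nodes = []
    for request in unschedulable:
        if request > new_node_cpu:
            continue  # a node of this size would not help this pod
        for i, capacity in enumerate(new_nodes):
            if request <= capacity:
                new_nodes[i] -= request
                break
        else:
            new_nodes.append(new_node_cpu - request)
    return len(new_nodes)


# Two existing nodes with 600m and 200m free; new nodes offer 2000m each.
print(nodes_to_add([500, 500, 1500], [600, 200], 2000))  # 1

# Nodes are completely full (high CPU), but nothing is pending: no scale-up.
print(nodes_to_add([], [0, 0], 2000))                    # 0
```

The second call is the point of the table's Trigger row: a saturated cluster with no unschedulable pods gives the autoscaler nothing to react to.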
3. Next Steps
Now that your cluster scales perfectly, how do you secure it?