CDN: The Global Warehouse
[!TIP] Interview Insight: If asked “How do you scale a news website?”, a CDN is almost always part of the answer. For a static-heavy site it can offload the large majority of traffic (often quoted as 95%+) from your servers.
1. The “Amazon Prime” Analogy
Imagine you are selling a popular book. Your warehouse is in Seattle.
- Customer A (London) orders it. Delivery: 5 days.
- Customer B (Tokyo) orders it. Delivery: 7 days.
Solution: You rent small warehouses (PoPs) in London, Tokyo, and New York. You ship bulk copies there before orders come in. Now delivery takes 1 day. CDN (Content Delivery Network) does this for data.
2. Under the Hood: How it Works
A. Points of Presence (PoPs)
A CDN isn’t one server; it’s thousands of servers scattered across the globe. These locations are called PoPs (Points of Presence).
- Origin Server: The “Source of Truth” (Your S3 Bucket or EC2 instance).
- Edge Server: The CDN server physically closest to the user.
B. Anycast DNS (The Magic)
How does the user find the nearest PoP?
- Normally (Unicast), one IP address = one specific server.
- Anycast: One IP address (e.g., 1.1.1.1) maps to multiple physical servers globally.
- When you ping 1.1.1.1, BGP (Border Gateway Protocol) routes you to the nearest server in network terms, which is usually (but not always) the physically closest one.
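As a rough mental model of the routing described above: every PoP announces the same IP, and the user's network picks the announcement with the shortest route. The sketch below uses made-up PoP names and AS-path lengths (both are assumptions for illustration), and real BGP best-path selection weighs more attributes than path length alone.

```javascript
// Hypothetical announcements for one anycast IP, as seen from a single
// user's network. The asPathLength values are invented for illustration.
const announcements = [
  { pop: 'London', asPathLength: 2 },
  { pop: 'Tokyo', asPathLength: 5 },
  { pop: 'New York', asPathLength: 4 },
];

// Pick the announcement with the shortest AS path (a simplification of BGP).
function pickPop(routes) {
  if (routes.length === 0) throw new Error('no routes announced');
  return routes.reduce((best, r) => (r.asPathLength < best.asPathLength ? r : best));
}

const chosen = pickPop(announcements);
console.log(`Traffic for 1.1.1.1 goes to ${chosen.pop}`);
```

A user on a different network would see different path lengths for the same IP, which is how the same address can lead to different PoPs.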
3. The Security Shield: DDoS Protection
CDNs act as a massive distributed shield. When a DDoS Attack (Distributed Denial of Service) happens, millions of bad requests flood the target IP.
- Without CDN: Your single origin server gets 1M req/sec. It dies instantly.
- With CDN: The 1M requests are scattered across 200+ PoPs globally.
- PoP A takes 5000 requests.
- PoP B takes 5000 requests.
- The attack is diluted by the sheer size of the CDN network.
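The arithmetic behind the dilution can be checked in a few lines. This sketch assumes the attack traffic spreads evenly across PoPs, which is an idealization: real attack traffic tends to land on the PoPs nearest the attacking machines.

```javascript
// Average requests per second each PoP absorbs if traffic spreads evenly.
function perPopLoad(totalReqPerSec, popCount) {
  if (popCount <= 0) throw new Error('popCount must be positive');
  return totalReqPerSec / popCount;
}

console.log(perPopLoad(1_000_000, 1));   // 1000000 (single origin)
console.log(perPopLoad(1_000_000, 200)); // 5000 per PoP
```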
4. Controlling the Cache (Headers)
How does the CDN know what to cache and for how long?
A. Cache-Control
The most important HTTP header for caching.
- public: Can be cached by anyone (CDN, ISP, Browser).
- private: Can only be cached by the user’s browser (e.g., “My Profile” page).
- no-store: Never cache anything (Banking data).
- max-age=3600: Cache for 1 hour (3600 seconds).
- s-maxage=3600: Shared Max Age. Overrides max-age for shared caches such as CDNs. Useful if you want browsers to cache for 1 minute (max-age=60) but CDNs for 1 hour.
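To make the interaction between these directives concrete, here is a minimal sketch of how a cache might decide how long to keep a response. The function names `parseCacheControl` and `ttlSeconds` are hypothetical, and the logic is deliberately simplified: it ignores quoted values, `no-cache`, and heuristic caching.

```javascript
// Parse a Cache-Control value into a map of directive -> value (or true).
function parseCacheControl(header) {
  const directives = new Map();
  for (const part of header.split(',')) {
    const [name, value] = part.trim().split('=');
    if (name) directives.set(name.toLowerCase(), value === undefined ? true : value);
  }
  return directives;
}

// Seconds a cache may keep the response; 0 means "do not cache".
// isSharedCache is true for a CDN, false for a browser.
function ttlSeconds(header, isSharedCache) {
  const d = parseCacheControl(header);
  if (d.has('no-store')) return 0;
  if (isSharedCache && d.has('private')) return 0;
  if (isSharedCache && d.has('s-maxage')) return Number(d.get('s-maxage'));
  if (d.has('max-age')) return Number(d.get('max-age'));
  return 0; // simplification: no heuristic caching
}

const header = 'public, max-age=60, s-maxage=3600';
console.log(ttlSeconds(header, false)); // 60   (browser)
console.log(ttlSeconds(header, true));  // 3600 (CDN)
```

The same header yields two different lifetimes, which is the "1 minute in browsers, 1 hour on the CDN" setup described above.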
B. Invalidation (Purging)
What if you update your website? The CDN still has the old version!
- TTL Expiry: Wait for max-age to expire (Passive).
- Purge: Manually tell the CDN “Delete /index.html” (Active). On most modern CDNs this propagates globally within seconds, though the exact time depends on the provider.
- Versioning: Change the filename (style-v2.css). The CDN has never seen the new URL, so this forces a cache miss.
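Versioning works because a CDN caches by URL, so a new filename is a new cache entry. A minimal sketch of generating such names, assuming a simple numeric version (the helper `versionedName` is hypothetical; build tools commonly embed a content hash instead of a counter):

```javascript
// Insert a version tag before the file extension: style.css -> style-v2.css
function versionedName(filename, version) {
  const dot = filename.lastIndexOf('.');
  if (dot <= 0) return `${filename}-v${version}`; // no extension to preserve
  return `${filename.slice(0, dot)}-v${version}${filename.slice(dot)}`;
}

console.log(versionedName('style.css', 2)); // style-v2.css
console.log(versionedName('style.css', 3)); // style-v3.css
```

The HTML that references the file must be updated to the new name, which is why the HTML itself is usually given a short TTL.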
5. The Future: Edge Computing (Workers)
Modern CDNs (Cloudflare, AWS CloudFront) are no longer just “dumb caches”. They run code.
Edge Workers allow you to execute JavaScript/WASM functions at the PoP, milliseconds away from the user.
Why do we need this?
Normally, any logic (Authentication, Database Logic) requires a trip to the Origin Server (Slow). Edge Workers let you run small pieces of logic at the Edge (Fast).
Common Use Cases
- Authentication: Verify a JWT at the edge. If it is invalid, reject the request immediately (latency: roughly 10 ms). Don’t even bother the Origin.
- A/B Testing: Assign a user to “Experiment Group A” or “B” at the edge and serve different HTML.
- Custom Headers: Add security headers (HSTS, X-Frame-Options) on the fly.
- Geo-Routing: Detect user country and redirect to uk.site.com or us.site.com instantly.
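The A/B testing case above needs an assignment that is stable per user without a database lookup. One common approach is to hash a user identifier into a bucket. The sketch below is an assumption-laden illustration: `assignGroup` is a hypothetical helper, and the hash is an FNV-1a-style function over UTF-16 code units, which is deterministic but not cryptographic.

```javascript
// FNV-1a-style 32-bit hash: small and deterministic, not cryptographic.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Assign a user to group 'A' or 'B'. percentInA is the share of users in A.
function assignGroup(userId, percentInA = 50) {
  return fnv1a(userId) % 100 < percentInA ? 'A' : 'B';
}

console.log(assignGroup('user-123')); // same user always gets the same group
```

Because the result depends only on the identifier, every PoP reaches the same decision for the same user without coordinating.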
Edge Auth Flow
if (isValid(token)) {
  pass()
} else {
  block()
}
Example: Cloudflare Worker
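addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

// Placeholder check (an assumption, not real verification): a production
// Worker would verify the JWT signature and expiry here.
function isValid(token) {
  return typeof token === 'string' && token.startsWith('Bearer ')
}

async function handleRequest(request) {
  // 1. Check auth at the edge (no trip to the Origin!)
  const token = request.headers.get('Authorization')
  if (!isValid(token)) {
    return new Response('Forbidden', { status: 403 })
  }
  // 2. Respond from the edge (personalization logic would go here)
  return new Response('Hello from the Edge!', { status: 200 })
}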
6. Interactive Demo: Global Latency Map & Purge
Visualize the speed of light.
- Origin: San Francisco (USA).
- User: London (UK).
- Action: Fetch via CDN. The first time is slow (Miss). The second time is fast (Hit).
- Purge: Clear the cache and force a new Miss.
- Edge Worker: Simulate logic running at the edge (no trip to Origin).
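The Miss, Hit, and Purge sequence can be mimicked with a toy in-memory cache. This is only a sketch of the behavior: `EdgeCache` is a hypothetical class, the origin fetch is simulated by a plain function, and there is no TTL handling.

```javascript
// Toy edge cache: a Map in front of a (simulated) origin fetch.
class EdgeCache {
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.store = new Map();
  }

  // Returns { body, status } where status is 'MISS' or 'HIT'.
  get(path) {
    if (this.store.has(path)) {
      return { body: this.store.get(path), status: 'HIT' };
    }
    const body = this.fetchFromOrigin(path); // the slow trip to the origin
    this.store.set(path, body);
    return { body, status: 'MISS' };
  }

  // Drop one path so the next request goes back to the origin.
  purge(path) {
    this.store.delete(path);
  }
}

const londonPop = new EdgeCache((path) => `content of ${path} from San Francisco`);
console.log(londonPop.get('/index.html').status); // MISS
console.log(londonPop.get('/index.html').status); // HIT
londonPop.purge('/index.html');
console.log(londonPop.get('/index.html').status); // MISS
```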
7. Summary
- Static Content: Always put it on a CDN.
- Dynamic Content: Use Edge Computing or Cache-Control headers wisely.
- Security: CDNs also provide DDoS protection (they act as a giant shield).