What Is a CDN?
A Content Delivery Network is a geographically distributed set of servers that cache and serve content from locations close to end users. The goal is simple: reduce latency by serving requests from a nearby edge instead of making the long round-trip to your origin server.
Without a CDN, a user in Mumbai requesting your JavaScript bundle stored in us-east-1 endures ~200 ms of round-trip time before the first byte even arrives. Add TLS handshake (~150 ms), TCP slow start, and HTTP request overhead — your 50 KB asset takes 500+ ms. With a CDN edge node in Mumbai, that same request completes in 8–15 ms.
Real numbers from production:
- Cloudflare: 300+ PoPs, serves ~20% of all web traffic
- Akamai: 4,000+ PoPs, 130 Tbps capacity
- Fastly: ~90 PoPs, optimized for real-time cache purging (<150 ms global purge)
- AWS CloudFront: 450+ PoPs across 90+ cities
Points of Presence (PoPs) and Edge Nodes
A Point of Presence (PoP) is a physical datacenter location operated by the CDN. Each PoP contains multiple edge servers (cache nodes) that store cached copies of your content.
┌─────────────────────┐
│ Origin Server │
│ (us-east-1) │
└──────────┬──────────┘
│
┌──────────────────────┼──────────────────────┐
│ │ │
┌────────▼───────┐ ┌─────────▼──────┐ ┌─────────▼──────┐
│ PoP: London │ │ PoP: Mumbai │ │ PoP: Tokyo │
│ (edge nodes) │ │ (edge nodes) │ │ (edge nodes) │
└────────┬───────┘ └───────┬────────┘ └────────┬───────┘
│ │ │
EU users IN/SG users JP/KR users
Tiered Caching (Origin Shield)
A naive CDN has every edge node independently fetching from origin on cache miss — a thundering-herd problem during traffic spikes. Origin Shield (also called mid-tier caching) adds a regional aggregation layer:
Edge Node (London) ──┐
Edge Node (Paris) ──┼──► Shield Node (EU) ──► Origin
Edge Node (Dublin) ──┘ (single cache)
Only the Shield node talks to origin. Edge nodes that miss go to the Shield node, not origin. This collapses N simultaneous cache misses into 1 origin request. AWS CloudFront calls this “Origin Shield”; Cloudflare calls it “Tiered Cache”; Fastly calls it “Shielding”.
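The shield's miss-collapsing behaviour is essentially request coalescing ("single-flight"): concurrent misses for the same key wait on one origin fetch. A minimal Python sketch, with invented names (`ShieldCache`, `fetch_origin`), not any vendor's implementation:

```python
import threading

class ShieldCache:
    """Toy shield node: concurrent cache misses for the same key are
    collapsed into a single origin fetch; followers wait for the leader."""

    def __init__(self, fetch_origin):
        self.fetch_origin = fetch_origin   # callable: key -> content
        self.cache = {}
        self.inflight = {}                 # key -> Event followers wait on
        self.lock = threading.Lock()

    def get(self, key):
        leader = False
        with self.lock:
            if key in self.cache:
                return self.cache[key]          # HIT: no origin traffic
            event = self.inflight.get(key)
            if event is None:                   # first miss becomes leader
                event = threading.Event()
                self.inflight[key] = event
                leader = True
        if leader:
            value = self.fetch_origin(key)      # the ONE origin request
            with self.lock:
                self.cache[key] = value
                del self.inflight[key]
            event.set()                         # wake all followers
            return value
        event.wait()                            # followers block, not origin
        with self.lock:
            return self.cache[key]
```

Eight simultaneous misses produce one origin request; the other seven are served from the freshly filled cache.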
Cache hit rate improvement: A large media platform moving from flat to tiered caching typically sees origin traffic drop by 60–80%.
How DNS Routes You to the Nearest Edge
CDNs use Anycast or DNS-based GeoDNS to direct users to the nearest PoP.
GeoDNS (Used by Cloudflare, Akamai)
The CDN operates authoritative DNS servers that return different IP addresses based on the resolver’s geographic location:
User in Mumbai resolves cdn.example.com:
1. Browser asks local resolver (Jio DNS: 49.44.x.x)
2. Local resolver asks CDN authoritative DNS
3. CDN DNS sees resolver IP → geolocates to India
4. Returns IP of Mumbai PoP: 104.18.x.x
5. Browser connects to Mumbai edge node
TTL on these DNS records is typically 30–60 seconds — short enough for failover, long enough to reduce DNS query volume.
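The five resolution steps above can be sketched as a toy GeoDNS resolver: a prefix-to-region table maps the querying resolver's IP to a regional PoP address. All IPs and mappings here are illustrative:

```python
import ipaddress

POP_BY_REGION = {"IN": "104.18.1.1", "EU": "104.18.2.1", "JP": "104.18.3.1"}
RESOLVER_REGIONS = {                       # resolver prefix -> geolocation
    ipaddress.ip_network("49.44.0.0/16"): "IN",
    ipaddress.ip_network("85.0.0.0/8"):  "EU",
}
DEFAULT_POP = "104.18.9.1"                 # fallback for unknown resolvers

def resolve(name, resolver_ip, ttl=60):
    """Return (A record, TTL) for the PoP nearest the querying resolver.
    The short TTL keeps failover fast while bounding DNS query volume."""
    addr = ipaddress.ip_address(resolver_ip)
    for net, region in RESOLVER_REGIONS.items():
        if addr in net:
            return POP_BY_REGION[region], ttl
    return DEFAULT_POP, ttl
```

A Jio resolver (49.44.x.x) geolocates to India and gets the Mumbai PoP's address; an unknown resolver falls through to the default.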
Anycast Routing (Used by Cloudflare, Fastly)
With Anycast, the same IP address is announced from multiple PoPs via BGP. Internet routers naturally route packets to the topologically nearest announcement of that IP.
IP: 104.16.0.0/16 announced from:
- London PoP (AS13335, via Telia)
- Mumbai PoP (AS13335, via Tata)
- Tokyo PoP (AS13335, via KDDI)
User packet destined to 104.16.x.x:
BGP best-path selection → shortest AS path (among other tiebreaks) → nearest PoP wins
Anycast advantage over GeoDNS: No DNS TTL delay for failover. If a PoP goes down, BGP reconverges in seconds and traffic reroutes automatically.
BGP Basics (What You Need to Know)
Border Gateway Protocol is the routing protocol of the Internet — it determines how traffic flows between autonomous systems (ASes). Each CDN operates one or more ASes and peers with major ISPs and Internet Exchange Points (IXPs like DE-CIX, AMS-IX) to get traffic on-net quickly.
Key concepts:
- AS Path: sequence of ASes a route traverses (shorter = preferred)
- BGP Peering: two ASes exchange routes without payment (peers) vs. Transit: paying an upstream to carry your traffic
- IXP: physical facility where many networks interconnect cheaply — Cloudflare peers at 285+ IXPs
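The AS-path tiebreak can be shown in a few lines. Real BGP best-path selection compares local-pref, origin type, MED and more before AS-path length; this toy model keeps only the path-length step, with illustrative ASNs:

```python
# Two announcements of the same prefix, heard via different transit paths.
routes = [
    {"prefix": "104.16.0.0/16", "next_hop": "London", "as_path": [3257, 13335]},
    {"prefix": "104.16.0.0/16", "next_hop": "Mumbai", "as_path": [4755, 6453, 13335]},
]

def best_path(candidates):
    # Among routes for the same prefix, prefer the shortest AS path
    return min(candidates, key=lambda r: len(r["as_path"]))
```

Here `best_path(routes)` selects London (2 AS hops) over Mumbai (3), which is why peering widely at IXPs shortens paths and pulls traffic on-net.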
Cache-Control Headers and TTLs
The CDN honours HTTP cache semantics. Understanding these headers is essential for controlling what gets cached, for how long, and by whom.
HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable
ETag: "abc123def456"
Vary: Accept-Encoding
Key Directives
| Directive | Meaning |
|---|---|
| public | Response may be cached by CDN and shared caches |
| private | Only browser cache; CDN must not store |
| max-age=N | Cache for N seconds from response time |
| s-maxage=N | Shared-cache (CDN) override of max-age |
| no-cache | Must revalidate with origin before serving (but may store) |
| no-store | Must not cache at all |
| immutable | Content will never change; skip revalidation even on reload |
| stale-while-revalidate=N | Serve stale for N seconds while fetching fresh in background |
| stale-if-error=N | Serve stale for N seconds if origin returns 5xx |
Practical TTL Strategy
/static/app.abc123.js → max-age=31536000, immutable (content-hashed, forever)
/api/v1/products → s-maxage=60, stale-while-revalidate=30 (60s CDN cache)
/api/v1/user/profile → private, no-store (never cache at CDN)
/images/hero.jpg → s-maxage=86400 (1 day CDN, but purgeable)
The golden rule: Content-addressed assets (filename contains hash of content) should be cached forever. Mutable paths should have short TTLs or rely on explicit purging.
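The strategy above can be expressed as a first-match path-to-header table, the shape most CDN configs and origin middleware take. A dependency-free sketch using `fnmatch` globs; patterns and header values mirror the examples, and ordering matters (the more specific `/api/v1/user/*` rule must precede `/api/v1/*`):

```python
from fnmatch import fnmatch

CACHE_RULES = [  # (glob, Cache-Control header) -- first match wins
    ("/static/*",      "public, max-age=31536000, immutable"),
    ("/api/v1/user/*", "private, no-store"),
    ("/api/v1/*",      "public, s-maxage=60, stale-while-revalidate=30"),
    ("/images/*",      "public, s-maxage=86400"),
]

def cache_control_for(path, default="no-cache"):
    """Pick the Cache-Control header for a request path."""
    for pattern, header in CACHE_RULES:
        if fnmatch(path, pattern):
            return header
    return default
```

Anything unmatched falls back to a conservative `no-cache`, so a forgotten path is revalidated rather than cached forever.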
CDN Invalidation Strategies
When you deploy new content, stale copies at edge nodes must be evicted. You have three approaches:
1. URL-Based Purge (Surgical)
# Cloudflare API: purge a single URL (body must be JSON)
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"files":["https://example.com/api/v1/products"]}'
# Fastly API: purge a single URL (sub-150 ms global propagation)
curl -X POST "https://api.fastly.com/purge/example.com/api/v1/products" \
  -H "Fastly-Key: $TOKEN"
2. Tag/Surrogate-Key Based Purge (Scalable)
Origin tags responses with a header listing logical cache keys:
Surrogate-Key: product-42 category-electronics user-public
Cache-Tag: product-42 category-electronics
When product 42 is updated, a single API call purges every URL tagged product-42 — regardless of how many URLs reference it. Fastly and Cloudflare both support this. Akamai calls it “Fast Purge by tag.”
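The edge-side bookkeeping behind tag purging is essentially an inverted index from tag to URLs. A toy sketch of that mechanism, not any vendor's implementation:

```python
from collections import defaultdict

class TaggedCache:
    """Cache that records which surrogate keys each URL carries,
    so one tag purge evicts every URL under that tag."""

    def __init__(self):
        self.store = {}                     # url -> cached body
        self.urls_by_tag = defaultdict(set) # tag -> urls (inverted index)

    def put(self, url, body, tags):
        self.store[url] = body
        for tag in tags:                    # e.g. "product-42"
            self.urls_by_tag[tag].add(url)

    def purge_tag(self, tag):
        purged = self.urls_by_tag.pop(tag, set())
        for url in purged:
            self.store.pop(url, None)       # evict the cached copy
        return len(purged)
```

`purge_tag("product-42")` evicts every tagged URL in one call, however many pages embed that product.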
3. Cache-Busting via Versioned URLs (No Purge Needed)
<!-- Old approach: mutable URL, needs purging -->
<script src="/app.js"></script>
<!-- New approach: content-hashed URL, set max-age=forever -->
<script src="/app.abc123de.js"></script>
Build pipelines (Webpack, Vite) generate content-hashed filenames automatically. New deployment = new filename = the CDN naturally fetches a fresh copy on first request.
Trade-off: Requires serving a short-TTL (or no-cache) HTML file that references the hashed assets. The HTML is the only thing that needs purging on deploy.
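The hashing step a bundler performs can be sketched in a few lines: the filename embeds a digest of the file's bytes, so changed content yields a new URL and no purge is ever needed. `hashed_name` is an illustrative helper, not Webpack's or Vite's actual naming scheme:

```python
import hashlib
import pathlib

def hashed_name(filename, content: bytes, digits=8):
    """app.js + content -> app.<first 8 hex of sha256>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digits]
    p = pathlib.PurePosixPath(filename)
    return f"{p.stem}.{digest}{p.suffix}"
```

Identical bytes always map to the same name (safe to cache forever); any edit produces a different name, which the CDN has never seen and therefore pulls fresh.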
Push vs Pull CDNs
Pull CDN (Standard)
Edge nodes fetch from origin on cache miss, then cache the response. Origin doesn’t need to proactively push content.
First request to edge: MISS → fetch from origin → cache → serve (slow)
Subsequent requests: HIT → serve from cache (fast)
Used by: Cloudflare, Fastly, CloudFront, Akamai (for most use cases)
Best for: Dynamic or semi-dynamic content, sites with unpredictable access patterns.
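The MISS-then-HIT behaviour can be sketched as a minimal pull-through cache with a TTL. `PullCache` and its injectable clock are illustrative:

```python
import time

class PullCache:
    """Toy pull-through edge cache: MISS fetches origin and caches the
    body for `ttl` seconds; HIT serves locally without origin traffic."""

    def __init__(self, fetch_origin, ttl=60, now=time.monotonic):
        self.fetch_origin = fetch_origin
        self.ttl = ttl
        self.now = now                     # injectable clock for testing
        self.entries = {}                  # url -> (expires_at, body)

    def get(self, url):
        entry = self.entries.get(url)
        if entry and entry[0] > self.now():
            return "HIT", entry[1]         # fast path: cache is fresh
        body = self.fetch_origin(url)      # slow first (or expired) request
        self.entries[url] = (self.now() + self.ttl, body)
        return "MISS", body
```

The first request pays the origin round-trip; repeats within the TTL are HITs; after expiry the next request refetches, which is exactly the pattern the two lines above describe.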
Push CDN
You explicitly upload content to CDN storage. Edge serves directly from CDN object storage.
# Example: rclone push assets to CloudFront's S3 origin
rclone sync ./dist/ s3:my-bucket/assets/ --checksum
# CloudFront serves from S3 without ever hitting your app server
Used by: AWS CloudFront + S3, Cloudflare R2 + Pages
Best for: Large static assets (video, software downloads) where you control deployment timing. Video streaming (HLS/DASH segments pushed to CDN before user requests).
When NOT to Use a CDN
This is a critical interview point. CDNs are not universally beneficial:
- Highly personalized responses — if every API response is unique per user, cache hit rate approaches 0% and the CDN adds latency (an extra hop) with zero benefit.
- WebSocket / long-lived connections — CDNs add connection overhead. Cloudflare and AWS do proxy WebSockets, but you pay per message and add latency. Dedicated WebSocket infrastructure is often better.
- Private/sensitive data — financial transactions, medical records. Even with TLS termination at the edge, you’re routing sensitive data through a third party’s infrastructure. Consider regulatory requirements (GDPR, HIPAA).
- Low-latency write-heavy APIs — CDNs are caches (read-optimised). A trading platform processing order submissions doesn’t benefit from a CDN.
- Internal services — CDNs are designed for the public internet. Internal APIs behind a VPN have no use for a global CDN.
HTTP/2 and HTTP/3 (QUIC)
HTTP/2 Key Improvements
HTTP/1.1 is head-of-line blocked: one request at a time per TCP connection (without pipelining). Browsers open 6 parallel TCP connections per origin as a workaround.
HTTP/2 introduces:
- Multiplexing: multiple requests/responses over a single TCP connection
- Header Compression (HPACK): repetitive headers (Cookie, User-Agent) compressed, saving 50–90% header bandwidth
- Server Push: proactively send assets before client requests them (mostly abandoned in practice — poor cache interaction)
- Binary framing: more efficient parsing than text
HTTP/1.1: [request1] → wait → [response1] [request2] → wait → [response2]
HTTP/2: [req1][req2][req3] → [res2][res1][res3] (interleaved, no ordering)
HTTP/2 TCP head-of-line blocking: Even HTTP/2 can be blocked if a TCP packet is lost — all streams stall waiting for retransmission. This is solved by QUIC.
HTTP/3 / QUIC
QUIC replaces TCP with a UDP-based protocol that implements reliability at the QUIC layer, not the transport layer. Each stream is independently reliable — packet loss in stream 1 doesn’t block stream 2.
Key improvements over HTTP/2:
- 0-RTT connection resumption: known servers → 0 round trips before sending data
- Connection migration: change network (WiFi → cellular) without dropping connection — connection identified by Connection ID, not (IP, port) tuple
- Reduced handshake: TLS 1.3 integrated into QUIC handshake (1 RTT vs 2 RTT for TCP + TLS)
TCP + TLS 1.3:
Client → SYN (1 RTT)
Client → ClientHello, HTTP req (2 RTT)
Client receives data (3 RTT total before first byte)
QUIC + TLS 1.3:
Client → Initial + ClientHello + HTTP req (1 RTT)
Client receives data (1 RTT + data)
QUIC 0-RTT (resumed session):
Client → Initial + HTTP req (0 RTT — data sent immediately)
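The handshake diagrams reduce to simple RTT arithmetic: time to first byte is (handshake RTTs + one request/response RTT) times the RTT to the edge. A sketch with an illustrative 40 ms edge RTT:

```python
def ttfb_ms(rtt_ms, handshake_rtts):
    """Handshake round-trips plus 1 RTT for the request/response itself."""
    return (handshake_rtts + 1) * rtt_ms

RTT = 40  # illustrative client-to-edge RTT in ms
tcp_tls13 = ttfb_ms(RTT, 2)   # TCP (1 RTT) + TLS 1.3 (1 RTT)
quic_1rtt = ttfb_ms(RTT, 1)   # combined QUIC + TLS 1.3 handshake
quic_0rtt = ttfb_ms(RTT, 0)   # resumed 0-RTT session
```

At 40 ms to the edge, QUIC saves a full round-trip on cold connections and two on resumed ones, which is where the mobile/lossy-network gains come from.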
Real impact: Google reports 3–8% latency improvement for video streaming with QUIC. Cloudflare: 15% faster page loads for mobile users on lossy networks.
As of 2024, ~29% of web traffic uses HTTP/3 (Cloudflare data).
TLS Termination at the Edge
CDNs terminate TLS at the edge node, not at your origin. This means:
User ──[TLS]──► Edge Node ──[TLS or plain HTTP]──► Origin
Two TLS sessions:
1. User ↔ Edge (public internet, full TLS)
2. Edge ↔ Origin (private network, often TLS-optional but should be enforced)
Benefits:
- Expensive TLS handshake happens close to user (low RTT)
- CDN handles certificate renewal (Cloudflare Universal SSL, AWS ACM)
- Origin only needs to handle decrypted requests (CPU savings)
Security consideration: Traffic between edge and origin traverses the CDN’s internal network. For sensitive systems, always enforce TLS on the origin leg and authenticate the edge to your origin. Cloudflare’s Authenticated Origin Pulls uses a client certificate so your origin only accepts connections from Cloudflare.
TLS Session Resumption
TLS 1.3 supports session tickets: after the initial handshake, the server gives the client an encrypted ticket. On reconnect, the client presents the ticket — server decrypts it, session keys restored, skipping the full handshake.
CDNs with large global session ticket key infrastructure (Cloudflare distributes session ticket keys across all PoPs) allow users to resume TLS sessions even when routed to a different PoP.
The Full Request Flow: DNS to Bytes
Putting it all together — every step a browser takes to load https://example.com/app.js:
1. DNS Resolution
Browser → Local Resolver → CDN Authoritative DNS
CDN returns: 104.18.7.96 (nearest PoP IP)
Cost: ~5-50ms (cached resolver) or ~100ms (cold)
2. TCP Handshake (if HTTP/1.1 or HTTP/2)
SYN → SYN-ACK → ACK
Cost: 1 RTT to edge (e.g., 8ms Mumbai→Mumbai PoP)
3. TLS 1.3 Handshake
ClientHello → ServerHello + Cert + Finished → Client Finished
Cost: 1 RTT (TLS 1.3) to edge
4. HTTP Request
GET /app.js HTTP/2
Host: example.com
Cost: 1 RTT
5. CDN Cache Lookup
HIT: serve from memory/SSD (~0.1ms)
MISS: fetch from origin (~50-200ms depending on geography)
6. Response Transfer
Content-Length: 51200
Content-Encoding: br
Cost: depends on file size and bandwidth
Total for cache HIT: ~10–30 ms (near PoP) vs ~400 ms on a MISS to a distant origin
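Summing the step costs gives a back-of-the-envelope latency budget; the numbers below are the illustrative ones from this section, not measurements:

```python
def total_ms(dns, rtt_edge, cache_lookup, origin_fetch=0.0):
    """DNS + TCP (1 RTT) + TLS 1.3 (1 RTT) + request/response (1 RTT)
    + cache lookup + optional origin fetch on a MISS."""
    return dns + 3 * rtt_edge + cache_lookup + origin_fetch

# Warm resolver, 8 ms RTT to a nearby PoP:
hit  = total_ms(dns=5, rtt_edge=8, cache_lookup=0.1)                    # ~29 ms
miss = total_ms(dns=5, rtt_edge=8, cache_lookup=0.1, origin_fetch=200)  # ~229 ms
```

The model makes the lever obvious: on a HIT, only the three edge round-trips matter, so PoP proximity dominates; on a MISS, the origin fetch swamps everything else.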
Content Negotiation and Optimisation
Modern CDNs do more than cache — they transform content:
Compression
Accept-Encoding: br, gzip, deflate, zstd
CDN serves Brotli (br) if client supports it:
gzip: 50 KB → 15 KB (70% reduction)
brotli: 50 KB → 12 KB (76% reduction)
zstd: 50 KB → 11 KB (78% reduction, fastest decompression)
CDNs compress at edge and cache the compressed variants separately (keyed by Vary: Accept-Encoding).
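The size wins are easy to reproduce with stdlib codecs (Brotli and zstd need third-party modules, so only gzip and raw zlib are shown; exact ratios depend heavily on content):

```python
import gzip
import zlib

# Repetitive JSON-like payload, the kind of text that compresses well
payload = b'{"sku": 42, "name": "widget", "in_stock": true}\n' * 1000

gz = gzip.compress(payload, compresslevel=9)
zl = zlib.compress(payload, level=9)
print(len(payload), len(gz), len(zl))  # compressed sizes are far smaller
```

The edge caches each encoded variant separately (keyed by `Vary: Accept-Encoding`), so the compression cost is paid once per variant, not per request.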
Image Optimisation
/image.jpg → CDN detects Accept: image/webp → serves WebP variant
CDN detects DPR: 2 header → serves 2x image
CDN detects width from viewport → serves resized variant
Cloudflare Polish, Imgix, and Fastly IO do this at edge. Saves 30–60% on image payload.
CDN Metrics to Monitor
| Metric | Target | Description |
|---|---|---|
| Cache Hit Ratio | >90% | % of requests served from cache |
| Origin Error Rate | <0.1% | 5xx from origin seen by CDN |
| Edge Latency (p99) | <50ms | Time to first byte from edge |
| Bandwidth Saved | >80% | Origin bandwidth offset by CDN |
| Purge Propagation Time | <1s (Fastly) / <60s (CloudFront) | How long purge takes globally |
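The first two metrics derive directly from raw edge counters; a sketch of what a dashboard query computes (field names are made up):

```python
def cache_hit_ratio(hits, misses):
    """Fraction of requests served from cache (target: > 0.90)."""
    total = hits + misses
    return hits / total if total else 0.0

def origin_error_rate(origin_5xx, origin_requests):
    """Fraction of origin fetches returning 5xx (target: < 0.001)."""
    return origin_5xx / origin_requests if origin_requests else 0.0
```

Note the hit ratio is computed over edge requests while the error rate is computed over origin fetches only; mixing the denominators is a common dashboard bug.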
Interview Checklist
Fundamentals
- Explain the difference between GeoDNS and Anycast routing
- Describe the tiered caching (origin shield) pattern and why it exists
- What happens when a CDN cache misses? Trace the full request flow
- Explain Cache-Control: s-maxage vs max-age
- What is stale-while-revalidate and when would you use it?
Invalidation
- Compare URL purge vs surrogate-key (tag) purge — when does each scale better?
- Why is cache-busting via content-hashed URLs often better than purging?
- What is the problem with purging dynamic API responses at scale?
Protocol
- How does HTTP/2 multiplexing help? What is its remaining limitation?
- How does QUIC solve TCP head-of-line blocking?
- What happens during TLS 1.3 0-RTT and what are the security tradeoffs?
Architecture
- When would you NOT use a CDN? Give 3 concrete scenarios
- How does TLS termination at the CDN edge affect your security model?
- Design a caching strategy for a news site (mix of static assets, article pages, breaking news API)
- How would you serve 1 billion video streams globally? Walk through the CDN architecture
Senior-Level
- How does BGP Anycast enable sub-second failover for CDN PoPs?
- Design a CDN purge system that guarantees all 300+ PoPs are updated within 500 ms
- What are the compliance/regulatory risks of routing PII through a third-party CDN?