Design a URL Shortener (TinyURL)

1. What is a URL Shortener?

A URL Shortener is a service that creates shorter aliases for long URLs. When a user clicks these short aliases, they are redirected to the original long URL. This service is a staple of system design interviews because it touches on all the fundamentals: hashing, databases, caching, and concurrency.

Real-World Examples:

TinyURL: The classic example.
Bit.ly: Focuses on analytics and enterprise features.
t.co: Twitter’s internal shortener to save character count.

[!TIP] Try it yourself: Imagine you are sending an SMS. Would you rather type https://www.google.com/search?q=system+design+interview+guide... or https://tiny.url/x7z9? The latter is cleaner, cheaper (SMS cost), and easier to type.

2. Requirements & Goals

Functional Requirements

Shorten: Given a long URL, generate a unique short alias (e.g., http://tiny.url/j8s9).
Redirect: Given a short alias, redirect the user to the original long URL.
Custom Alias (Optional): Users should be able to specify a custom alias (e.g., tiny.url/my-blog).
Expiry: Links should expire after a default timespan (e.g., 5 years) to save space.

Non-Functional Requirements

Availability: The system must be highly available. If the service is down, all URL redirections fail (Global Impact).
Latency: URL redirection happens in real-time. Minimizing latency (< 100ms) is critical.
Read-Heavy: There will be many more redirections (reads) than new link creations (writes). A 100:1 ratio is common.

Extended Requirements

Analytics: Track click counts, location, and user agent.
REST API: Allow other services to generate links programmatically.

3. Capacity Estimation

Traffic Estimates

Let’s assume 100 Million new URLs are generated per month.

Write QPS: 100,000,000 / (30 × 24 × 3600) ≈ 40 writes/sec
Read QPS (100:1 ratio): 40 × 100 = 4,000 reads/sec
Peak Traffic: Assume 5x multiplier for spikes. Write: 200 QPS, Read: 20k QPS.

Storage Estimates

Let’s assume we store the data for 5 years.

Total URLs: 100M × 12 months × 5 years = 6 Billion records
Average URL size: 500 bytes.
Total Storage: 6 Billion × 500 bytes = 3 TB

Bandwidth Estimates

Ingress (Write): 40 writes/sec × 500 bytes ≈ 20 KB/s (Negligible).
Egress (Read): 4,000 reads/sec × 500 bytes ≈ 2 MB/s.
- Note: This is just for the payload. If we include 302 redirects, the browser handles the payload of the target site, but our server just sends headers.

[!WARNING] While 3TB fits on a single modern hard drive, we distribute it across multiple nodes for Throughput (IOPS) and Reliability (Replication), not just capacity.

4. System APIs

We will use REST-style APIs for simplicity and compatibility.

Create Short URL

POST /api/v1/shorten

Request Body:

{
  "longUrl": "https://www.very-long-website.com/article/123",
  "customAlias": "my-article",
  "expireDate": "2025-12-31"
}

Response:

200 OK: Success.
400 Bad Request: Invalid URL or Custom Alias already exists.
429 Too Many Requests: Rate limit exceeded.

{
  "shortUrl": "https://tiny.url/my-article",
  "error": null
}

Redirect

GET /{short_alias}

Response:

Status: 301 or 302 (See below).
Location: https://www.very-long-website.com/article/123
Cache-Control: public, max-age=3600 (If 301).

5. Database Design

We need to store 6 Billion records. The data is simple: ShortKey -> LongURL.

SQL Schema (MySQL/PostgreSQL)

CREATE TABLE urls (
    id BIGINT AUTO_INCREMENT PRIMARY KEY, -- 8 bytes, used for Base62
    short_key VARCHAR(7) NOT NULL,        -- 7 bytes (Indexed)
    long_url VARCHAR(2048) NOT NULL,      -- 2 KB max
    user_id BIGINT,                       -- Foreign Key
    created_at TIMESTAMP DEFAULT NOW(),
    expiration TIMESTAMP
);
CREATE INDEX idx_short_key ON urls(short_key);

NoSQL Schema (DynamoDB/Cassandra)

Since we don’t need complex joins, NoSQL is a great fit for horizontal scaling.

{
  "pk": "short_key:x7z9",  // Partition Key
  "long_url": "https://...",
  "created_at": 1672531200
}

Trade-off: SQL provides ACID guarantees (useful for uniqueness checks). NoSQL provides easier scaling. For this problem, both work well, but SQL is often preferred for the AUTO_INCREMENT feature which aids Base62 encoding.

6. High-Level Design

The architecture separates the Write Path (Shortening) from the Read Path (Redirecting) to optimize for the 100:1 read ratio.

System Architecture: URL Shortener

Write Path (Green) vs Read Path (Blue)

Shorten (Write)

Redirect (Read)

Client

LB Layer

Application Services

Data Persistence

Users

Load Balancer

Shortener Svc

• Generates ID
• Writes to DB

Redirect Svc

• Checks Cache
• Returns 302

KGS

Key Gen Service

Redis (Keys)

Pre-gen keys

Main DB

Mapping Table

Shard 1 Shard 2

Redis Cluster

LRU Cache for Hot URLs (20%)

Kafka: clicks_topic

Worker

Aggregates Data

7. Component Design: Generating Keys

This is the core problem. How do we generate a short, unique string like x7z9?

Approach A: Hashing (MD5/SHA256)

Hash the Long URL: MD5(long_url).

Result is 128-bit string (32 hex chars).
Truncate to 7 characters.
Problem: Collisions. Two different URLs might hash to the same first 7 characters.
Resolution: If collision, append a salt (e.g., current time) and re-hash. This requires checking the DB for every write, which is slow (Bloom Filters can help, but it’s still probabilistic).

Approach B: Base64 Encoding

Base64 uses A-Z, a-z, 0-9, +, /.

Problem: The characters + and / are reserved in URLs. They might be interpreted as spaces or directory separators.
Resolution: Use “Base64URL” variant (- and _ instead). However, Base62 is standard for this use case.

Approach C: Base62 Conversion (The Winner)

We use a Counter (Database ID or Distributed ID) and convert it to Base62. This is a Bijective (one-to-one) mapping, guaranteeing no collisions.

Base62 Characters: 0-9 (10) + a-z (26) + A-Z (26) = 62.
Why 7 characters? 62⁷ ≈ 3.5 Trillion combinations. Enough for 100 years.

Algorithm:

Get Unique ID: 123456789
Convert to Base62: 8M0kX
Short URL: tiny.url/8M0kX

Key Generation Service (KGS)

To scale, we don’t want the App Server to hit the Database for a new ID every time.

KGS: A dedicated service (see “KGS” in the diagram) that pre-generates keys (e.g., abc1, abc2) and stores them in a Redis Set (labeled “Redis (Keys)”).
Concurrency: App servers POP a key from Redis. This is atomic and fast.
Two Tables:
1. unused_keys: Keys ready to be assigned.
2. used_keys: Keys already assigned.
Trade-off: If the KGS crashes, we lose the pre-generated keys in memory. This is acceptable as we have trillions of keys available.

8. Data Partitioning & Sharding

To handle 6 Billion records and 4000 reads/sec (per node), we must shard the database.

Strategy 1: Range-Based Sharding

Method: Partition by the first character of the Short Key.
- Shard A: Keys starting with A-C
- Shard B: Keys starting with D-F
Pros: Simple to implement.
Cons: Unbalanced Load. If many URLs start with ‘A’, Shard A becomes a hotspot.

Strategy 2: Hash-Based Sharding (Consistent Hashing)

Method: ShardID = hash(short_key) % NumShards.
Pros: Even Distribution. Traffic is spread uniformly across all nodes.
Cons: Re-sharding (adding nodes) is complex, requiring data migration. We use Consistent Hashing to minimize movement.

Verdict: Use Hash-Based Sharding on the short_key.

9. Reliability, Caching, & Load Balancing

301 vs 302 Redirects

301 Moved Permanently: Browser caches the redirect. Subsequent requests bypass our server. Fast, but we lose analytics.
302 Found: Browser hits our server every time. Slower, but allows accurate tracking.
Choice: Use 302 if analytics are required.

Caching Strategy

Cache-Aside: App checks Redis -> Miss -> DB -> Write to Redis.
Eviction: Use LRU (Least Recently Used).
Pareto Principle: 20% of URLs generate 80% of traffic. Caching just the top 20% fits easily in memory.

Reliability

Replication: Use Master-Slave replication for the DB. If Master fails, promote Slave.
Geo-DNS: Use a CDN or Geo-DNS to route users to the nearest data center.

10. Interactive Decision Visualizer: Hash vs Base62

Explore why we prefer Base62 over simple Hashing for this specific problem.

**The Gold Standard**: Convert a unique Integer ID (from Database or Snowflake) into a Base62 string.
Bijective Zero Collisions

Database ID (Auto-Increment):

ID: 123456789

→

Base62 Algo

Divide by 62

→

Key: 8M0kX

11. System Walkthrough (Dry Run)

Let’s trace a request through the system.

Scenario: User creates a short link

1. Request User sends POST /shorten to the Load Balancer.

{
  "longUrl": "https://www.google.com"
}

2. Key Generation The Shortener Service contacts Redis (KGS) to get a unique key.

Redis Command: SPOP unused_keys 1
Result: "8M0kX"

3. Persistence The service writes to the Primary DB.

INSERT INTO urls (short_key, long_url) VALUES ('8M0kX', 'https://www.google.com');

4. Response Service returns the short URL.

{
  "shortUrl": "https://tiny.url/8M0kX"
}

Scenario: User clicks a short link

1. Request User visits https://tiny.url/8M0kX. Request hits Redirect Service.

2. Cache Check Service checks Redis Cache.

Redis Command: GET url:8M0kX
Result (Hit): "https://www.google.com"
Result (Miss): nil -> Read from DB -> Write to Redis.

3. Analytics (Async) Service pushes an event to Kafka.

{
  "topic": "clicks_topic",
  "key": "8M0kX",
  "payload": { "ts": 1672531200, "ua": "Mozilla/5.0", "ip": "1.2.3.4" }
}

4. Redirect Service returns HTTP 302.

Header: Location: https://www.google.com

12. Interview Gauntlet

Be ready for these follow-up questions:

What if two users shorten the same long URL?
- Ans: You can return the same short key (deduplication) to save space, or generate a new one. Deduplication requires checking the DB first (slow), so most systems just generate a new key.
How do you prevent users from guessing IDs?
- Ans: Base62 is sequential. Add a random salt before encoding, or shuffle the alphabet.
What if the Redis KGS runs out of keys?
- Ans: A background worker monitors the size of the unused_keys set. When it drops below 10k, it fetches another batch from the KGS database.
How do you handle DB sharding?
- Ans: Shard by short_key (hash-based). This ensures evenly distributed traffic.
How do you implement custom aliases?
- Ans: Check if the alias exists in the DB. If not, insert it. This requires a unique constraint on the short_key column.
What happens if the cache is full?
- Ans: Use an LRU (Least Recently Used) eviction policy.
How do you filter malicious URLs?
- Ans: Run the long URL against a blacklist (e.g., Google Safe Browsing API) before shortening.
Can we use Zookeeper for ID generation?
- Ans: Yes, ZK can manage ranges (e.g., Server A gets 1-1000, Server B gets 1001-2000).
Why not use UUIDs?
- Ans: UUIDs are 36 characters. Base62 of a UUID is still too long (~22 chars). We want short URLs (7 chars).
How to handle heavy analytics writes?
- Ans: Never write to DB on the read path. Use Kafka to buffer events and write in batches to a Columnar DB (ClickHouse).

13. Whiteboard Summary

1. Core Requirements

Shorten (Write): 40 QPS
Redirect (Read): 4k QPS (100:1)
Unique 7-char Key (Base62)
301 vs 302 Redirect

2. Key Architecture

Write Path: KGS -> Redis -> DB
Read Path: Cache -> DB
Async: Kafka for Analytics
Sharding: Hash(short_key)

3. Data Models

URLs Table: `id`, `short_key`, `long_url`
Redis: `unused_keys` (Set), `url:{key}` (String)
Analytics: Columnar DB (ClickHouse)

4. Bottlenecks & Fixes

Collision: Use Pre-generated Keys (KGS).
Latency: Cache 20% hot URLs (Pareto).
Analytics Lag: Use Kafka (Async).
DB Space: TTL or cleanup job.