Design Dropbox / Google Drive
[!NOTE] This module explores the core principles of Design Dropbox / Google Drive, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. What is Dropbox?
Dropbox is a cloud-based file hosting service that offers file synchronization, personal cloud, and client software. It allows users to create a special folder on their computers, which Dropbox synchronizes so that it appears to be the same folder (with the same contents) regardless of which device is used to view it.
Why is this hard?
- Scale: Billions of files. Exabytes of data.
- Consistency: If I edit a file on my laptop, my phone must see the change immediately.
- Bandwidth: Uploading a 1GB file every time I change one character is wasteful.
- Reliability: Data loss is unacceptable (11 9s durability).
[!TIP] In an interview, clarify if you are designing for Collaborative Editing (Google Docs - OT/CRDT) or File Storage (Dropbox - Whole file locking/versions). We are focusing on File Storage.
Try it yourself: Open Dropbox or Google Drive on two devices. Upload a file on one. Watch how fast it appears on the other. Now disconnect the internet on one device, edit the file on both devices, and reconnect. The resulting "conflicted copy" demonstrates conflict resolution in action.
2. Requirements & Goals
Functional Requirements
- Add/Delete/Update File: Users can perform standard file operations.
- Sync: Changes propagate to all devices automatically.
- Versioning: Users can restore previous versions of a file.
- Sharing: Users can share files with permissions.
Non-Functional Requirements
- Reliability: Data must never be lost. We need Strong Consistency for metadata.
- Availability: The service should always be up, but when forced to choose, Consistency > Availability (ACID for metadata).
- Bandwidth Efficiency: Minimize network usage for large files using Block-Level Deduplication.
3. Capacity Estimation
Let’s assume 500 Million Users and 100 Million DAU.
Storage
- Avg user has 10 GB of files.
- Total Storage = 500M × 10 GB = 5 Exabytes (EB).
- We need cold storage strategies (e.g., AWS Glacier) for old versions.
Throughput
- Assume 1 file change per active user per day.
- QPS = 100M / 86400 ≈ 1,200 QPS (Write).
- Read QPS might be equal or slightly higher (syncing to multiple devices).
- Peak QPS: ~2,500 QPS.
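These estimates are easy to sanity-check in code. A quick back-of-envelope script using the numbers above (the ~2x peak-to-average factor is my assumption; the text rounds to ~1,200 and ~2,500):

```python
# Back-of-envelope capacity check for the numbers used in this section.
USERS = 500_000_000          # total users
DAU = 100_000_000            # daily active users
AVG_STORAGE_GB = 10          # average data per user

total_storage_eb = USERS * AVG_STORAGE_GB / 1e9   # GB -> EB (10^9 GB per EB)
write_qps = DAU * 1 / 86_400                      # 1 change per active user per day
peak_qps = write_qps * 2                          # assumed ~2x daily average at peak

print(f"Total storage: {total_storage_eb:.0f} EB")   # 5 EB
print(f"Average write QPS: {write_qps:.0f}")         # ~1157 (rounds to ~1,200)
print(f"Peak write QPS: {peak_qps:.0f}")             # ~2315 (rounds to ~2,500)
```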
4. System APIs
We need a RESTful API for the client. The client is “thick” (smart), handling chunking logic.
4.1 Upload Flow (Presigned URL)
The client first calculates hashes of all chunks (e.g., 4MB each) and asks the server which ones are new.
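The client-side hashing step can be sketched in a few lines (`chunk_hashes` is an illustrative helper name; the 10MB input matches the `size_bytes` in the example below):

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, as used throughout this design

def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split a file into fixed-size chunks and return the SHA-256 of each."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]

# A 10MB file yields three chunks: 4MB + 4MB + 2MB.
file_bytes = b"\x00" * (10 * 1024 * 1024)
hashes = chunk_hashes(file_bytes)
print(len(hashes))              # 3
print(hashes[0] == hashes[1])   # True: identical chunks hash identically (dedup!)
```

Because identical chunks produce identical hashes, the server can recognize content it already stores before a single byte is uploaded.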
Step 1: Initiate Upload & Deduplication Check
```
POST /files/upload_init
Authorization: Bearer <token>

{
  "filename": "video.mp4",
  "path": "/photos/vacation",
  "size_bytes": 10485760,
  "chunk_hashes": ["hashA", "hashB", "hashC"]
}
```
Response:
```
{
  "upload_id": "up_12345",
  "missing_chunks": ["hashC"],  // Server already has A and B!
  "presigned_urls": {
    "hashC": "https://s3.aws.com/bucket/hashC?signature=..."
  }
}
```
Step 2: Upload Raw Chunk (Direct to S3)
```
PUT https://s3.aws.com/bucket/hashC?signature=...
Body: <binary_data_of_chunk_C>
```
Step 3: Commit File Metadata
```
POST /files/commit
Authorization: Bearer <token>

{
  "upload_id": "up_12345",
  "chunk_list": ["hashA", "hashB", "hashC"]
}
```
4.2 Download Flow
```
GET /files/download?file_id=file_987
Authorization: Bearer <token>

// Response: 302 Redirect to Presigned S3 URL
```
4.3 Sync Flow (Long Polling)
The client waits for changes.
```
POST /sync/poll
Authorization: Bearer <token>

{
  "cursor_id": 5021  // Last seen change ID
}
```
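The client's side of this loop can be sketched with the transport stubbed out (`poll` stands in for the `POST /sync/poll` call; all names here are illustrative):

```python
def sync_loop(poll, apply_change, cursor, max_rounds=10):
    """Long-poll for changes, applying each batch and advancing the cursor.

    `poll(cursor)` stands in for POST /sync/poll: it blocks until there are
    changes newer than `cursor`, then returns (changes, new_cursor).
    """
    for _ in range(max_rounds):  # a real client loops forever
        changes, cursor = poll(cursor)
        for change in changes:
            apply_change(change)  # e.g. fetch new metadata + missing chunks
    return cursor

# Stubbed server: two batches of changes arriving after cursor 5021.
batches = iter([([{"file_id": "f1"}], 5022),
                ([{"file_id": "f2"}], 5023)])
applied = []
final = sync_loop(lambda c: next(batches), applied.append, 5021, max_rounds=2)
print(final, [c["file_id"] for c in applied])  # 5023 ['f1', 'f2']
```

Persisting the cursor means a client that crashes or goes offline simply resumes from its last seen change ID.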
5. Database Design
We separate Metadata from Block Data.
Metadata DB (SQL/NewSQL)
Stores file hierarchy, permissions, and mapping to blocks (visualized as Metadata DB). We need ACID transactions.
- Table `Files`: `file_id` (PK), `user_id`, `latest_version`, `is_directory`, `created_at`.
- Table `Versions`: `version_id` (PK), `file_id`, `block_list` (array of hashes), `size`.
- Table `Blocks`: `hash` (PK), `s3_url`, `ref_count` (for garbage collection).
Block Storage (Object Store)
Stores the actual raw data chunks (visualized as Block Storage / S3).
- S3 / Azure Blob: Immutable. Key is the SHA-256 hash of the content.
- Why Separation?: Metadata requires fast updates and complex queries (SQL). Block data is massive and write-once-read-many (Blob Store).
6. High-Level Design
File Sync Architecture. Separating Metadata from Block Data.
6.1 System Walkthrough (Happy Path)
Let’s trace the lifecycle of a file upload:
- Chunking: The Client splits the file (e.g., `video.mp4`) into 4MB chunks and calculates the SHA-256 hash for each chunk.
- Deduplication Check: The Client sends the list of hashes to the Metadata Service.
- Server Response: The server checks the Metadata DB and responds: “I already have chunk A and B. Only upload chunk C.”
- Upload: The Client uploads the raw bytes of chunk C directly to the Block Service (S3).
- Commit: Once the upload is confirmed, the Client sends a `commit` request to the Metadata Service, linking the file `video.mp4` to the list of hashes `[A, B, C]`.
- Notification: The Metadata Service commits the transaction and pushes a “New Version” event to the Notification Service.
- Sync: Other active devices (listening via Long Polling/WebSocket) receive the event and begin downloading the new metadata and missing chunks.
7. Component Design (Deep Dive)
Chunking & Block-Level Deduplication
Instead of uploading the whole file, we split it into 4MB Chunks.
- Calculate SHA-256 of each chunk.
- Check Existence: Client sends list of hashes to server.
- Deduplication: Server responds: “I already have Hash A and B (from other users!). Only upload Hash C.”
- Benefit: Saves bandwidth and storage. If 100 people upload the same movie, we store it once.
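The server side of this check can be sketched as an in-memory content-addressed store with a `ref_count`, mirroring the `Blocks` table (class and method names are illustrative):

```python
import hashlib

class BlockStore:
    """Content-addressed store: one copy per unique chunk, plus a ref count."""
    def __init__(self):
        self.blocks = {}      # hash -> bytes (stands in for S3)
        self.ref_count = {}   # hash -> int  (for garbage collection)

    def missing(self, hashes):
        """Deduplication check: which of these chunks must the client upload?"""
        return [h for h in hashes if h not in self.blocks]

    def put(self, chunk: bytes) -> str:
        h = hashlib.sha256(chunk).hexdigest()
        self.blocks[h] = chunk
        return h

    def commit(self, hashes):
        """Link a file version to its chunks, bumping ref counts."""
        for h in hashes:
            self.ref_count[h] = self.ref_count.get(h, 0) + 1

store = BlockStore()
movie = [b"chunk-A", b"chunk-B", b"chunk-C"]
hashes = [hashlib.sha256(c).hexdigest() for c in movie]

missing_first = store.missing(hashes)   # first uploader must send all three
for c in movie:
    store.put(c)
store.commit(hashes)

missing_second = store.missing(hashes)  # second uploader of the same movie: []
store.commit(hashes)
print(len(store.blocks), store.ref_count[hashes[0]])  # 3 2
```

Two uploads of the same movie consume storage once; the `ref_count` of 2 prevents garbage collection from deleting chunks while any file version still references them.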
Delta Sync & Rolling Hash
What if I insert one character at the beginning of a file?
- Fixed-Size Chunking (The Problem): Every 4MB chunk shifts by 1 byte. Even though the content is 99.9% identical, ALL hashes change because the boundaries have shifted. This breaks deduplication and forces a full re-upload.
- Rolling Hash (Rabin-Karp): Instead of splitting by byte offset (0, 4MB, 8MB), we use a Sliding Window algorithm to calculate a checksum for every byte window (e.g., 32 bytes).
- The Rule: If `Checksum % M == 0`, create a new chunk boundary.
- The Result: Boundaries are determined by the content itself. If you insert a word at the start, only the first chunk changes. The subsequent chunks retain their original boundaries and hashes, allowing 99% deduplication.
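A toy version of content-defined chunking can be sketched end to end. Parameters are shrunk so the effect is visible in a few kilobytes (16-byte window, ~64-byte average chunks instead of ~4MB); `cdc_chunks` and the hash constants are illustrative, and real systems also enforce min/max chunk bounds:

```python
import random

def cdc_chunks(data: bytes, window=16, mask=0x3F, min_len=32):
    """Cut a chunk wherever the rolling hash of the last `window` bytes
    has its low 6 bits zero (hash % 64 == 0 -> ~64-byte average chunks)."""
    base, mod = 257, 1 << 32
    pw = pow(base, window, mod)
    h, start, chunks = 0, 0, []
    for i, byte in enumerate(data):
        h = (h * base + byte) % mod                 # slide the new byte in
        if i >= window:
            h = (h - data[i - window] * pw) % mod   # slide the old byte out
        if i + 1 - start >= min_len and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                 # trailing partial chunk
    return chunks

def fixed_chunks(data: bytes, size=64):
    """Baseline: split purely by byte offset."""
    return [data[i:i + size] for i in range(0, len(data), size)]

random.seed(42)
original = bytes(random.getrandbits(8) for _ in range(4000))
edited = b"X" + original            # insert one byte at the very start

cdc_old, cdc_new = cdc_chunks(original), cdc_chunks(edited)
# Fixed-size: every boundary shifts by one byte, so (almost) nothing matches.
print(len(set(fixed_chunks(original)) & set(fixed_chunks(edited))), "fixed chunks reused")
# Content-defined: only the chunk(s) around the edit change.
print(len(set(cdc_old) & set(cdc_new)), "of", len(cdc_old), "CDC chunks reused")
```

Because the rolling hash depends only on the last 16 bytes of content, the boundaries "resynchronize" right after the insertion point, which is exactly why only the first chunk needs re-uploading.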
Conflict Resolution Strategy
When two users edit the same file offline and come online:
- Last Write Wins (LWW): Simple but dangerous. The server keeps the version with the latest timestamp. User A’s work is overwritten/lost.
- Conflicted Copy (Dropbox Approach): The server detects that the `base_version` of User B’s upload is older than the `current_version` in the DB.
  - Instead of overwriting, the server accepts User B’s file but renames it: `Resume (Conflicted Copy 2023-10-27).docx`.
  - Both versions are kept. The user must manually merge them. This preserves data integrity (Zero Data Loss).
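The version check behind the Conflicted Copy approach can be sketched as follows (the in-memory `files` dict stands in for the Metadata DB; the rename format mimics Dropbox's):

```python
import datetime

files = {}  # name -> {"version": int, "content": bytes}

def commit(name: str, base_version: int, content: bytes) -> str:
    """Accept an upload; fork a conflicted copy if the base version is stale."""
    current = files.get(name)
    if current is not None and base_version < current["version"]:
        # Stale base: keep BOTH versions instead of overwriting (zero data loss).
        stem, dot, ext = name.rpartition(".")
        today = datetime.date.today().isoformat()
        name = (f"{stem} (Conflicted Copy {today}){dot}{ext}" if dot
                else f"{name} (Conflicted Copy {today})")
        files[name] = {"version": 1, "content": content}
        return name
    files[name] = {"version": base_version + 1, "content": content}
    return name

# Users A and B both sync version 1 of the file, then edit offline.
commit("Resume.docx", 0, b"v1")                       # initial upload -> version 1
commit("Resume.docx", 1, b"user A's edits")           # A syncs first  -> version 2
forked = commit("Resume.docx", 1, b"user B's edits")  # B's base (1) is now stale
print(forked)       # e.g. Resume (Conflicted Copy 2023-10-27).docx
print(len(files))   # 2 -> both versions survive
```

Note that the server never inspects file contents; the decision is made purely on version numbers in metadata, which keeps the check cheap.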
Metadata Sharding Strategy
The Metadata DB grows to billions of rows. We must shard.
- Shard by file_id: Good distribution, but listing a folder requires querying ALL shards (Scatter-Gather). Slow for “List Folder”.
- Shard by user_id: All files for `User A` are on Shard 1.
  - Pros: Fast “List Folder”. ACID transactions within a user account are easy.
  - Cons: Hot Shard problem (if one user has 10M files). Sharing files across users (Cross-Shard) is complex.
- Solution: We typically shard by `user_id` but use a separate “Sharing Service” to handle cross-user pointers.
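Routing by `user_id` can be sketched with simple hash-mod routing (a real system would use consistent hashing or a lookup table so shards can be rebalanced; all names here are illustrative):

```python
import hashlib

NUM_SHARDS = 16
shards = [dict() for _ in range(NUM_SHARDS)]  # shard -> {user_id: [file paths]}

def shard_for(user_id: str) -> int:
    """All of a user's metadata lives on one shard."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def add_file(user_id: str, path: str):
    shards[shard_for(user_id)].setdefault(user_id, []).append(path)

def list_files(user_id: str):
    # Touches exactly ONE shard -- no scatter-gather across all 16.
    return shards[shard_for(user_id)].get(user_id, [])

add_file("alice", "/photos/a.jpg")
add_file("alice", "/photos/b.jpg")
add_file("bob", "/docs/cv.pdf")
print(list_files("alice"))  # ['/photos/a.jpg', '/photos/b.jpg']
```

Contrast this with sharding by `file_id`, where the same listing would have to fan out to every shard and merge the results.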
Namespace Management (Flattening)
How do we store the file structure /User/Documents/Project/File.txt?
- Naive Approach: Store absolute paths as strings.
  - Problem: Renaming the folder `Documents` to `Docs` requires updating strings for every single file inside (millions). This is O(N) write amplification.
- Optimized Approach (Flattening): Store files by `file_id` and `parent_id` (Adjacency List).
- Renaming: Just update the `name` of the folder row. The children point to the folder’s `id`, which never changes. O(1) operation.
```sql
CREATE TABLE file_metadata (
  id        BIGINT PRIMARY KEY,
  parent_id BIGINT,          -- Points to the parent folder's ID
  name      VARCHAR(255),
  is_folder BOOLEAN,
  -- Renaming 'Documents' only updates THIS single row.
  -- All children still point to this ID.
  INDEX(parent_id)
);
```
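The same adjacency-list idea in a few lines of Python, assuming the structure of the `file_metadata` table above (the `nodes` dict stands in for the table):

```python
# Adjacency list: each row stores its own name plus its parent's id.
nodes = {
    1: {"name": "User",      "parent_id": None},
    2: {"name": "Documents", "parent_id": 1},
    3: {"name": "File.txt",  "parent_id": 2},
}

def full_path(node_id: int) -> str:
    """Resolve a path by walking parent pointers (paths are derived, not stored)."""
    parts = []
    while node_id is not None:
        parts.append(nodes[node_id]["name"])
        node_id = nodes[node_id]["parent_id"]
    return "/" + "/".join(reversed(parts))

print(full_path(3))          # /User/Documents/File.txt

# Renaming the folder touches ONE row; the child row is untouched.
nodes[2]["name"] = "Docs"
print(full_path(3))          # /User/Docs/File.txt
```

The trade-off is that resolving a full path now costs O(depth) parent lookups, which is cheap and cacheable compared to O(N) rename write amplification.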
Metadata ACID Properties vs Block Storage
We have a hybrid consistency model:
- Metadata (ACID): The file hierarchy must be strongly consistent. If I move a file, it must disappear from the old folder immediately. We use MySQL or Spanner.
- Block Storage (Eventual): The actual bytes in S3 are immutable. We rely on the Metadata DB as the source of truth. If the metadata says “File A is at Hash X”, the client fetches Hash X.
- We cannot use eventually consistent NoSQL (like Cassandra) for the file hierarchy, or a user might upload a file and not see it immediately.
8. Requirements Traceability
| Requirement | Design Decision | Justification |
|---|---|---|
| Consistency | ACID (SQL) for Metadata | File hierarchy must be strongly consistent to prevent sync conflicts. |
| Bandwidth | Block-Level Deduplication | Saves 40-60% of bandwidth by not re-uploading duplicate content. |
| Reliability | S3 (11 9s) | Object stores are designed for massive durability. |
| Sync Speed | Long Polling / WebSocket | Push model is faster than Poll, ensuring near real-time updates. |
| Renaming | Namespace Flattening | Allows O(1) folder renaming by updating parent reference only. |
9. Observability & Metrics
We use the RED Method (Rate, Errors, Duration) to monitor health.
Key Metrics
- Upload Success Rate: (Successful Uploads / Total Attempts). Alert if < 99.9%.
- Sync Latency: Time from “Client A Commit” to “Client B Notification”.
- Deduplication Ratio: (Bytes Saved / Total Bytes). Measures algorithm efficiency.
- S3 Throughput: Monitor ingress/egress to predict costs.
Distributed Tracing
Each file operation generates a TraceID.
- Span 1: Client Upload (Chunk 1)
- Span 2: Metadata Service (Commit)
- Span 3: Notification Service (Fanout)
- We can visualize this in Jaeger/Zipkin to find bottlenecks.
10. Deployment Strategy
Ring Deployment
We cannot risk breaking sync for 500M users.
- Ring 0 (Internal): Dropbox Employees.
- Ring 1 (Canary): 1% of users (Early Adopters).
- Ring 2 (Global): 100% Rollout.
Client Updates
The desktop client is harder to update than the server.
- Auto-Update: Background process checks for new versions.
- Backward Compatibility: The server must support older client versions (API Versioning).
11. Interview Gauntlet
Rapid Fire Questions
- Why not use a NoSQL DB for metadata? NoSQL is often eventually consistent. We need ACID to ensure file moves/renames are atomic and visible instantly to prevent sync conflicts.
- How do you handle two users editing the same file offline? Conflict Resolution. Usually “Last Write Wins” or creating a “Conflicted Copy” (both files kept) for the user to merge manually.
- What happens if the client crashes during upload? The client should retry. Since we use unique Hash IDs for chunks, re-uploading the same chunk is idempotent.
- Why use Long Polling instead of WebSockets? WebSockets require maintaining stateful connections, which is expensive for millions of idle clients. Long Polling is stateless and easier to scale for notifications.
- How does ‘Delta Sync’ differ from ‘Deduplication’? Deduplication works on the whole block level across users. Delta Sync works on the byte level for a single file using Rolling Hashes.
12. Interactive Decision Visualizer: Rolling Hash & Deduplication
This simulator demonstrates the difference between Fixed-Size Chunking and Rolling Hash (Variable-Size).
- Sync: Upload the initial text.
- Edit: Insert a character at the start (e.g., “A”).
- Compare:
- Fixed: Every chunk boundary shifts. Hashes change. Zero Deduplication.
- Rolling: Only the first chunk changes. The rest match. High Deduplication.
[!TIP] Try it yourself: Click “Sync Initial”. Then type “X ” at the very start of the text box. Click “Sync Edit”. Watch the Rolling Hash catch up!
Fixed-Size Chunking (Every 5 chars)
Rolling Hash (Space Boundary)
13. Summary (Whiteboard)
Scale & Capacity
- 5 Exabytes Storage (Cold Storage Critical)
- Write-Heavy (Sync frequent small changes)
- Metadata: ACID is Mandatory
- Block Data: Immutable, Content-Addressed
Core Algorithms
- Chunking: 4MB blocks (Deduplication)
- Rabin-Karp: Rolling Hash (Delta Sync)
- Flattening: O(1) Folder Rename
- Merkle Tree: Fast Sync Verification
Trade-offs
- Consistency > Availability for Metadata
- Sync Speed vs Battery (Long Poll vs WS)
- Storage Cost vs Latency (Tiered Storage)
Bottlenecks
- Metadata DB (Sharding is hard)
- Client Bandwidth (Upload speed)
- Thundering Herd on Sync Notification