Welcome to the System Design course at A Dev Writes. This is a rigorous, Staff-level curriculum designed to teach you how to architect scalable, reliable distributed systems.

Unlike typical interview prep guides, this course emphasizes mental models and architectural reasoning—the ability to spot hidden couplings, reason about failure modes, and make principled trade-offs under ambiguity.


Module 0: Orientation & Fundamentals

Before diving into technical components, we build the conceptual foundation. Learn to think in terms of trade-offs, not absolutes.

Topics Covered:

  • The Architect’s Mindset & The “Crux” of a Design
  • System Qualities: Availability, Resiliency, Scalability
  • Percentiles (p50/p99), SLOs, and Error Budgets
  • Capacity Planning & Little’s Law
  • Operational Design: Blast Radius, Fate Sharing, Coupling
  • Reliability Primitives: Idempotency, Retries, Circuit Breaking
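
As a small taste of the capacity-planning topic above, Little's Law relates the average number of in-flight requests (L) to the arrival rate (λ) and the average time each request spends in the system (W): L = λ × W. The sketch below is illustrative only. The function names, the per-server concurrency figure, and the utilization target are assumptions chosen for the example, not values taken from the course.

```python
import math


def in_flight_requests(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Little's Law: L = lambda * W (average concurrency in a stable system)."""
    return arrival_rate_per_s * mean_latency_s


def servers_needed(
    arrival_rate_per_s: float,
    mean_latency_s: float,
    per_server_concurrency: int,     # hypothetical: concurrent requests one server handles
    target_utilization: float = 0.5  # hypothetical headroom target
) -> int:
    """Estimate server count so average concurrency stays under the utilization target."""
    concurrency = in_flight_requests(arrival_rate_per_s, mean_latency_s)
    usable_per_server = per_server_concurrency * target_utilization
    return math.ceil(concurrency / usable_per_server)


if __name__ == "__main__":
    # 400 requests/s at 250 ms mean latency means 100 requests in flight on average.
    print(in_flight_requests(400, 0.25))
    print(servers_needed(400, 0.25, per_server_concurrency=16))
```

Note that Little's Law describes long-run averages for a stable system, so an estimate like this is a starting point and says nothing about tail latency or bursts.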

👉 Start Module 0: Orientation & Fundamentals


Module 1: Scalability & Traffic Management

Learn how to distribute and manage traffic across hundreds of servers.

Topics Covered:

  • Load Balancing: L4 vs L7, DSR, Anycast
  • Consistent Hashing: Maglev, Rendezvous, The Ring
  • Rate Limiting & Backpressure: Token bucket, adaptive limits
  • Service Discovery: xDS API, Client-side vs Server-side LB
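
To make the token bucket topic above concrete, here is a minimal single-process sketch. It assumes a monotonic clock and no concurrency (there is no locking). The class name, parameters, and injectable `clock` argument are hypothetical choices for illustration, not an API the course prescribes.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock            # injectable so tests can control time
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if available, else return False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily based on elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2.0, capacity=3.0)
    print([bucket.allow() for _ in range(5)])  # burst of 3 passes, then rejections
```

Refilling lazily on each call avoids a background timer. A distributed limiter would need shared state, which this sketch does not attempt.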

👉 Start Module 1: Scalability & Traffic Management


Module 2: Networking, Protocols & Failure Modes

Understand how communication protocols behave at scale, and how networks fail.

Topics Covered:

  • Modern Networking: TCP Tuning, BBR, QUIC/HTTP3
  • Protocol Complexities: HTTP/2, gRPC, WebSockets under LB
  • Network Partitions & Split-Brain: Quorum systems, fencing

👉 Start Module 2: Networking, Protocols & Failure Modes


Module 3: Caching & Delivery

Learn how to reduce latency and handle traffic spikes through intelligent caching strategies.

Topics Covered:

  • Caching Strategies: Cache-Aside, Write-Through
  • The Thundering Herd Problem
  • Bloom Filters & Probabilistic Data Structures
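
As an illustration of the Bloom filter topic above, the sketch below shows the core property: a lookup can return a false positive but never a false negative. It assumes string keys and derives its bit positions from a single SHA-256 digest via double hashing. The class and method names are hypothetical, and the sizing in the usage example is arbitrary.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Fixed-size Bloom filter over string keys."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Double hashing: combine two 64-bit values from one digest
        # to simulate `num_hashes` hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter(num_bits=1024, num_hashes=4)
    seen.add("user:42")
    print(seen.might_contain("user:42"))  # always True once added
```

In a caching layer, a filter like this can sit in front of a store so that lookups for keys that were never written skip the expensive read.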

👉 Start Module 3: Caching & Delivery


Module 4: Databases & Scaling

Move beyond SQL vs. NoSQL. Understand how data is stored, replicated, and partitioned at scale.

Topics Covered:

  • Replication Models: Leader-Follower, Multi-Leader, Leaderless (Dynamo)
  • Sharding & Partitioning: Range vs Hash partitioning, Virtual Nodes
  • Consistency Models: CAP, PACELC, Serializability vs Eventual Consistency

👉 Start Module 4: Databases & Scaling


Module 5: Async & Queueing

Learn how to decouple systems for resilience using message queues and event-driven patterns.

Topics Covered:

  • Message Queues: Kafka vs RabbitMQ, Partitioning internals
  • Delivery Guarantees: At-least-once, Exactly-once, Transactional Outbox
  • Event-Driven Architecture: Choreography vs Orchestration, Saga Pattern
  • Backpressure & Flow Control: Load Shedding, Reactive Streams

👉 Start Module 5: Async & Queueing


Module 6: Core Reliability Patterns

Isolate failure and build anti-fragile systems using advanced isolation strategies.

Topics Covered:

  • Circuit Breakers & Bulkheads: Isolating cascading failures
  • Retry Storms: Exponential backoff, Jitter, Retry Budgets
  • Multi-Tenancy & Shuffle Sharding: Blast radius isolation
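
To illustrate the exponential backoff and jitter topic above, here is a minimal sketch using the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function names, default values, and the injectable `rng` and `sleep` parameters are assumptions made for the example. It also omits retry budgets and catches every exception, which real code would narrow to retryable errors.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(fn: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      **backoff_kwargs) -> T:
    """Call `fn`, retrying on any exception with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, **backoff_kwargs))
    raise ValueError("max_attempts must be at least 1")


if __name__ == "__main__":
    for attempt in range(6):
        print(attempt, round(backoff_delay(attempt), 3))
```

The randomness spreads retries from many clients over time so they do not arrive in synchronized waves. Retrying is only safe when the operation being retried is idempotent.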

👉 Start Module 6: Core Reliability Patterns


Module 7: Advanced Distributed Patterns

Coordinate state across global boundaries while making deliberate trade-offs between performance and consistency.

Topics Covered:

  • Consensus: Raft, Paxos, and membership changes
  • Distributed Time: Hybrid Logical Clocks (HLC) vs Google TrueTime
  • Distributed Transactions: Try-Confirm-Cancel (TCC) vs Sagas

👉 Start Module 7: Advanced Distributed Patterns


Module 8: Ops, Observability & Security

Operate global systems with confidence through deep visibility and identity-based security.

Topics Covered:

  • Observability: Tail Sampling, High-Cardinality Metrics, Exemplars
  • Security: Zero Trust principles, mTLS, SPIFFE identity
  • Deployment & Cost: Blue/Green vs Canary, Cloud Storage Economics (S3)

👉 Start Module 8: Ops, Observability & Security


Real-World Case Studies

Apply the fundamentals to practical system design problems.

Available Case Study:

  • End-to-End Example: TinyURL

👉 Browse Case Studies


Course Philosophy

This course is built on three principles:

  1. Mental Models over Memorization: We teach you how to think about systems, not just what components exist.
  2. Failure-First Design: Every architectural decision is evaluated through the lens of “what breaks here, and how do we contain it?”
  3. Staff-Level Rigor: We go beyond surface-level explanations to cover coupling dimensions, fate sharing, and control-plane thinking.

Ready to begin? Start with Module 0 to build your foundation.