02. The Hardware Model of Computation

[!NOTE] This module explores the hardware model of computation, deriving practical performance rules from first principles and physical hardware constraints.

1. Algorithms are Physical

In textbooks, algorithms are pure math. In reality, they are physical processes running on silicon.

Modern CPUs are incredibly fast, but memory is slow. This gap is known as the Memory Wall. Most of the time, your high-performance code isn’t limited by how many calculations the CPU can do, but by how fast the CPU can get data from RAM.


2. The Memory Hierarchy

To hide the slowness of RAM, CPUs use a hierarchy of caches.

  1. L1 Cache: Extremely fast, but tiny (KB). Right next to the CPU core.
  2. L2/L3 Cache: Slower than L1, but larger (MB). L2 is usually private to each core; L3 is shared across cores.
  3. RAM (Main Memory): Massive (GB), but “miles away” in terms of cycles.

The Latency Gap (Analogy)

  • L1 Cache: Like grabbing a book from your desk (1 second).
  • L2 Cache: Like walking to a bookshelf in your room (5 seconds).
  • RAM: Like walking to the city library in the rain (2 minutes).

[!IMPORTANT] A “Cache Miss” (when data isn’t in L1/L2) can slow down your algorithm by 10x to 100x, even if the Big O complexity remains the same!
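One way to feel this on real hardware is to walk the same matrix in two orders. Both loops below do identical work in Big O terms; the column-order walk simply misses the cache far more often. This is a sketch (not from the module) in Go; the matrix size is illustrative and timings will vary by machine.

```go
package main

import (
	"fmt"
	"time"
)

// rowSum walks the matrix in memory order: consecutive addresses.
func rowSum(m []int32, dim int) int64 {
	var s int64
	for r := 0; r < dim; r++ {
		for c := 0; c < dim; c++ {
			s += int64(m[r*dim+c])
		}
	}
	return s
}

// colSum computes the same total, but jumps dim*4 bytes between reads,
// landing on a different cache line almost every time.
func colSum(m []int32, dim int) int64 {
	var s int64
	for c := 0; c < dim; c++ {
		for r := 0; r < dim; r++ {
			s += int64(m[r*dim+c])
		}
	}
	return s
}

func main() {
	const dim = 4096
	m := make([]int32, dim*dim) // 64 MB: far larger than any cache
	for i := range m {
		m[i] = 1
	}

	start := time.Now()
	a := rowSum(m, dim)
	fmt.Println("row-major (cache-friendly):  ", time.Since(start))

	start = time.Now()
	b := colSum(m, dim)
	fmt.Println("column-major (cache-hostile):", time.Since(start))

	fmt.Println("identical results:", a == b) // same Big O, very different speed
}
```

On typical hardware the column-order walk is often several times slower, even though both loops perform exactly dim×dim additions.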


3. The Cache Locality Simulator

Consider how the CPU accesses data. Accessing contiguous data (like an Array) results in “Cache Hits” because the CPU fetches memory in fixed-size chunks (cache lines, typically 64 bytes) rather than single values. Accessing scattered data (like a Linked List) results in “Cache Misses”.

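The simulator can be approximated in a few lines of code. The sketch below models a single cached line of 8 elements (a stand-in for a 64-byte line of int64s) and replays two access patterns against it; the one-line replacement policy and the scatter formula are simplifications for illustration, not how a real cache works.

```go
package main

import "fmt"

// lineSize models a 64-byte cache line holding 8 int64 elements.
const lineSize = 8

// simulate replays a sequence of element indices against a single
// most-recently-used cache line and counts hits and misses.
func simulate(indices []int) (hits, misses int) {
	cachedLine := -1
	for _, i := range indices {
		if i/lineSize == cachedLine {
			hits++
		} else {
			misses++
			cachedLine = i / lineSize
		}
	}
	return
}

func main() {
	// Array-style access: visit elements 0, 1, 2, ..., 31 in order.
	seq := make([]int, 32)
	for i := range seq {
		seq[i] = i
	}

	// Linked-list-style access: the same 32 reads, scattered so that
	// every step jumps past the currently cached line.
	scattered := make([]int, 32)
	for i := range scattered {
		scattered[i] = (i * 17) % 256
	}

	h, m := simulate(seq)
	fmt.Printf("sequential: %d hits, %d misses\n", h, m) // 28 hits, 4 misses

	h, m = simulate(scattered)
	fmt.Printf("scattered:  %d hits, %d misses\n", h, m) // 0 hits, 32 misses
}
```

Sequential access misses only once per line (4 lines for 32 elements), then hits for free; the scattered walk misses on every single read.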

4. Hardware Truths for Developers

Rule 1: Favor Contiguous Memory

If you have a choice between an ArrayList (contiguous) and a LinkedList (scattered), the ArrayList will almost always be faster for iteration because it is Cache Friendly.

Rule 2: Predictable Access Patterns

CPUs have “Hardware Prefetchers”. If you access memory at addresses 0, 1, 2, 3, the CPU predicts you will want 4, 5, 6 and fetches them before you even ask. If you jump around randomly, the prefetcher fails.
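To see the prefetcher at work, sum the same values once through an in-order index list and once through a random permutation of it. The work and the result are identical; only the predictability of the addresses changes. This is a sketch with illustrative sizes; note the index slice itself is read sequentially in both cases, so the extra cost comes only from the scattered data reads.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// sumBy adds up data in the order given by idx.
func sumBy(data []int64, idx []int) int64 {
	var s int64
	for _, i := range idx {
		s += data[i]
	}
	return s
}

func main() {
	const n = 1 << 22 // 4M int64s = 32 MB, larger than typical L3
	data := make([]int64, n)
	inOrder := make([]int, n)
	for i := range data {
		data[i] = 1
		inOrder[i] = i
	}
	shuffled := rand.Perm(n) // same indices, unpredictable order

	start := time.Now()
	a := sumBy(data, inOrder)
	fmt.Println("sequential (prefetcher wins):", time.Since(start))

	start = time.Now()
	b := sumBy(data, shuffled)
	fmt.Println("random (prefetcher fails):   ", time.Since(start))

	fmt.Println("identical results:", a == b)
}
```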

Rule 3: Data Alignment

Java and Go handle memory layout differently.

  • Go: Gives you direct control over layout — a slice of structs is a single contiguous allocation.
  • Java: Every object is typically a separate heap allocation, which leads to “Pointer Chasing” and cache pressure when iterating over collections of objects.

Next Steps

Now that we understand the hardware, let’s learn how to formally prove that our algorithms work correctly across all inputs.