Caching Applications

A cache is a high-speed data storage layer that stores a subset of data so that future requests for that data are served faster than from the primary store.

The most common caching eviction policy is LRU (Least Recently Used):

  • If the cache is full, remove the item that hasn’t been used for the longest time.
  • If an item is accessed (read or written), move it to the “Most Recently Used” position.
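The two rules above can be traced with a minimal sketch. This is illustrative only (the `access` helper is hypothetical, and `list.remove` makes it O(n), not O(1)); it shows the recency reordering, not an efficient implementation:

```python
def access(order, item, capacity=3):
    """Touch `item` in an LRU order list (index 0 = most recently used)."""
    if item in order:
        order.remove(item)        # already cached: just refresh its recency
    elif len(order) == capacity:
        order.pop()               # full: evict least recently used (last slot)
    order.insert(0, item)         # item is now the most recently used
    return order

order = []
for x in ["A", "B", "C", "A", "D"]:
    access(order, x)
print(order)  # ['D', 'A', 'C'] -- 'B' was evicted when 'D' arrived
```

Note how re-accessing "A" saved it from eviction: "B", untouched the longest, was removed instead.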

The Challenge: O(1) Everything

We need a data structure that supports:

  1. Get(key): O(1)
  2. Put(key, value): O(1)
  3. Delete(key): O(1) (for eviction)
  4. Maintain Order: Know which item is oldest.

Solution: Combine a Hash Map with a Doubly Linked List.

  • Hash Map: Maps Key → Node. Gives O(1) access.
  • Doubly Linked List: Maintains order. Head = Most Recent, Tail = Least Recent. Moving a node to the head is O(1).


Implementation Code

Using Python’s OrderedDict, an LRU cache is trivial: it remembers insertion order and can move a key to either end in O(1) via move_to_end.

from collections import OrderedDict

class LRUCache:
  def __init__(self, capacity: int):
    self.cache = OrderedDict()
    self.capacity = capacity

  def get(self, key: int) -> int:
    if key not in self.cache:
      return -1
    # Move to end (MRU)
    self.cache.move_to_end(key)
    return self.cache[key]

  def put(self, key: int, value: int) -> None:
    if key in self.cache:
      self.cache.move_to_end(key)
    self.cache[key] = value

    if len(self.cache) > self.capacity:
      # Pop first item (LRU)
      self.cache.popitem(last=False)

In a pure “Design” interview, you would implement the Doubly Linked List manually to show you understand the pointers.
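A possible manual sketch of that design is below, combining a plain dict with a hand-rolled doubly linked list. Sentinel head/tail nodes (an assumption of this sketch, not required by the problem) remove the empty-list edge cases:

```python
class Node:
    """Doubly linked list node holding one cache entry."""
    def __init__(self, key=0, value=0):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    """LRU cache: hash map for O(1) lookup, doubly linked list for order.
    Head side = most recently used, tail side = least recently used."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.map = {}                  # key -> Node
        self.head = Node()             # sentinel on the MRU side
        self.tail = Node()             # sentinel on the LRU side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        """Unlink a node from the list in O(1)."""
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_front(self, node):
        """Insert a node right after the head sentinel (MRU position)."""
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key: int) -> int:
        if key not in self.map:
            return -1
        node = self.map[key]
        self._remove(node)             # O(1) unlink from current position
        self._add_front(node)          # O(1) move to MRU position
        return node.value

    def put(self, key: int, value: int) -> None:
        if key in self.map:
            self._remove(self.map[key])
        node = Node(key, value)
        self.map[key] = node
        self._add_front(node)
        if len(self.map) > self.capacity:
            lru = self.tail.prev       # real LRU node, just before sentinel
            self._remove(lru)
            del self.map[lru.key]

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
cache.get(1)       # returns 1; key 1 becomes MRU
cache.put(3, 3)    # capacity exceeded: evicts key 2, the LRU
cache.get(2)       # returns -1 (evicted)
```

Storing the key inside each node is the detail interviewers look for: without it, eviction cannot delete the corresponding hash-map entry in O(1).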


Deep Dive Strategy Lab: Caching Applications

Intuition Through Analogy

Think of this chapter like a library card catalog: the catalog tells you instantly where a book lives instead of forcing you to scan every shelf. The goal is not to memorize a fixed trick, but to repeatedly answer:

  1. What is the bottleneck?
  2. Which constraint dominates (time, memory, latency, correctness)?
  3. Which representation makes the bottleneck easier to eliminate?