If you’re building a distributed cache or database, the most critical decision you make is: “Which node holds Key K?”
The naive approach `node = hash(key) % N` works fine… until N changes. When you add or remove a server, nearly all keys (all but roughly a 1/N fraction) map to a different node. In a caching layer this triggers a Cache Stampede: almost every request suddenly misses the cache, and the flood of reads hammers the database behind it. The quick sketch below puts a number on the churn.
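To make that concrete, here is a minimal, self-contained simulation (the key names and cluster sizes are illustrative) that counts how many keys change owners under modulo placement when a cluster grows from 4 to 5 servers:

```go
package main

import (
    "fmt"
    "hash/fnv"
)

// hash maps a key to a 64-bit value; FNV-1a is just an illustrative choice.
func hash(key string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(key))
    return h.Sum64()
}

func main() {
    const keys = 100_000
    moved := 0
    for i := 0; i < keys; i++ {
        k := hash(fmt.Sprintf("key-%d", i))
        // Modulo placement: node = hash(key) % N, for N = 4 and N = 5.
        if k%4 != k%5 {
            moved++
        }
    }
    // For a uniform hash, only ~1/5 of keys keep their node, so ~80% move.
    fmt.Printf("moved %d of %d keys (%.1f%%)\n", moved, keys, 100*float64(moved)/keys)
}
```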
Enter Consistent Hashing: a strategy where adding or removing a node only remaps roughly 1/N of the keys.
1. The Ring Topology
Imagine mapping both your Servers and your Keys onto a circle (range $0 \dots 2^{32}-1$). To find where a key lives, you simply move clockwise until you hit a server.
2. Staff-Level: Virtual Nodes (Vnodes)
A simple ring has a fatal flaw: Uneven Distribution. If you have 3 nodes, and by random chance they land close to each other, one node might own 10% of the ring while another owns 80%. This creates a Hot Spot.
The Solution: Virtual Nodes
Instead of mapping a physical node to one point on the ring, we map it to hundreds of points (vnodes).
- Node A becomes $A_1, A_2, \dots, A_{100}$.
- Benefits:
  - Uniformity: With enough vnodes, each physical node's share of the ring evens out statistically (see the sketch after this list).
  - Heterogeneity: Give a powerful server 200 vnodes and a weak server 50, so its share of keys scales with its capacity.
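A quick way to check the uniformity claim is to measure how much of the ring each physical node owns as the vnode count grows. This is a minimal sketch (the `ownershipShare` helper and node names are illustrative); weighting for heterogeneity works the same way, since a node's share simply grows with its vnode count:

```go
package main

import (
    "fmt"
    "hash/crc32"
    "sort"
)

// ownershipShare places `vnodes` virtual nodes per physical node on a
// 32-bit ring and returns each node's fraction of the ring (by arc length).
func ownershipShare(nodes []string, vnodes int) map[string]float64 {
    type point struct {
        pos   uint32
        owner string
    }
    var ring []point
    for _, n := range nodes {
        for i := 0; i < vnodes; i++ {
            h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", n, i)))
            ring = append(ring, point{h, n})
        }
    }
    sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })

    share := make(map[string]float64)
    const ringSize = float64(1 << 32)
    for i, p := range ring {
        // Each vnode owns the arc from the previous vnode up to itself;
        // uint32 subtraction wraps correctly around the ring for i == 0.
        prev := ring[(i-1+len(ring))%len(ring)].pos
        share[p.owner] += float64(p.pos-prev) / ringSize
    }
    return share
}

func main() {
    nodes := []string{"node-a", "node-b", "node-c"}
    fmt.Println("1 vnode each:   ", ownershipShare(nodes, 1))   // often badly skewed
    fmt.Println("200 vnodes each:", ownershipShare(nodes, 200)) // close to 1/3 each
}
```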
3. Beyond the Ring: Maglev vs. Rendezvous
While the “Ring” is the classic visualization, Staff engineers often choose more specialized hashing algorithms depending on their constraints.
Maglev Hashing (Google)
Used in Google’s “Maglev” load balancer. Instead of a ring, it uses a Permutation Table: each backend lists its preferred slots in a fixed-size lookup table, and the slots are filled round-robin from those preference lists so every backend ends up with a nearly equal share.
- Pros: Extremely fast lookups (O(1)). Excellent load distribution.
- Cons: Higher memory usage to store the permutation table.
- Use Case: High-performance L4 load balancers handling millions of connections.
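The table-population step is the heart of Maglev: each backend derives an offset and a skip from two hashes of its name, which define its preferred order over the table slots, and the table is filled round-robin from those preferences. The sketch below follows that construction; the table size and hash functions here are illustrative choices, not the ones Google uses:

```go
package main

import (
    "fmt"
    "hash/crc32"
    "hash/fnv"
)

const tableSize = 65537 // should be prime; larger primes give finer-grained balance

func fnvHash(s string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(s))
    return h.Sum64()
}

// buildTable fills a lookup table so each backend owns ~tableSize/len(backends) slots.
func buildTable(backends []string) []int {
    n := len(backends)
    offsets := make([]uint64, n)
    skips := make([]uint64, n)
    for i, b := range backends {
        offsets[i] = uint64(crc32.ChecksumIEEE([]byte(b))) % tableSize
        skips[i] = fnvHash(b)%(tableSize-1) + 1
    }

    next := make([]uint64, n) // how far each backend has walked its preference list
    entry := make([]int, tableSize)
    for i := range entry {
        entry[i] = -1
    }

    filled := 0
    for filled < tableSize {
        for i := 0; i < n && filled < tableSize; i++ {
            // Backend i's next preferred slot: (offset + j*skip) mod tableSize.
            c := (offsets[i] + next[i]*skips[i]) % tableSize
            for entry[c] >= 0 {
                next[i]++
                c = (offsets[i] + next[i]*skips[i]) % tableSize
            }
            entry[c] = i
            next[i]++
            filled++
        }
    }
    return entry
}

func main() {
    backends := []string{"b1", "b2", "b3"}
    table := buildTable(backends)
    // Lookup is O(1): hash the key and index into the table.
    idx := uint64(crc32.ChecksumIEEE([]byte("user:42"))) % tableSize
    fmt.Println("key goes to", backends[table[idx]])
}
```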
Rendezvous Hashing (HRW)
Also known as Highest Random Weight hashing. For every key, you calculate a weight for every server: $\text{score} = \text{hash}(\text{key} \parallel \text{serverId})$, where $\parallel$ denotes concatenation. The server with the highest score wins.
- Pros: Zero memory needed for a ring/table. Perfect stability (adding a node doesn’t move keys between existing nodes at all).
- Cons: Lookup is O(N) where N is the number of servers (though optimized versions exist).
- Use Case: Distributed caches where memory is at a premium, or when the number of servers is small (< 100).
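Rendezvous hashing is small enough to show in full. A minimal sketch (the hash-combination scheme below is an illustrative choice):

```go
package main

import (
    "fmt"
    "hash/fnv"
)

// score hashes the key concatenated with the server id; any decent hash works,
// as long as every client computes it the same way.
func score(key, server string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(key))
    h.Write([]byte("/"))
    h.Write([]byte(server))
    return h.Sum64()
}

// pick returns the server with the highest score for this key:
// O(N) per lookup, no ring or table to store.
func pick(key string, servers []string) string {
    best, bestScore := "", uint64(0)
    for _, s := range servers {
        if sc := score(key, s); best == "" || sc > bestScore {
            best, bestScore = s, sc
        }
    }
    return best
}

func main() {
    servers := []string{"cache-1", "cache-2", "cache-3"}
    fmt.Println(pick("user:42", servers))
    // Adding "cache-4" only steals the keys for which cache-4 now scores highest;
    // assignments among cache-1..3 are untouched.
    fmt.Println(pick("user:42", append(servers, "cache-4")))
}
```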
| Feature | Ring Hashing | Maglev | Rendezvous |
|---|---|---|---|
| Lookup Time | O(log V) | O(1) | O(N) |
| Memory | O(V) | O(M) | O(1) |
| Rebalance | Minimal | Minimal | Perfect |
(V = vnodes, N = servers, M = table size)
4. Implementation Logic
Implementing this efficiently requires a data structure that supports fast floor/ceiling lookups (searching for the “next highest” hash).
Java: TreeMap
Java’s TreeMap is a Red-Black tree. We can use tailMap(key) to find the portion of the ring greater than or equal to our key.
```java
import java.util.TreeMap;
import java.util.SortedMap;

public class ConsistentHashing {
    // Ring stores: Hash -> ServerName
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int numberOfReplicas; // Vnodes per server

    public ConsistentHashing(int numberOfReplicas) {
        this.numberOfReplicas = numberOfReplicas;
    }

    public void addServer(String server) {
        for (int i = 0; i < numberOfReplicas; i++) {
            // Create vnode: "ServerA-1", "ServerA-2"...
            int hash = (server + "-" + i).hashCode();
            ring.put(hash, server);
        }
    }

    public String getServer(String key) {
        if (ring.isEmpty()) return null;
        int hash = key.hashCode();
        // Find the first node >= hash (Clockwise search)
        if (!ring.containsKey(hash)) {
            SortedMap<Integer, String> tailMap = ring.tailMap(hash);
            hash = tailMap.isEmpty() ? ring.firstKey() : tailMap.firstKey();
        }
        return ring.get(hash);
    }
}
```
Go: sort.Search
Go doesn’t have a built-in BST like TreeMap. Instead, we use a Sorted Slice and binary search.
```go
package main

import (
    "hash/crc32"
    "sort"
    "strconv"
)

type HashRing struct {
    ring       []int          // Sorted Keys
    nodes      map[int]string // Map Hash -> Node Name
    replicaCnt int
}

func New(replicaCnt int) *HashRing {
    return &HashRing{
        nodes:      make(map[int]string),
        replicaCnt: replicaCnt,
    }
}

func (h *HashRing) AddNode(node string) {
    for i := 0; i < h.replicaCnt; i++ {
        // Unique key for vnode
        key := node + strconv.Itoa(i)
        hash := int(crc32.ChecksumIEEE([]byte(key)))
        h.ring = append(h.ring, hash)
        h.nodes[hash] = node
    }
    sort.Ints(h.ring) // Keep ring sorted for O(log N) search
}

func (h *HashRing) Get(key string) string {
    if len(h.ring) == 0 {
        return ""
    }
    hash := int(crc32.ChecksumIEEE([]byte(key)))
    // Binary Search for first index where ring[i] >= hash
    idx := sort.Search(len(h.ring), func(i int) bool {
        return h.ring[i] >= hash
    })
    // Wrap around if we hit the end
    if idx == len(h.ring) {
        idx = 0
    }
    return h.nodes[h.ring[idx]]
}
```
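A quick usage sketch for the Go ring above (node names and replica count are arbitrary); it lives in the same package as the code above, with `"fmt"` added to the imports:

```go
func main() {
    ring := New(100) // 100 vnodes per physical node
    ring.AddNode("cache-a")
    ring.AddNode("cache-b")
    ring.AddNode("cache-c")

    // The same key always routes to the same node until membership changes.
    fmt.Println(ring.Get("user:42"))
    fmt.Println(ring.Get("session:7"))
}
```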
5. Replication Strategies
In systems like Cassandra/Dynamo, we don’t just want ONE node. We want N replicas.
Strategy: Walk the ring. Once you find the “Coordinator Node” (the primary owner), continue walking clockwise and pick the next $N-1$ distinct physical nodes.
> [!TIP]
> Rack Awareness: Simply picking the next nodes on the ring might put all replicas in the same rack (if their vnodes happen to land close together). Production systems skip nodes that belong to the same failure domain.
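A sketch of that walk, written as an extra method on the HashRing from section 4 (GetReplicas is illustrative, not part of the original code). It keeps moving clockwise and skips vnodes whose physical owner has already been chosen, which is what turns “next N vnodes” into “next N distinct physical nodes”:

```go
// GetReplicas walks clockwise from the key's position and returns up to
// `count` distinct physical nodes (the first one is the coordinator).
// A rack-aware version would also skip nodes that share a failure domain
// with a replica already selected.
func (h *HashRing) GetReplicas(key string, count int) []string {
    if len(h.ring) == 0 || count <= 0 {
        return nil
    }
    hash := int(crc32.ChecksumIEEE([]byte(key)))
    start := sort.Search(len(h.ring), func(i int) bool {
        return h.ring[i] >= hash
    })

    seen := make(map[string]bool)
    var replicas []string
    // Walk at most once around the full ring of vnodes.
    for i := 0; i < len(h.ring) && len(replicas) < count; i++ {
        node := h.nodes[h.ring[(start+i)%len(h.ring)]]
        if !seen[node] {
            seen[node] = true
            replicas = append(replicas, node)
        }
    }
    return replicas
}
```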
Next up: Now that we can distribute data, how do we keep it fast? Module 2: Caching & Delivery.