Citus: Turning Postgres into a Cluster
Standard Postgres scales vertically (bigger CPU, more RAM). But eventually, you hit the limits of a single machine (e.g., 128 cores, 4TB RAM).
Citus allows Postgres to scale horizontally by sharding your data across multiple nodes (machines), while still appearing as a single database to your application.
1. The Genesis: Why Shard?
If you have 10TB of data but only 1TB of RAM, most of your queries will hit the disk (slow). If you split that 10TB across 10 machines (1TB each), the entire dataset can fit in aggregate RAM (fast).
The Challenge: Joins If Table A is on Node 1 and Table B is on Node 2, joining them requires sending data over the network (slow).
The Solution: Co-location
Citus ensures that related data (e.g., all orders for user_id: 123) lives on the same node. This means joins can happen locally on that node without network overhead.
2. Interactive: Sharding Simulator
See how Citus distributes rows based on a Shard Key (e.g., tenant_id).
- Coordinator: Receives the query.
- Workers: Store the actual data.
- Router: Decides which worker gets the data based on
Hash(shard_key) % num_shards.
Control Panel
Simulate inserting rows into a distributed table.
Coordinator: Top Node
Workers: Bottom Nodes
Current Distribution: Uniform
3. Architecture Patterns
A. Distributed Tables
Tables are sharded by a key. Large tables (Users, Orders, Events).
- Write: Hashed to a specific shard/node.
- Read: Routed to a single node (if filter includes shard key) or scatter-gathered (if it doesn’t).
B. Reference Tables
Small tables that are needed everywhere (Countries, Currencies, Product Categories).
- Replication: These tables are replicated to all nodes.
- Benefit: Local joins! You can join a Distributed Table (Orders) with a Reference Table (Currencies) without any network traffic.
4. Code Implementation
A. Sharding a Table
B. Multi-Tenant Query
When you run a query filtering by tenant_id, Citus routes it to a single worker node. This is the secret to scaling SaaS apps to millions of tenants.