Citus: Turning Postgres into a Cluster

Standard Postgres scales vertically (bigger CPU, more RAM). But eventually, you hit the limits of a single machine (e.g., 128 cores, 4TB RAM).

Citus allows Postgres to scale horizontally by sharding your data across multiple nodes (machines), while still appearing as a single database to your application.

1. The Genesis: Why Shard?

If you have 10TB of data but only 1TB of RAM, most of your queries will hit the disk (slow). If you split that 10TB across 10 machines (1TB each), the entire dataset can fit in aggregate RAM (fast).

The Challenge: Joins

If Table A is on Node 1 and Table B is on Node 2, joining them requires sending data over the network (slow).

The Solution: Co-location

Citus ensures that related data (e.g., all orders for user_id: 123) lives on the same node. This means joins can happen locally on that node without network overhead.

2. How Sharding Works

Citus distributes rows based on a Shard Key (e.g., tenant_id):

  • Coordinator: Receives the query.
  • Workers: Store the actual data.
  • Router: Decides which worker gets the data based on Hash(shard_key) % num_shards.
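The routing rule above can be sketched in a few lines. This is a simplified model with illustrative names, not Citus APIs: real Citus hashes the key with PostgreSQL's hash functions and maps the result onto per-shard hash ranges, but placement is deterministic either way, which is what makes co-location work — the same key always lands on the same shard, whichever table the row belongs to.

```java
import java.util.UUID;

public class ShardRouter {
    // Simplified model: hash(shard_key) % num_shards.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int shardFor(UUID shardKey, int numShards) {
        return Math.floorMod(shardKey.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 32;
        UUID tenant = UUID.fromString("00000000-0000-0000-0000-000000000123");

        // Same shard key => same shard, regardless of which table the row is in.
        int usersShard  = shardFor(tenant, numShards); // a row in 'users'
        int ordersShard = shardFor(tenant, numShards); // a row in 'orders'
        System.out.println(usersShard == ordersShard); // prints "true": co-located
    }
}
```

Because the function is deterministic, no lookup table is needed — any coordinator can recompute the placement from the key alone.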


3. Architecture Patterns

A. Distributed Tables

Large tables (Users, Orders, Events) are split into shards by a distribution key.

  • Write: Hashed to a specific shard/node.
  • Read: Routed to a single node (if filter includes shard key) or scatter-gathered (if it doesn’t).
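The coordinator's routing decision can be pictured as a small pure function (all names here are hypothetical; Citus's real planner is far more sophisticated): if the query's equality filters include the shard key, exactly one shard needs to be consulted; otherwise every shard might hold matching rows.

```java
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class QueryPlanner {
    // Hypothetical sketch: which shards must a query touch?
    // filterKeys = columns appearing with equality filters in the WHERE clause.
    static Set<Integer> shardsToQuery(Set<String> filterKeys, String shardKey,
                                      int keyHash, int numShards) {
        if (filterKeys.contains(shardKey)) {
            // Router query: the shard key pins the query to one shard.
            return Set.of(Math.floorMod(keyHash, numShards));
        }
        // Scatter-gather: fan out to every shard, merge results on the coordinator.
        return IntStream.range(0, numShards).boxed().collect(Collectors.toSet());
    }
}
```

For example, `WHERE tenant_id = ?` yields a single-element set (one worker does all the work), while `WHERE email = ?` yields all 32 shards.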

B. Reference Tables

Small tables that are needed everywhere (Countries, Currencies, Product Categories).

  • Replication: These tables are replicated to all nodes.
  • Benefit: Local joins! You can join a Distributed Table (Orders) with a Reference Table (Currencies) without any network traffic.
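The benefit can be pictured with a toy in-memory model of a single worker (all names hypothetical; this is not Citus code): because each worker keeps a full replica of the reference table, joining it against the worker's local shard consumes no data from any other node.

```java
import java.util.List;
import java.util.Map;

public class ReferenceJoin {
    // Hypothetical model of ONE worker node's data.
    record Order(String id, String currencyCode) {}

    static List<String> joinLocally(List<Order> localOrderShard,
                                    Map<String, String> currencyReplica) {
        // Every worker holds the FULL currencies table, so this join
        // needs no network round-trips — only this node's own data.
        return localOrderShard.stream()
                .map(o -> o.id() + " -> " + currencyReplica.get(o.currencyCode()))
                .toList();
    }
}
```

The trade-off: writes to a reference table must be replicated to every node, which is why the pattern fits small, slowly-changing tables.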

4. Code Implementation

A. Sharding a Table

```sql
-- 1. Create standard tables
CREATE TABLE tenants (
    id UUID PRIMARY KEY,
    name TEXT
);

CREATE TABLE users (
    id UUID,
    tenant_id UUID,
    email TEXT,
    PRIMARY KEY (id, tenant_id) -- Composite key required for co-location
);

-- 2. Distribute the tables (shard by tenant_id)
SELECT create_distributed_table('tenants', 'id');
SELECT create_distributed_table('users', 'tenant_id');

-- Now, inserting into 'users' will automatically route data to the same node
-- as 'tenants', based on the hash of tenant_id.
```

B. Multi-Tenant Query

When you run a query filtering by tenant_id, Citus routes it to a single worker node. This is the secret to scaling SaaS apps to millions of tenants.

```java
public void getTenantUsers(Connection conn, UUID tenantId) throws SQLException {
    // The coordinator sees this WHERE clause and sends the query
    // ONLY to the node holding this tenant's data.
    String sql = "SELECT * FROM users WHERE tenant_id = ?";
    try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
        pstmt.setObject(1, tenantId);
        try (ResultSet rs = pstmt.executeQuery()) {
            // ... process results ...
        }
    }
}
```