TimescaleDB: The Time Machine
Postgres is an amazing general-purpose database. But when you insert millions of metrics per second (IoT, financial ticks, logs), the standard B-Tree index becomes a bottleneck, and deleting old data (DELETE FROM table WHERE time < X) causes massive Table Bloat.
TimescaleDB solves this by automatically partitioning your data across time and space, while still looking like a single standard SQL table.
1. The Genesis: The “DELETE” Problem
In a standard Postgres table, deleting old data is expensive.
- MVCC: Postgres doesn’t physically delete rows immediately. It marks them as “dead”.
- Vacuum: The Autovacuum process must scan the table to reclaim space.
- Fragmentation: This leaves the heap and indexes fragmented (Swiss cheese holes), slowing down future inserts.
The Solution: Hypertables (Partitioning)
Instead of one giant table, TimescaleDB creates a Hypertable. Physically, it splits data into small “chunks” based on time ranges (e.g., one day per chunk).
To delete old data, you don’t run DELETE. You simply drop the chunk (internally, a small child table).
DROP TABLE chunk_2023_01_01; is O(1) (effectively instant) and creates zero bloat.
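In practice you never drop chunk tables by hand; TimescaleDB exposes this through `drop_chunks` and retention policies. A minimal sketch, assuming a hypertable named `conditions` (the table name and the 30-day window are illustrative):

```sql
-- Drop every chunk whose data is entirely older than 30 days.
-- This is a metadata operation (a DROP TABLE per chunk under the hood):
-- no row-by-row deletes, no dead tuples, no vacuum.
SELECT drop_chunks('conditions', older_than => INTERVAL '30 days');

-- Or schedule it: a background job drops expired chunks automatically.
SELECT add_retention_policy('conditions', drop_after => INTERVAL '30 days');
```

The policy variant is what powers the “dropped instantly” behavior described in the visualizer below: a background worker, not your application, retires old chunks.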
2. Interactive: Hypertable Chunk Visualizer
Visualize how data flows into chunks. As time progresses (X-axis), new chunks are created. When a chunk becomes “old” (based on retention policy), it is dropped instantly.
Legend: Green Blocks = Active Chunks (Hot), Blue Blocks = Compressed Chunks (Cold), Grey Blocks = Dropped Chunks.
3. Hardware Reality: Compression
Time-series data is highly repetitive.
- Timestamp: `2023-10-01 10:00:01`, `2023-10-01 10:00:02` (delta encoding wins here)
- Device ID: `sensor-1`, `sensor-1` (dictionary encoding wins here)
- Value: `20.5`, `20.6` (XOR / Gorilla encoding wins here)
TimescaleDB uses columnar compression on older chunks. Instead of storing row-by-row (Postgres default), it pivots data to column-by-column, applying specific algorithms like Gorilla (floating point) and Delta-of-Delta (timestamps). Result: 90-95% storage savings.
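Compression is opt-in and configured per hypertable. A sketch, again assuming a `conditions` table with a `device_id` column (both names illustrative):

```sql
-- Enable columnar compression on the hypertable.
-- segmentby packs rows sharing a device_id into one compressed batch;
-- orderby keeps timestamps sequential so delta encoding works well.
ALTER TABLE conditions SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id',
  timescaledb.compress_orderby   = 'time DESC'
);

-- Compress chunks once they are older than 7 days (cold data).
SELECT add_compression_policy('conditions', compress_after => INTERVAL '7 days');
```

The `segmentby` choice matters: it should be a column you filter or group by, so queries can skip whole compressed batches.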
4. Code Implementation
A. Creating a Hypertable
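A sketch of the DDL this heading refers to; the table and column names (`conditions`, `device_id`, `temperature`) are illustrative:

```sql
-- A plain Postgres table for sensor readings.
CREATE TABLE conditions (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT        NOT NULL,
  temperature DOUBLE PRECISION
);

-- Convert it into a hypertable, chunked by day on the time column.
-- It still reads and writes like one table; chunks are created on demand.
SELECT create_hypertable('conditions', 'time',
  chunk_time_interval => INTERVAL '1 day');

-- Inserts are unchanged; TimescaleDB routes each row to the right chunk.
INSERT INTO conditions VALUES (now(), 'sensor-1', 20.5);
```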
B. Continuous Aggregates
Real-time dashboards need “Average temperature per hour”, not raw data. Standard GROUP BY is slow on billions of rows. Continuous Aggregates pre-calculate this.
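A sketch of such an aggregate, assuming the `conditions` table from above (view name and refresh intervals are illustrative):

```sql
-- Pre-aggregate raw readings into hourly averages per device.
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp
FROM conditions
GROUP BY bucket, device_id;

-- Refresh incrementally in the background: only buckets touched
-- by new data are recomputed, not the whole history.
SELECT add_continuous_aggregate_policy('conditions_hourly',
  start_offset      => INTERVAL '1 day',
  end_offset        => INTERVAL '1 hour',
  schedule_interval => INTERVAL '30 minutes');
```

Dashboards then query `conditions_hourly` instead of the raw hypertable, turning a scan over billions of rows into a lookup over pre-computed buckets.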