Isolation Levels: Controlling Chaos

In a busy database, thousands of transactions happen every second. Without Isolation, these transactions would step on each other’s toes, leading to data corruption.

1. The Concurrency Anomalies

To understand Isolation, you must first understand the bugs it prevents.

A. Dirty Read

  • Scenario: Transaction A writes a value but hasn’t committed yet. Transaction B reads that uncommitted value.
  • Danger: If A rolls back, B is working with data that “never existed”.

B. Non-Repeatable Read

  • Scenario: Transaction A reads a row (Balance: $100). Transaction B updates it to $200 and commits. Transaction A reads it again and sees $200.
  • Problem: Math is broken. “I just checked, and it was $100!”

C. Phantom Read

  • Scenario: Transaction A runs SELECT * FROM Users WHERE Age > 18 (Gets 10 rows). Transaction B inserts a new 20-year-old user. Transaction A runs the same query again and gets 11 rows.
  • Problem: “Where did this ghost come from?”
  • Note: Non-Repeatable Read is about modifying existing rows. Phantom Read is about adding/removing rows.

2. Deep Dive: Write Skew (The Silent Killer)

There is a 4th anomaly that most people forget, but it causes huge bugs in Snapshot Isolation (used by Repeatable Read).

The Scenario: The “Doctor On Call” Problem

  • Rule: At least one doctor must be on call at the hospital.
  • Current State: Alice (On Call), Bob (On Call).
  • The Goal: Both Alice and Bob want to leave (go Off Call).

The Race Condition:

  1. Alice (Txn A) checks the schedule. She sees 2 doctors on call. “Okay, 2 > 1. I can leave.”
  2. Bob (Txn B) checks the schedule at the same time. He also sees 2 doctors on call (because Alice hasn’t committed yet). “Okay, 2 > 1. I can leave.”
  3. Alice updates her status to “Off Call” and commits.
  4. Bob updates his status to “Off Call” and commits.
  5. Result: Zero doctors on call. The invariant is broken.

This is Write Skew. Both transactions read the same data, made a decision based on it, but their updates were disjoint (different rows), so the database didn’t detect a direct conflict.


3. The Four Levels (SQL Standard)

Databases let you choose your safety level. Higher safety = Lower performance.

Isolation Level Dirty Read? Non-Repeatable? Phantom? Write Skew? Performance
Read Uncommitted ✅ Yes ✅ Yes ✅ Yes ✅ Yes 🚀 Fastest
Read Committed ❌ No ✅ Yes ✅ Yes ✅ Yes ⚡ Fast (Default: Postgres)
Repeatable Read ❌ No ❌ No ✅ Yes ✅ Yes 🐢 Moderate (Default: MySQL)
Serializable ❌ No ❌ No ❌ No ❌ No 🐌 Slowest

Deep Dive: MVCC (Multi-Version Concurrency Control)

Most modern databases (Postgres, MySQL InnoDB) achieve isolation without locking readers using MVCC.

The Core Idea: Instead of overwriting data, create a new version.

Timeline: User Balance Updates

T1: BEGIN TRANSACTION (TxID = 100)
    Row: [id=1, balance=$100, created_by_tx=90, visible_to_tx=∞]

T2: UPDATE balance = $80 (TxID = 100)
    Old Row: [id=1, balance=$100, created_by_tx=90, visible_to_tx=100]  ← Still exists!
    New Row: [id=1, balance=$80, created_by_tx=100, visible_to_tx=∞]  ← New version

T3: Another transaction (TxID = 99) reads
    → Sees $100 (because 99 < 100, it reads the old version)

T4: COMMIT (TxID = 100)
    → Now all new transactions (TxID ≥ 101) see $80

Version Chain: Each row can have multiple versions linked by transaction IDs.

Postgres vs MySQL MVCC Implementation

Aspect Postgres MySQL (InnoDB)
Storage Old versions stored in-place (TOAST for large) Old versions in Undo Log
Read Overhead Low (Direct tuple access) Medium (May need undo log lookup)
Garbage Collection VACUUM (Background process) Purge Thread (Automatic)
Version Visibility Based on snapshot XID Based on read view
Write Performance Slower (In-place bloat) Faster (Undo log is sequential)
Snapshot Creation Per-transaction (Copy XID list) Per-statement (Lightweight)

Interview Insight: Postgres VACUUM is notorious for causing performance issues if not tuned. MySQL’s purge thread is more predictable.


How Serializable Works (The Nuclear Option)

  1. 2PL (Two-Phase Locking): The classic approach. It locks everything you read. If you read 10 rows, no one else can touch them. (Very slow).
  2. SSI (Serializable Snapshot Isolation): The modern approach (Postgres). It uses MVCC but tracks “dependencies”. If it detects a conflict (like the Doctor scenario), it aborts one transaction. It is optimistic and much faster than 2PL.

4. Interactive Demo: Isolation Arena

Experience the concurrency bugs first-hand. Select a mode to simulate specific anomalies.

Isolation Level: Read Uncommitted
Txn B reads uncommitted data from Txn A. If A rolls back, B holds invalid data.
ID: 1 | Name: Alice | Balance: $100
Txn A (Writer)
Txn B (Reader)
Result: -
Ready.

5. Case Study: E-commerce Inventory (Lost Updates)

How do massive sites like Amazon prevent selling 1 item to 2 people? (See also Flash Sales).

The Problem

Item X has Quantity = 1.

  1. User A adds to cart. System checks inventory: 1. OK.
  2. User B adds to cart. System checks inventory: 1. OK.
  3. User A checks out. UPDATE items SET qty = 0 WHERE id = X.
  4. User B checks out. UPDATE items SET qty = 0 WHERE id = X.
  5. Result: 2 Items Sold. 1 in stock. Overselling.

The Solution: Isolation Levels vs Locking

Approach A: Pessimistic Locking (SELECT FOR UPDATE)

Lock the row when reading.

BEGIN;
SELECT * FROM items WHERE id = 'X' FOR UPDATE;
-- User B blocks here until User A commits.
UPDATE items SET qty = 0 WHERE id = 'X';
COMMIT;
  • Pros: 100% Safe.
  • Cons: Slow. Kills concurrency.

Approach B: Optimistic Locking (Versioning)

Add a version column.

-- User A reads: Qty=1, Ver=5
-- User B reads: Qty=1, Ver=5

-- User A updates:
UPDATE items SET qty=0, ver=6 WHERE id='X' AND ver=5;
-- Success (1 row updated).

-- User B updates:
UPDATE items SET qty=0, ver=6 WHERE id='X' AND ver=5;
-- FAIL (0 rows updated) because version is now 6!
  • Pros: Fast. No locks.
  • Cons: User B gets an error (“Sorry, item sold out”).

6. Summary

  • Read Committed: Good balance. No Dirty Reads.
  • Repeatable Read: Stronger. Guarantees you see the same data throughout your transaction (Snapshots).
  • Serializable: The nuclear option. Uses 2PL or SSI. Use for Financial ledgers.
  • Write Skew: A dangerous bug in Repeatable Read/Snapshot Isolation where concurrent checks pass but the final state is invalid.