Advanced SQL & Internals

Before mastering advanced queries, you must understand how the database engine thinks. It does not read your SQL from top to bottom like a script. It follows a strict logical execution order that dictates everything from performance to variable scope.


1. Advanced SQL & Internals

2. The Logical Order of Operations

You write SQL in English order: SELECT ... FROM ... WHERE. But the Database executes it differently. This mismatch is the source of 90% of syntax errors for beginners.

The Golden Rule: The database builds the result set layer by layer.

SQL Execution Pipeline

Click "Next Step" to see how the result set is built.

1. FROM / JOIN
Build Working Set (Cartesian Product)
2. WHERE
Filter Rows
3. GROUP BY
Aggregate into Buckets
4. HAVING
Filter Groups
5. SELECT
Pick Columns & Calculate Expressions
6. ORDER BY / LIMIT
Sort and Cut Off
SELECT user, SUM(amt) FROM sales WHERE amt > 10 GROUP BY user HAVING SUM(amt) > 50 ORDER BY 2 DESC;
Press Start...

[!TIP] Why this matters: You cannot use an alias defined in SELECT inside the WHERE clause (e.g., WHERE total > 100 fails if total is calculated in SELECT), because WHERE runs before SELECT.


3. Advanced SQL & Internals

4. ACID Properties

The guarantee that your data is safe. Without these, a database is just a fancy file system.

  • Atomicity: All or Nothing. If the transaction fails halfway (e.g., power cut), it rolls back completely.
  • Consistency: Data must follow rules (Constraints, Foreign Keys) before and after. The DB is never in an invalid state.
  • Isolation: Transactions shouldn’t mess with each other. This is handled by MVCC.
  • Durability: Once committed, it survives a power outage. This is guaranteed by the WAL.

5. Advanced SQL & Internals

6. MVCC (Multiversion Concurrency Control)

How does Postgres allow reading a table while someone else is writing to it? Answer: MVCC.

Instead of locking the row and making everyone wait, Postgres creates a new version of the row for every update. It keeps multiple versions of the “same” row alive simultaneously.

Tuple Versioning (xmin, xmax)

When you update a row, Postgres doesn't overwrite it. It marks the old one as "expired" (set xmax) and inserts a new one.

Dead Tuple
ID: 1 | Bal: 50
xmin: 100 (Created by Tx 100) xmax: 101 (Deleted by Tx 101)
Live Tuple
ID: 1 | Bal: 100
xmin: 101 (Created by Tx 101) xmax: 0 (Live)
Visibility Rules:
  • Tx 99 sees: Balance 50 (Because Tx 101 hasn't happened yet for it).
  • Tx 102 sees: Balance 100 (If Tx 101 committed).

Vacuuming: The Cost of MVCC

Since updates create new versions, the old versions (“dead tuples”) pile up and bloat the table. VACUUM is the garbage collector that cleans them up to reclaim space.


7. Advanced SQL & Internals

8. Interactive: Transaction Isolation Visualizer

Visualize how Read Committed isolation works. This is the default in Postgres.

Transaction Visualization

Txn A (Writer)

BEGIN; UPDATE accounts SET balance = 100; COMMIT;

Txn B (Reader)

Isolation: READ COMMITTED
Balance: 50 (Original)

9. Advanced SQL & Internals

[!NOTE] This module explores the core principles of Advanced SQL & Internals, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

10. Summary

  • Execution ≠ Writing: SQL execution order is logical (FROM → WHERE → SELECT).
  • ACID: The bedrock of reliable data.
  • MVCC: High concurrency without locking readers.
  • Vacuum: The necessary maintenance cost of MVCC to remove “dead tuples”.

Next, we’ll dive into Window Functions, which operate after the HAVING clause but before ORDER BY.