Postgres Glossary

[!NOTE] This module explores the core principles of Postgres Glossary, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. A

ACID

Atomicity, Consistency, Isolation, Durability. The set of properties that guarantee that database transactions are processed reliably.

2. B

B-Tree

The default index type in Postgres. Balanced Tree. Good for equality and range queries on sortable data types.

Bloat

Dead rows (tuples) that remain in tables and indexes after updates or deletes, occupying space and slowing down queries until reclaimed by VACUUM.

3. C

Continuous Aggregate

A feature in TimescaleDB. A materialized view that is automatically updated as new data is inserted into the underlying hypertable, making real-time aggregate queries instant.

Coordinator Node

In a Citus cluster, the node that your application connects to. It holds metadata about shards and distributes queries to worker nodes.

Covering Index

An index that contains all the columns required by a query, allowing Postgres to retrieve data directly from the index (Index-Only Scan) without visiting the heap (table storage).

CTE

Common Table Expression. A temporary result set which can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. Defined using the WITH clause.

4. D

Deadlock

A situation where two or more transactions are waiting for one another to release locks, preventing any of them from proceeding. Postgres detects this and aborts one transaction.

Dirty Read

An anomaly where a transaction reads data written by another concurrent transaction that has not yet committed. Postgres prevents this in all isolation levels.

5. E

Explicit Locking

Manually acquiring locks on database objects (e.g., LOCK TABLE, SELECT FOR UPDATE) to prevent concurrent access, as opposed to relying solely on MVCC.

6. G

GIN

Generalized Inverted Index. Designed for handling composite values like arrays and JSONB. Optimized for “contains” queries (e.g., finding documents where a key equals a value).

GiST

Generalized Search Tree. A framework for building balanced tree structures for arbitrary data types. Commonly used for geometric data (PostGIS) and full-text search.

7. H

Heap

The main storage area where Postgres table data (rows/tuples) is stored. Indexes contain pointers (TIDs) to locations in the heap.

Heap Fetch

The operation of visiting the heap to retrieve row data or verify visibility. Frequent heap fetches can degrade the performance of index scans.

HOT Update

Heap-Only Tuple Update. An optimization where an update does not require modifying indexes if the new tuple fits on the same page and no indexed columns are modified.

Hypertable

A virtual table in TimescaleDB that looks like a regular table but is physically partitioned into many smaller tables (chunks) by time and space.

8. I

Index-Only Scan

A scan method where Postgres retrieves all necessary data directly from the index without visiting the table heap. Requires a Covering Index and a clean Visibility Map.

Isolation Level

Controls how transaction integrity is visible to other users and systems. Postgres supports Read Committed (default), Repeatable Read, and Serializable.

9. J

JSONB

Binary JSON. A decomposed binary format for storing JSON data. It supports indexing and is faster to process than plain JSON.

10. L

Logical Replication

A method of replicating data objects and their changes based upon their replication identity (usually a primary key). It allows fine-grained control over data replication and security.

11. M

MVCC

Multi-Version Concurrency Control. A method used by Postgres to handle concurrent access to data without locking the database for readers. Each transaction sees a snapshot of data.

12. N

Non-Repeatable Read

An anomaly where a transaction reads the same row twice and gets different data because another transaction modified it in between.

13. P

Partial Index

An index built over a subset of a table, defined by a conditional expression (e.g., WHERE status = 'active'). Reduces index size and maintenance cost.

Patroni

A template for PostgreSQL High Availability. It manages a cluster of Postgres nodes using a distributed consensus store (like Etcd) to handle automatic failover and leader election.

Phantom Read

An anomaly where a transaction executes a query returning a set of rows twice and gets different sets of rows because another transaction inserted new rows in between.

Physical Replication

A method of replication where the exact block-level changes (WAL) are shipped from the primary to the standby. It creates an exact copy of the entire database cluster.

Publisher

In Logical Replication, the node that defines a set of tables (publication) to be replicated to other nodes.

14. Q

Quorum

The minimum number of members of a distributed system that must agree for a decision (like leader election) to be valid. Prevents split-brain scenarios.

15. R

Recursive CTE

A CTE that references itself. Useful for querying hierarchical data like organizational charts or graph structures.

«««< HEAD

Replication Slot

A mechanism on the primary server that ensures WAL segments are not removed until they have been successfully received by all connected standbys.

16. S

Split-Brain

A failure scenario in a clustered system where network partitioning leads to two or more nodes believing they are the primary, potentially causing data corruption.

SSI

Serializable Snapshot Isolation. A technique used by Postgres to implement the Serializable isolation level, detecting serialization anomalies by tracking read/write dependencies without locking everything.

Subscriber

In Logical Replication, the node that receives data from a publisher and applies the changes to its local tables. =======

Reference Table

A small table in Citus that is replicated to all worker nodes. Allows local joins between large distributed tables and small lookup tables.

17. S

Shard

A horizontal partition of data in a distributed database. In Citus, a shard is a standard Postgres table on a worker node holding a subset of rows.

Spatial Index

An index optimized for spatial queries (e.g., “Find points within this box”). PostGIS typically uses GiST indexes implementing R-Trees.

origin/feat-postgres-extensions-1314213636004454745

18. T

TID

Tuple Identifier. A pointer consisting of a block number and an offset, used by indexes to locate a specific row version in the heap.

TOAST

The Oversized-Attribute Storage Technique. A mechanism for storing large column values (like long text or JSONB) separately from the main table row to keep pages compact.

Tuple

The physical representation of a row version in a PostgreSQL page. Due to MVCC, multiple tuples may exist for the same logical row.

19. V

VACUUM

A maintenance process that reclaims storage occupied by dead tuples and updates the Visibility Map.

Visibility Map

A bitmap that tracks which pages in a table contain only tuples that are visible to all active transactions. Crucial for Index-Only Scans.

20. W

WAL

Write-Ahead Log. A log of changes ensuring data integrity. Changes are written to the WAL before being applied to data files.

WAL Receiver

A background process on the Standby server that receives WAL data from the Primary and writes it to the local WAL files.

WAL Sender

A background process on the Primary server that sends WAL data to Standby servers or logical decoding plugins.

Window Function

A function that performs a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, window functions do not cause rows to become grouped into a single output row.

«««< HEAD

Write Skew

A serialization anomaly where two transactions read overlapping data but update disjoint data based on what they read, violating consistency constraints. Only prevented by Serializable isolation.

21. X

xmax

A hidden system column in a tuple header that stores the Transaction ID (XID) of the transaction that deleted (or updated) the row version.

xmin

A hidden system column in a tuple header that stores the Transaction ID (XID) of the transaction that created (inserted) the row version. =======

Worker Node

In a Citus cluster, the nodes that store the actual data (shards) and execute the query fragments sent by the coordinator.

origin/feat-postgres-extensions-1314213636004454745