Postgres Glossary
Postgres Glossary
[!NOTE] This module explores the core principles of Postgres Glossary, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. A
ACID
Atomicity, Consistency, Isolation, Durability. The set of properties that guarantee that database transactions are processed reliably.
2. B
B-Tree
The default index type in Postgres. Balanced Tree. Good for equality and range queries on sortable data types.
Bloat
Dead rows (tuples) that remain in tables and indexes after updates or deletes, occupying space and slowing down queries until reclaimed by VACUUM.
3. C
Continuous Aggregate
A feature in TimescaleDB. A materialized view that is automatically updated as new data is inserted into the underlying hypertable, making real-time aggregate queries instant.
Coordinator Node
In a Citus cluster, the node that your application connects to. It holds metadata about shards and distributes queries to worker nodes.
Covering Index
An index that contains all the columns required by a query, allowing Postgres to retrieve data directly from the index (Index-Only Scan) without visiting the heap (table storage).
CTE
Common Table Expression. A temporary result set which can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. Defined using the WITH clause.
4. D
Deadlock
A situation where two or more transactions are waiting for one another to release locks, preventing any of them from proceeding. Postgres detects this and aborts one transaction.
Dirty Read
An anomaly where a transaction reads data written by another concurrent transaction that has not yet committed. Postgres prevents this in all isolation levels.
5. E
Explicit Locking
Manually acquiring locks on database objects (e.g., LOCK TABLE, SELECT FOR UPDATE) to prevent concurrent access, as opposed to relying solely on MVCC.
6. G
GIN
Generalized Inverted Index. Designed for handling composite values like arrays and JSONB. Optimized for “contains” queries (e.g., finding documents where a key equals a value).
GiST
Generalized Search Tree. A framework for building balanced tree structures for arbitrary data types. Commonly used for geometric data (PostGIS) and full-text search.
7. H
Heap
The main storage area where Postgres table data (rows/tuples) is stored. Indexes contain pointers (TIDs) to locations in the heap.
Heap Fetch
The operation of visiting the heap to retrieve row data or verify visibility. Frequent heap fetches can degrade the performance of index scans.
HOT Update
Heap-Only Tuple Update. An optimization where an update does not require modifying indexes if the new tuple fits on the same page and no indexed columns are modified.
Hypertable
A virtual table in TimescaleDB that looks like a regular table but is physically partitioned into many smaller tables (chunks) by time and space.
8. I
Index-Only Scan
A scan method where Postgres retrieves all necessary data directly from the index without visiting the table heap. Requires a Covering Index and a clean Visibility Map.
Isolation Level
Controls how transaction integrity is visible to other users and systems. Postgres supports Read Committed (default), Repeatable Read, and Serializable.
9. J
JSONB
Binary JSON. A decomposed binary format for storing JSON data. It supports indexing and is faster to process than plain JSON.
10. L
Logical Replication
A method of replicating data objects and their changes based upon their replication identity (usually a primary key). It allows fine-grained control over data replication and security.
11. M
MVCC
Multi-Version Concurrency Control. A method used by Postgres to handle concurrent access to data without locking the database for readers. Each transaction sees a snapshot of data.
12. N
Non-Repeatable Read
An anomaly where a transaction reads the same row twice and gets different data because another transaction modified it in between.
13. P
Partial Index
An index built over a subset of a table, defined by a conditional expression (e.g., WHERE status = 'active'). Reduces index size and maintenance cost.
Patroni
A template for PostgreSQL High Availability. It manages a cluster of Postgres nodes using a distributed consensus store (like Etcd) to handle automatic failover and leader election.
Phantom Read
An anomaly where a transaction executes a query returning a set of rows twice and gets different sets of rows because another transaction inserted new rows in between.
Physical Replication
A method of replication where the exact block-level changes (WAL) are shipped from the primary to the standby. It creates an exact copy of the entire database cluster.
Publisher
In Logical Replication, the node that defines a set of tables (publication) to be replicated to other nodes.
14. Q
Quorum
The minimum number of members of a distributed system that must agree for a decision (like leader election) to be valid. Prevents split-brain scenarios.
15. R
Recursive CTE
A CTE that references itself. Useful for querying hierarchical data like organizational charts or graph structures.
«««< HEAD
Replication Slot
A mechanism on the primary server that ensures WAL segments are not removed until they have been successfully received by all connected standbys.
16. S
Split-Brain
A failure scenario in a clustered system where network partitioning leads to two or more nodes believing they are the primary, potentially causing data corruption.
SSI
Serializable Snapshot Isolation. A technique used by Postgres to implement the Serializable isolation level, detecting serialization anomalies by tracking read/write dependencies without locking everything.
Subscriber
In Logical Replication, the node that receives data from a publisher and applies the changes to its local tables. =======
Reference Table
A small table in Citus that is replicated to all worker nodes. Allows local joins between large distributed tables and small lookup tables.
17. S
Shard
A horizontal partition of data in a distributed database. In Citus, a shard is a standard Postgres table on a worker node holding a subset of rows.
Spatial Index
An index optimized for spatial queries (e.g., “Find points within this box”). PostGIS typically uses GiST indexes implementing R-Trees.
origin/feat-postgres-extensions-1314213636004454745
18. T
TID
Tuple Identifier. A pointer consisting of a block number and an offset, used by indexes to locate a specific row version in the heap.
TOAST
The Oversized-Attribute Storage Technique. A mechanism for storing large column values (like long text or JSONB) separately from the main table row to keep pages compact.
Tuple
The physical representation of a row version in a PostgreSQL page. Due to MVCC, multiple tuples may exist for the same logical row.
19. V
VACUUM
A maintenance process that reclaims storage occupied by dead tuples and updates the Visibility Map.
Visibility Map
A bitmap that tracks which pages in a table contain only tuples that are visible to all active transactions. Crucial for Index-Only Scans.
20. W
WAL
Write-Ahead Log. A log of changes ensuring data integrity. Changes are written to the WAL before being applied to data files.
WAL Receiver
A background process on the Standby server that receives WAL data from the Primary and writes it to the local WAL files.
WAL Sender
A background process on the Primary server that sends WAL data to Standby servers or logical decoding plugins.
Window Function
A function that performs a calculation across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions, window functions do not cause rows to become grouped into a single output row.
«««< HEAD
Write Skew
A serialization anomaly where two transactions read overlapping data but update disjoint data based on what they read, violating consistency constraints. Only prevented by Serializable isolation.
21. X
xmax
A hidden system column in a tuple header that stores the Transaction ID (XID) of the transaction that deleted (or updated) the row version.
xmin
A hidden system column in a tuple header that stores the Transaction ID (XID) of the transaction that created (inserted) the row version. =======
Worker Node
In a Citus cluster, the nodes that store the actual data (shards) and execute the query fragments sent by the coordinator.
origin/feat-postgres-extensions-1314213636004454745