Welcome to the Apache Kafka Glossary. Here you will find definitions for common abbreviations and technical terms used throughout the course.

Example Usage: Hover over ISR to see the definition.

Core Concepts

| Term | Full Name | Definition |
| --- | --- | --- |
| Topic | Topic | A named, logical stream of events (e.g., “orders”, “clicks”). Topics are the primary way of organizing messages in Kafka. |
| Partition | Partition | The unit of parallelism and scalability in Kafka. A topic is split into one or more partitions, which can be distributed across different brokers. Ordering is guaranteed only within a partition. |
| Offset | Offset | A sequential integer ID assigned to every message within a partition, representing its position in the log. Offsets are unique only within their partition. |
| Broker | Kafka Broker | A single Kafka server. Brokers receive messages from producers, store them on disk, and serve them to consumers. |
| Producer | Producer | A client application that publishes (writes) events to Kafka topics. |
| Consumer | Consumer | A client application that subscribes to (reads) events from Kafka topics. |
| Consumer Group | Consumer Group | A group of consumers that work together to consume a topic. Each partition in the topic is assigned to exactly one consumer in the group at a time, so consumers beyond the partition count sit idle. |
| ZooKeeper | Apache ZooKeeper | A centralized service for maintaining configuration information, naming, distributed synchronization, and group services. Older Kafka versions use it for metadata management. |
| KRaft | Kafka Raft Metadata Mode | Kafka’s built-in Raft-based consensus protocol, which replaces ZooKeeper by managing metadata and controller election inside Kafka itself. |
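
The relationship between keys, partitions, offsets, and consumer groups can be sketched with a small in-memory model. This is an illustration only, not Kafka client code: `ToyTopic` and `assign_partitions` are hypothetical names, CRC32 stands in for whatever hash the real partitioner uses, and the round-robin assignment is just one possible strategy.

```python
import zlib


class ToyTopic:
    """In-memory stand-in for a topic: one append-only list per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 is a stand-in hash, chosen only because it is
        # deterministic and in the standard library.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


def assign_partitions(num_partitions, consumers):
    """Give each partition to exactly one consumer of the group (round-robin)."""
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for partition in range(num_partitions):
        assignment[ordered[partition % len(ordered)]].append(partition)
    return assignment


topic = ToyTopic("orders", num_partitions=3)
first = topic.append("customer-42", "created")
second = topic.append("customer-42", "paid")

# Two consumers share three partitions; no partition has two owners.
assignment = assign_partitions(3, ["consumer-a", "consumer-b"])
```

Because both records share a key, they land in the same partition with consecutive offsets, which is what makes per-key ordering possible.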

Replication & Consistency

| Term | Full Name | Definition |
| --- | --- | --- |
| Leader | Partition Leader | The replica that handles all writes and, by default, all reads for a partition. |
| Follower | Partition Follower | A replica that passively replicates the log from the leader. Followers exist for fault tolerance. |
| ISR | In-Sync Replicas | The set of replicas (including the leader) that have kept up with the leader within the allowed lag (replica.lag.time.max.ms). Only members of the ISR are eligible to become the new leader if the current leader fails, unless unclean leader election is enabled. |
| High Watermark | High Watermark | The offset up to which the log has been replicated to all ISR members, i.e., the smallest LEO across the ISR. Messages below this offset are considered “committed” and visible to consumers. |
| LEO | Log End Offset | The offset that the next message appended to a replica’s log will receive, i.e., one past the last appended message (regardless of replication status). Each replica tracks its own LEO. |
| acks | Acknowledgments | A producer configuration that determines how many replicas must acknowledge a write before it is considered successful: acks=0 waits for none, acks=1 waits for the leader only, and acks=all waits for every ISR member. |
| min.insync.replicas | Minimum In-Sync Replicas | A configuration ensuring that a write sent with acks=all is only accepted if the ISR contains at least N replicas (including the leader). Otherwise the broker rejects the write. |
| Unclean Leader Election | Unclean Leader Election | A configuration (unclean.leader.election.enable) allowing a non-ISR replica to become leader, potentially causing data loss but preserving availability. |
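
The interplay of LEO, ISR, High Watermark, acks, and min.insync.replicas can be made concrete with a few lines of arithmetic. The sketch below is a simplified model with hypothetical function names and made-up broker state; it is not how the broker is implemented.

```python
def high_watermark(leo_by_replica, isr):
    """Smallest log end offset among the in-sync replicas."""
    return min(leo_by_replica[replica] for replica in isr)


def can_accept_write(acks, isr, min_insync_replicas):
    """Only acks='all' makes the ISR size matter for accepting a write."""
    if acks != "all":
        return True
    return len(isr) >= min_insync_replicas


# Hypothetical state: broker 1 leads, broker 3 has dropped out of the ISR.
leo = {1: 10, 2: 8, 3: 4}
isr = {1, 2}

hw = high_watermark(leo, isr)            # 8: the slower ISR member sets the pace
visible_offsets = list(range(hw))        # consumers can read offsets 0..7

ok_with_two = can_accept_write("all", isr, min_insync_replicas=2)      # True
rejected = not can_accept_write("all", {1}, min_insync_replicas=2)     # True
leader_only = can_accept_write("1", {1}, min_insync_replicas=2)        # True
```

Note that the lagging broker 3 does not hold the High Watermark back, because it is no longer in the ISR.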

Storage & Internals

| Term | Full Name | Definition |
| --- | --- | --- |
| Segment | Log Segment | A physical file on disk that stores a portion of a partition’s data. The file is named after the offset of its first message, zero-padded to 20 digits (e.g., 00000000000000000000.log). Partitions are split into segments for easier management and deletion. |
| Index | Offset Index | A file (.index) that maps a sparse subset of offsets to physical file positions in the log segment, enabling fast lookups. |
| TimeIndex | Timestamp Index | A file (.timeindex) that maps timestamps to offsets, allowing lookups by time. |
| Log Compaction | Log Compaction | A cleanup policy (cleanup.policy=compact) where Kafka retains at least the last known value for each message key, rather than deleting old messages based on time. |
| Zero Copy | Zero Copy | A technique used by Kafka to send data from the page cache directly to the network socket without copying it to application memory, maximizing throughput. |
| Page Cache | Page Cache | The operating system’s main memory cache used to store file data. Kafka relies heavily on the page cache for performance. |
| Sticky Partitioner | Sticky Partitioner | A producer strategy for messages without a key: the producer keeps sending to one partition until the current batch is sent, then switches to another. This produces fuller batches, which reduces latency and load compared with spreading each message round-robin. |
| Rebalancing | Group Rebalancing | The process where a Consumer Group redistributes partitions among its members (e.g., when a consumer joins or leaves). |
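
Segment naming, sparse index lookups, and log compaction can each be sketched in a few lines. The helpers below are hypothetical and simplified: the index here holds absolute offsets in a plain list rather than Kafka’s on-disk format, and the compaction function ignores details such as tombstones and the uncompacted log head.

```python
from bisect import bisect_right


def segment_file_name(base_offset):
    """Name a segment after its first offset, zero-padded to 20 digits."""
    return f"{base_offset:020d}.log"


def lookup_position(index_entries, target_offset):
    """Return the byte position to start scanning from.

    index_entries is a sorted list of (offset, byte_position) pairs.
    The index is sparse, so we pick the last entry at or below the target
    and let the caller scan forward from there.
    """
    offsets = [offset for offset, _ in index_entries]
    i = bisect_right(offsets, target_offset) - 1
    if i < 0:
        return 0  # target precedes every entry: scan from the segment start
    return index_entries[i][1]


def compact(records):
    """Keep only the latest (offset, key, value) per key, in offset order."""
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)
    return sorted(latest.values())


name = segment_file_name(368769)
index = [(0, 0), (50, 4096), (100, 8192)]
start = lookup_position(index, 75)        # 4096: scan forward from offset 50
compacted = compact([(0, "a", "1"), (1, "b", "2"), (2, "a", "3")])
```

After compaction the surviving records keep their original offsets, so the log may contain gaps but its order is unchanged.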