Garbage Collection
The Garbage Collector (GC) is the unsung hero of the Java ecosystem. Unlike C++, where a forgotten delete causes a memory leak and a double free crashes the program, the JVM handles memory lifecycle automatically.
However, this convenience comes at a cost: latency.
1. The Generational Hypothesis
Most Garbage Collectors are built on a simple observation about object lifecycles, known as the Weak Generational Hypothesis:
- Most objects die young: Temporary variables, iterators, and DTOs usually become unreachable within milliseconds.
- Few references exist from old to young: Older objects (caches, singletons) rarely point to newly created objects.
The Heap Structure
[!NOTE] Analogy: The Busy Restaurant. Think of JVM memory like a bustling restaurant.
- Young Generation (Eden) is the dining area. Customers (temporary variables, loop iterators, DTOs) come in, eat quickly, and leave. The tables are cleared (Minor GC) very frequently.
- Old Generation (Tenured) is the kitchen equipment. The stoves and refrigerators (long-lived objects like singletons, caches, connection pools) stay there for weeks. You wouldn’t want to shut down the entire restaurant to inventory the kitchen equipment every 5 minutes. That’s why the JVM separates them, sweeping the dining area constantly but rarely pausing to clean the entire kitchen.
Based on this, the Heap is divided into:
- Young Generation (Eden + Survivors):
  - New objects are allocated in Eden.
  - When Eden fills up, a Minor GC runs. It is fast because its cost scales with the number of live objects, and most young objects are already dead.
  - Objects that survive are copied to the Survivor Spaces (S0/S1).
- Old Generation (Tenured):
  - Objects that survive enough Minor GCs (typically 15, controlled by -XX:MaxTenuringThreshold) are promoted here.
  - When Old Gen fills up, a Major GC (or Full GC) runs. This is slow because it involves the entire heap.
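To see this layout on a running JVM, the standard java.lang.management API exposes each heap region as a memory pool. The sketch below lists the heap pools; the class and method names (HeapPools, describeHeapPools) are made up for illustration, and the pool names you see depend on the collector in use (under G1 they are typically "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Returns one line per heap memory pool: name, used bytes, max bytes. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip Metaspace, code cache and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the pool has no defined maximum
            lines.add(String.format("%-25s used=%d max=%d",
                    pool.getName(), usage.getUsed(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeHeapPools().forEach(System.out::println);
    }
}
```

Running it with different collectors (for example with -XX:+UseG1GC and then -XX:+UseZGC) is a quick way to check which generations a given collector actually maintains.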
2. The Mark-Sweep-Compact Algorithm
Before diving into specific collectors, you need to understand the fundamental three-step process most GCs use:
- Mark: The GC starts from “GC Roots” (active threads, static variables, local variables) and traverses the object graph. Every object it can reach is marked as “alive”.
- Sweep: The GC scans the heap memory. Any memory occupied by unmarked (dead) objects is reclaimed and added to a free list.
- Compact (Optional but common): Sweeping leaves memory fragmented (like a fragmented hard drive). Compaction moves all living objects to one end of the heap, so free space is contiguous and new allocations are fast.
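The three steps can be sketched as a toy simulation. This is not how HotSpot implements any collector: the heap here is just an array of slots, and every name (ToyMarkSweepCompact, Obj, mark, sweep, compact) is invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ToyMarkSweepCompact {
    /** A toy heap object: a name, outgoing references and a mark bit. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Mark: flag every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue;   // already visited
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    /** Sweep: clear every slot whose object was not marked; returns slots freed. */
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    /** Compact: slide survivors to the front; returns the index of the first free slot. */
    static int compact(Obj[] heap) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) continue;
            heap[i] = null;
            o.marked = false;         // reset the mark bit for the next cycle
            heap[next++] = o;
        }
        return next;
    }

    public static void main(String[] args) {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"), d = new Obj("D");
        a.refs.add(c);                // A -> C, reachable from the root
        b.refs.add(d);                // B -> D, but nothing points to B
        Obj[] heap = {a, b, c, d, null, null};

        mark(List.of(a));             // A is the only GC root
        int freed = sweep(heap);
        int firstFree = compact(heap);
        System.out.println("freed=" + freed + " firstFree=" + firstFree);
    }
}
```

In the demo, B and D are garbage even though B still references D, because reachability is judged from the roots only. After compaction A and C occupy slots 0 and 1, and everything from slot 2 onward is one contiguous free block.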
3. Stop-The-World (STW)
To move an object safely (defragmentation), the GC must ensure no application thread reads or writes it mid-move. The simplest way to guarantee this is to pause all application threads, which is known as a Stop-The-World pause.
[!WARNING] The Latency Killer: In legacy collectors (Parallel/CMS), a Full GC on a large heap (e.g., 64GB) could pause the application for seconds. This is unacceptable for modern microservices.
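A rough way to see how much collection work your own application triggers is to read the standard GarbageCollectorMXBean counters before and after a workload. The sketch below is an assumption-laden illustration: GcStats and churn are made-up names, and getCollectionTime() reports approximate accumulated collection time, which for concurrent collectors is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    /** Total collections reported by all collector beans (a bean may report -1 for "undefined"). */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collector beans. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionTime());
        }
        return total;
    }

    /** Allocates short-lived arrays; returns a checksum so the JIT cannot drop the work. */
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] junk = new byte[16 * 1024];
            junk[i % junk.length] = 1;
            checksum += junk[i % junk.length];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        long checksum = churn(500_000);
        System.out.println("checksum=" + checksum);
        System.out.println("collections during churn: " + (totalCollections() - countBefore));
        System.out.println("collection time (ms):     " + (totalCollectionMillis() - millisBefore));
    }
}
```

For actual pause durations, GC logging (-Xlog:gc on JDK 9 and later) is the more reliable source; the counters above are mainly useful for spotting trends.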
4. Modern Collectors: The Big Three
G1 (Garbage First) - The Default
Introduced in Java 7, default since Java 9.
- Mechanism: Divides the heap into equally sized regions (G1 aims for roughly 2048 of them). Some are Young, some are Old.
- Goal: Tries to meet a user-defined pause time target (e.g., -XX:MaxGCPauseMillis=200). The target is a goal, not a guarantee.
- Pros: Balanced throughput and latency. No long Full GC pauses (mostly).
- Cons: Still has STW pauses for evacuation.
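To get a feel for how the region count relates to heap size, here is a simplified approximation of how a region size could be derived. It assumes a target of about 2048 regions, power-of-two region sizes, and a 1MB to 32MB range; the exact ergonomics differ between JDK versions, so treat the constants and the method name approximateRegionSize as illustrative rather than as the JVM's actual code.

```java
class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long TARGET_REGION_COUNT = 2048; // assumed target, see the lead-in
    static final long MIN_REGION = 1 * MB;        // assumed lower bound
    static final long MAX_REGION = 32 * MB;       // assumed upper bound

    /** Approximates a region size: heap / 2048, rounded up to a power of two, then clamped. */
    static long approximateRegionSize(long maxHeapBytes) {
        long raw = Math.max(maxHeapBytes / TARGET_REGION_COUNT, MIN_REGION);
        long pow2 = Long.highestOneBit(raw);
        if (pow2 < raw) {
            pow2 <<= 1; // round up to the next power of two
        }
        return Math.min(Math.max(pow2, MIN_REGION), MAX_REGION);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 6, 64};
        for (long gb : heapsInGb) {
            long region = approximateRegionSize(gb * 1024 * MB);
            System.out.println(gb + "GB heap -> ~" + (region / MB) + "MB regions");
        }
    }
}
```

Under these assumptions a 4GB heap gets 2MB regions, while a 64GB heap hits the 32MB cap and still ends up with 2048 regions. On a real JVM, check the GC log or set -XX:G1HeapRegionSize explicitly instead of relying on this arithmetic.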
ZGC (The Scalable Low-Latency Collector)
Production-ready since JDK 15 (experimental from JDK 11), Generational since JDK 21.
- Goal: Sub-millisecond pause times that do not grow with heap size (supported heaps range from 8MB to 16TB).
- Magic: Uses Colored Pointers and Load Barriers.
  - Colored Pointers: Metadata is stored in unused bits of the 64-bit reference address.
  - Load Barrier: A tiny code snippet injected wherever the application loads an object reference from the heap. It checks whether the referenced object has moved and, if so, updates the reference in place (“Self-Healing”).
- Pros: Near-zero latency. Scalable.
- Cons: Slightly lower throughput (CPU overhead) than G1 due to barriers.
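The interplay of colored pointers and load barriers is easier to follow in a toy model. The sketch below is not ZGC's real bit layout or barrier code: the 44-bit address mask, the single REMAPPED_BIT, the Slot class and the forwarding map are all hypothetical stand-ins chosen to show the idea.

```java
import java.util.HashMap;
import java.util.Map;

class ToyColoredPointers {
    // Hypothetical layout: low 44 bits hold the address, one metadata bit sits above them.
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long REMAPPED_BIT = 1L << 44;

    /** Filled in by the (imaginary) relocation phase: old address -> new address. */
    final Map<Long, Long> forwarding = new HashMap<>();

    /** Stands in for a field in the heap that holds a colored reference. */
    static final class Slot {
        long ref;
        Slot(long ref) { this.ref = ref; }
    }

    /** Load barrier: runs every time a reference is loaded from a slot. */
    long load(Slot slot) {
        long ref = slot.ref;
        if ((ref & REMAPPED_BIT) != 0) {
            return ref & ADDRESS_MASK;            // fast path: the color says "up to date"
        }
        long oldAddress = ref & ADDRESS_MASK;
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        slot.ref = newAddress | REMAPPED_BIT;     // self-heal: fix the slot for next time
        return newAddress;
    }

    public static void main(String[] args) {
        ToyColoredPointers heap = new ToyColoredPointers();
        heap.forwarding.put(0x1000L, 0x2000L);    // the object at 0x1000 moved to 0x2000
        Slot field = new Slot(0x1000L);           // stale, uncolored reference

        System.out.println(Long.toHexString(heap.load(field))); // slow path, heals the slot
        System.out.println(Long.toHexString(heap.load(field))); // fast path
    }
}
```

Both loads return the new address, but only the first one consults the forwarding table. That is the point of self-healing: each stale reference pays the slow path at most once.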
Shenandoah
Developed by Red Hat. Similar goals to ZGC.
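- Magic: Originally used Brooks Pointers (an extra forwarding-pointer word stored with every object). Since JDK 13, the forwarding pointer is kept in the existing object header instead.
- Pros: Concurrent compaction.
- Cons: The original Brooks-pointer design cost one extra word of memory per object; newer versions avoid this overhead.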
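The Brooks-pointer idea can be sketched in a few lines: every object carries a forwarding reference that initially points to itself, and every access goes through it. This is a conceptual model only, with invented names (ToyBrooksPointer, forwardee, resolve, evacuate); it says nothing about how Shenandoah lays out objects in memory.

```java
class ToyBrooksPointer {
    /** A toy object with a forwarding reference that starts out pointing to itself. */
    static final class Obj {
        Obj forwardee = this;
        int value;
        Obj(int value) { this.value = value; }
    }

    /** Every access first follows the forwarding reference. */
    static Obj resolve(Obj o) {
        return o.forwardee;
    }

    /** Evacuate: copy the object and redirect the old copy to the new one. */
    static Obj evacuate(Obj old) {
        if (old.forwardee != old) {
            return old.forwardee;     // already moved, reuse the existing copy
        }
        Obj copy = new Obj(old.value);
        old.forwardee = copy;
        return copy;
    }

    static int read(Obj o) { return resolve(o).value; }

    static void write(Obj o, int v) { resolve(o).value = v; }

    public static void main(String[] args) {
        Obj stale = new Obj(1);       // the application keeps this reference
        evacuate(stale);              // the GC moves the object behind its back
        write(stale, 42);             // the write lands in the new copy
        System.out.println(read(stale));
        System.out.println(stale.value);
    }
}
```

The application can keep using its old reference while the object is moved, because reads and writes are redirected to the new copy. The price is the extra reference per object plus one indirection on every access.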
5. Interactive: Generational GC Simulator
Watch objects be born in Eden, survive collections, and eventually retire to the Old Generation.
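If you would rather trace the same lifecycle in code, the following toy model follows objects from Eden through the survivor space into the Old Generation. It is a deliberately simplified sketch: reachability is a plain boolean instead of a graph walk, the two survivor spaces are merged into one list, and all names (ToyGenerationalHeap, allocate, minorGc) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ToyGenerationalHeap {
    /** A toy object that tracks how many Minor GCs it has survived. */
    static final class Obj {
        final String name;
        int age;
        boolean reachable = true;     // flipped to false when the program drops it
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    final List<Obj> survivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    ToyGenerationalHeap(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects are always born in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Minor GC: drop dead young objects, age the rest, promote the ones that are old enough. */
    void minorGc() {
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(survivor);
        eden.clear();
        survivor.clear();
        for (Obj o : young) {
            if (!o.reachable) continue;           // dead: simply not copied anywhere
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);                       // promoted to the Old Generation
            } else {
                survivor.add(o);
            }
        }
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(3); // small threshold for the demo
        heap.allocate("cache");                                // stays reachable
        for (int i = 1; i <= 3; i++) {
            heap.allocate("temp-" + i).reachable = false;      // dies immediately
            heap.minorGc();
            System.out.println("after GC " + i + ": survivor=" + heap.survivor.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

With a threshold of 3, the cache object sits in the survivor list for two collections and is promoted on the third, while every temporary object disappears in the first collection after it is created.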
6. Tuning Flags Cheat Sheet
Don’t blindly copy flags from Stack Overflow. Understand them.
| Flag | Purpose | Best For |
|---|---|---|
| -XX:+UseG1GC | Enable G1 Collector | General purpose (Default). |
| -XX:+UseZGC | Enable ZGC | Low latency requirements. |
| -XX:+UseZGC -XX:+ZGenerational | Enable Generational ZGC | High throughput + Low latency (JDK 21+). |
| -Xmx4g | Max Heap Size | Capping heap usage; too low a value causes OutOfMemoryError. |
| -Xms4g | Initial Heap Size | Set equal to -Xmx to avoid resizing overhead. |
| -XX:MaxGCPauseMillis=200 | Target Pause Time | Tuning G1 aggressiveness. |
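After changing flags, it is worth confirming what the JVM actually picked up. A minimal sketch using the standard management API follows; JvmFlags and its method names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmFlags {
    /** The options passed to this JVM on the command line (for example -Xmx4g). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** The maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments: " + inputArguments());
        System.out.println("Max heap (MB): " + maxHeapBytes() / (1024 * 1024));
        // The collector bean names hint at which GC is active.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
    }
}
```

Run it with the same flags as your service, for example java -XX:+UseG1GC -Xms4g -Xmx4g JvmFlags.java, and compare the output against what you intended to configure.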
7. Summary
- Generational Hypothesis: Most objects die young, so we optimize for that (Minor GC).
- G1 is the safe default for balanced performance.
- ZGC is the modern choice for strict SLA applications where pauses > 1ms are unacceptable.