Garbage Collection
The Garbage Collector (GC) is the unsung hero of the Java ecosystem. Unlike C++, where a forgotten delete causes a memory leak and a double free crashes the program, the JVM handles memory lifecycle automatically.
However, this convenience comes at a cost: latency.
1. The Generational Hypothesis
Most Garbage Collectors are built on a simple observation about object lifecycles, known as the Weak Generational Hypothesis:
- Most objects die young: Temporary variables, iterators, and DTOs usually become unreachable within milliseconds.
- Few references exist from old to young: Older objects (caches, singletons) rarely point to newly created objects.
The Heap Structure
Based on this, the Heap is divided into:
- Young Generation (Eden + Survivors):
  - New objects are allocated in Eden.
  - When Eden fills up, a Minor GC runs. It is fast because its cost scales with the number of live objects, and most objects in Eden are already dead.
  - Objects that are still live are copied to a Survivor Space (S0 or S1).
- Old Generation (Tenured):
  - Objects that survive enough Minor GCs (up to 15 by default, controlled by -XX:MaxTenuringThreshold) are promoted here.
  - When Old Gen fills up, a Major GC (or Full GC) runs. This is slow and involves the entire heap.
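To see this layout on a running JVM, the standard java.lang.management API can list the heap memory pools. The following is a minimal sketch; the class name HeapPools is made up for illustration, and the pool names it prints depend on the collector in use (for example, G1 reports names such as "G1 Eden Space", while ZGC does not use Eden or Survivor pools at all).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Returns the names of the heap memory pools exposed by the running JVM. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip Metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            long usedKb = (usage == null) ? 0 : usage.getUsed() / 1024;
            System.out.printf("%-28s used=%d KB%n", pool.getName(), usedKb);
        }
    }
}
```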
2. Stop-The-World (STW)
To move an object safely (defragmentation), the GC must ensure no thread is accessing it. To do this, it pauses all application threads.
[!WARNING] The Latency Killer: In legacy collectors (Parallel/CMS), a Full GC on a large heap (e.g., 64GB) could pause the application for seconds. This is unacceptable for modern microservices.
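A rough way to see how much time the JVM has spent collecting is to read the GarbageCollectorMXBean counters. This is only a sketch: the class name GcStats is hypothetical, and the reported time is an approximate accumulated figure per collector that, for concurrent collectors, may include work done while the application keeps running. For exact pause times, GC logging (for example -Xlog:gc on JDK 9 and later) is the more reliable source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    /** Sums the accumulated collection time (in ms) reported by all collector beans. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Bean names are collector-specific, e.g. "G1 Young Generation".
            System.out.printf("%-25s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total GC time: " + totalGcTimeMillis() + " ms");
    }
}
```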
3. Modern Collectors: The Big Three
G1 (Garbage First) - The Default
Introduced in Java 7, default since Java 9.
- Mechanism: Divides the heap into equal-sized regions (G1 aims for roughly 2048 of them, each between 1 MB and 32 MB). Some are Young, some are Old.
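- Goal: Tries to meet a user-defined pause time target (e.g., -XX:MaxGCPauseMillis=200). The target is a soft goal, not a guarantee.
- Pros: Balanced throughput and latency. Full GCs are rare, although they remain a fallback when G1 cannot reclaim space fast enough.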
- Cons: Still has STW pauses for evacuation.
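The relationship between heap size and region count can be approximated with a little arithmetic. The sketch below assumes a target of about 2048 regions and power-of-two region sizes clamped to the range 1 MB to 32 MB; the class and method names are illustrative, and HotSpot's real ergonomics differ in detail between JDK releases (the size can also be set explicitly with -XX:G1HeapRegionSize).

```java
class G1RegionSketch {
    static final long MB = 1024L * 1024L;
    static final int TARGET_REGIONS = 2048;   // assumed ergonomic target
    static final long MIN_REGION = 1 * MB;    // assumed lower bound
    static final long MAX_REGION = 32 * MB;   // assumed upper bound

    /** Approximate region size: heap / 2048, rounded down to a power of two, then clamped. */
    static long regionSizeBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION);
    }

    /** Number of regions that fit in the heap at that region size. */
    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 64, 128};
        for (long gb : heapsInGb) {
            long heap = gb * 1024 * MB;
            System.out.printf("%d GB heap: region=%d MB, regions=%d%n",
                    gb, regionSizeBytes(heap) / MB, regionCount(heap));
        }
    }
}
```

Under these assumptions, heaps larger than 64 GB hit the 32 MB cap, which is why the region count can exceed 2048.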
ZGC (The Scalable Low-Latency Collector)
Experimental since JDK 11, production-ready since JDK 15, Generational since JDK 21.
- Goal: Sub-millisecond pause times that do not grow with heap size (8 MB to 16 TB).
- Magic: Uses Colored Pointers and Load Barriers.
  - Colored Pointers: Metadata is stored in otherwise unused bits of the 64-bit reference.
  - Load Barrier: A small piece of code the JIT injects wherever a reference is loaded from the heap. It checks the pointer's metadata bits and, if the object has moved, updates the reference in place (“Self-Healing”).
- Pros: Near-zero latency. Scalable.
- Cons: Slightly lower throughput (CPU overhead) than G1 due to barriers.
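The interplay of colored pointers and load barriers can be illustrated with a toy model. This is not ZGC's actual bit layout or barrier code: the 44-bit address mask, the single "remapped" bit, the forwarding map, and all names below are assumptions chosen to keep the sketch small.

```java
import java.util.HashMap;
import java.util.Map;

class ColoredPointerSketch {
    // Hypothetical layout: low 44 bits hold the address, one bit above marks "already remapped".
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long REMAPPED_BIT = 1L << 44;

    // Stand-in for the collector's forwarding table: old address -> new address.
    static final Map<Long, Long> forwarding = new HashMap<>();

    static long address(long ref) {
        return ref & ADDRESS_MASK;
    }

    static boolean isRemapped(long ref) {
        return (ref & REMAPPED_BIT) != 0;
    }

    /** Toy load barrier: returns a good reference and repairs the slot it was loaded from. */
    static long loadBarrier(long[] slots, int index) {
        long ref = slots[index];
        if (isRemapped(ref)) {
            return ref; // fast path: the color says the pointer is already valid
        }
        long oldAddress = address(ref);
        long newAddress = forwarding.getOrDefault(oldAddress, oldAddress);
        long healed = newAddress | REMAPPED_BIT;
        slots[index] = healed; // self-healing: later loads of this slot take the fast path
        return healed;
    }

    public static void main(String[] args) {
        long[] slots = {0x1000L};           // a field holding a stale, uncolored reference
        forwarding.put(0x1000L, 0x2000L);   // the collector moved the object
        long ref = loadBarrier(slots, 0);
        System.out.printf("address after barrier: 0x%x%n", address(ref));
    }
}
```

The point of the design is that the slow path runs at most once per stale reference; after healing, the check is a single bit test.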
Shenandoah
Developed by Red Hat. Similar goals to ZGC.
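- Magic: Originally used Brooks Pointers (an extra forwarding word stored with every object). Since JDK 13, the forwarding pointer lives in the object header's mark word instead.
- Pros: Concurrent compaction.
- Cons: The original design paid one extra word of memory per object. Current versions avoid that cost, but the barriers still reduce throughput somewhat.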
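The forwarding-pointer idea itself is simple enough to sketch. The toy model below is an assumption-laden illustration, not Shenandoah's implementation: a plain field stands in for the forwarding word, and the real collector installs the new location atomically while application threads are running.

```java
class ForwardingPointerSketch {
    static final class Obj {
        Obj forwardee = this; // initially every object forwards to itself
        int payload;

        Obj(int payload) {
            this.payload = payload;
        }
    }

    /** Every access first follows the forwarding pointer to the current copy. */
    static Obj resolve(Obj o) {
        return o.forwardee;
    }

    /** Copies the object and redirects the old copy to the new one. */
    static Obj evacuate(Obj o) {
        Obj copy = new Obj(o.payload);
        o.forwardee = copy; // a real collector would use an atomic compare-and-swap here
        return copy;
    }

    public static void main(String[] args) {
        Obj original = new Obj(7);
        Obj stale = original;           // some other reference to the same object
        Obj copy = evacuate(original);  // the collector moves the object
        resolve(stale).payload = 9;     // the write lands on the new copy
        System.out.println(copy.payload);
    }
}
```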
4. Interactive: Generational GC Simulator
Watch objects be born in Eden, survive collections, and eventually retire to the Old Generation.
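The same lifecycle can be reproduced in a few lines of code. The following is a toy model with made-up names (GenerationalSim, Obj), not a description of HotSpot internals: a boolean flag stands in for reachability, and the tenuring threshold is lowered to 3 so that promotion happens quickly.

```java
import java.util.ArrayList;
import java.util.List;

class GenerationalSim {
    static final class Obj {
        int age;             // number of Minor GCs survived so far
        boolean live = true; // stands in for "still reachable"
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    GenerationalSim(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in Eden. */
    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    /** Minor GC: keep live young objects, drop dead ones, promote those that are old enough. */
    void minorGc() {
        List<Obj> young = new ArrayList<>(eden);
        young.addAll(survivor);
        List<Obj> nextSurvivor = new ArrayList<>();
        for (Obj o : young) {
            if (!o.live) {
                continue;            // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);          // promotion to the Old Generation
            } else {
                nextSurvivor.add(o); // copied into the other Survivor Space
            }
        }
        eden.clear();
        survivor = nextSurvivor;
    }

    public static void main(String[] args) {
        GenerationalSim heap = new GenerationalSim(3); // small threshold for a short demo
        Obj cache = heap.allocate();                   // one long-lived object
        for (int cycle = 1; cycle <= 3; cycle++) {
            for (int i = 0; i < 1000; i++) {
                heap.allocate().live = false;          // temporaries die immediately
            }
            heap.minorGc();
            System.out.printf("cycle %d: survivor=%d old=%d%n",
                    cycle, heap.survivor.size(), heap.old.size());
        }
    }
}
```

Of the 3001 objects allocated in the demo, only the single long-lived one is ever copied, and it reaches the Old Generation on the third cycle.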
5. Tuning Flags Cheat Sheet
Don’t blindly copy flags from StackOverflow. Understand them.
| Flag | Purpose | Best For |
|---|---|---|
| -XX:+UseG1GC | Enable the G1 collector | General purpose (default). |
| -XX:+UseZGC | Enable ZGC | Low-latency requirements. |
| -XX:+UseZGC -XX:+ZGenerational | Enable Generational ZGC | High throughput + low latency (JDK 21 and 22; generational mode is the default from JDK 23). |
| -Xmx4g | Maximum heap size | Capping memory use; size it above the live set to avoid OutOfMemoryError. |
| -Xms4g | Initial heap size | Set equal to -Xmx to avoid resizing overhead. |
| -XX:MaxGCPauseMillis=200 | Target pause time | Tuning G1 aggressiveness. |
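Before changing any flag, it helps to confirm what the JVM is actually running with. This small sketch uses the standard RuntimeMXBean and Runtime APIs; the class name JvmFlags is made up for illustration. Note that getInputArguments only lists flags passed explicitly, not defaults chosen by the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmFlags {
    /** Flags passed to this JVM on the command line (excluding main-method arguments). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum heap the JVM will attempt to use, in bytes (reflects -Xmx when set). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM flags: " + inputArguments());
        System.out.println("Max heap:  " + maxHeapBytes() / (1024 * 1024) + " MB");
        // The collector bean names reveal which GC is active, e.g. "G1 Young Generation".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Collector: " + gc.getName());
        }
    }
}
```

Running it once with no flags and once with, say, -XX:+UseZGC -Xmx4g shows the difference between the defaults and an explicit configuration.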
6. Summary
- Generational Hypothesis: Most objects die young, so we optimize for that (Minor GC).
- G1 is the safe default for balanced performance.
- ZGC is the modern choice for strict SLA applications where pauses > 1ms are unacceptable.