Module Review: JVM Performance

[!NOTE] This module reviews the core principles of JVM performance: how the JIT compiler turns bytecode into fast native code, how modern garbage collectors keep pauses low, and how to observe all of it safely in production.

1. Key Takeaways

  • JIT is Lazy: Java starts interpreting code (slow) and compiles hot paths to native machine code (fast) using Tiered Compilation (C1 → C2).
  • Inlining is King: The JIT’s ability to inline methods enables almost all other optimizations (Dead Code Elimination, Escape Analysis).
  • ZGC is Scalable: Modern GCs like ZGC and Shenandoah use Concurrent Compaction to keep pause times under 1ms, regardless of heap size (up to 16TB).
  • Observe, Don’t Guess: Use JDK Flight Recorder (JFR) for always-on production monitoring (<1% overhead) and Async Profiler for accurate CPU flame graphs.
  • Deoptimization: The JVM can “un-optimize” code if runtime assumptions change (e.g., a new class is loaded), providing safety that static compilers lack.
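
The tiered pipeline described in the first takeaway can be observed directly. The sketch below (class and method names are illustrative) hammers a small method until the JIT promotes it; run it with `-XX:+PrintCompilation` and look for `sumTo` appearing first at tier 3 (C1) and later at tier 4 (C2).

```java
// A minimal sketch for watching Tiered Compilation in action.
// Run with: java -XX:+PrintCompilation HotLoop
public class HotLoop {
    // Small and hot: a prime candidate for C1, then C2, then inlining.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    public static void main(String[] args) {
        long acc = 0;
        // Enough invocations to cross the interpreter and C1 thresholds.
        for (int i = 0; i < 100_000; i++) {
            acc += sumTo(100);
        }
        System.out.println(acc);
    }
}
```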

2. Interactive Flashcards

Test your retention of the core concepts.

What is "Tiered Compilation"?

(Click to reveal)

The JVM's strategy of starting with the Interpreter (slow), then moving to C1 (fast compilation), and finally to C2 (highly optimized compilation) for hot methods.

What does "Inlining" do?

(Click to reveal)

It replaces a method call with the body of the called method. This eliminates call overhead and enables further optimizations like Dead Code Elimination.
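
A hedged illustration of the idea (names are made up for the example): the trivial getter below is exactly the kind of call the JIT folds into its caller, turning `acc += p.getX()` into plain arithmetic. With a debug-capable JVM you can confirm this via `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`.

```java
// Sketch: a tiny accessor the JIT will inline into the hot loop.
// Observe with: java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining Inline
public class Inline {
    static final class Point {
        private final int x;
        Point(int x) { this.x = x; }
        // Trivially small: after inlining, the call overhead disappears
        // and the JIT can see `x` is constant here.
        int getX() { return x; }
    }

    public static void main(String[] args) {
        Point p = new Point(7);
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) {
            acc += p.getX(); // effectively `acc += 7` once inlined
        }
        System.out.println(acc);
    }
}
```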

How does ZGC achieve <1ms pauses?

(Click to reveal)

By performing marking, relocation, and remapping concurrently with application threads, using Colored Pointers and Load Barriers to manage object references.
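
In practice, enabling this is a matter of launch flags. A sketch of a typical invocation (the jar name is a placeholder; `-XX:+ZGenerational` requires JDK 21+):

```shell
# Enable ZGC with generational mode and verbose GC logging.
java -XX:+UseZGC -XX:+ZGenerational \
     -Xlog:gc*:file=gc.log \
     -Xmx8G \
     -jar app.jar
```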

What is "Safepoint Bias" in profiling?

(Click to reveal)

When a profiler only samples threads at "safe points" (e.g., loop ends), potentially missing hot code inside long loops or native calls. Async Profiler solves this.
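
As a concrete (hedged) example of the fix, async-profiler samples via `perf_events` rather than waiting for safepoints. The exact launcher name depends on the version and install path (`asprof` in v3, `profiler.sh` in earlier releases):

```shell
# Profile CPU for 30 seconds and write an interactive flame graph.
# <pid> is the target JVM's process id.
./asprof -d 30 -e cpu -f flame.html <pid>
```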

What is "Deoptimization"?

(Click to reveal)

The process where the JVM discards compiled code and reverts to the interpreter when a speculative assumption (e.g., "class X has no subclasses") is proven wrong.
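
This can be provoked deliberately. In the sketch below (class names are illustrative), the call site in `count` is monomorphic at first, so the JIT devirtualizes it; introducing a second receiver type invalidates that assumption. Run with `-XX:+PrintCompilation` and look for a `made not entrant` line against the compiled method.

```java
// Sketch: invalidating a speculative assumption to trigger deoptimization.
// Run with: java -XX:+PrintCompilation Deopt
public class Deopt {
    static class Shape    { int sides() { return 0; } }
    static class Square   extends Shape { @Override int sides() { return 4; } }
    static class Triangle extends Shape { @Override int sides() { return 3; } }

    // Only Square is ever seen at first, so the JIT devirtualizes this call.
    static int count(Shape s) { return s.sides(); }

    public static void main(String[] args) {
        Shape sq = new Square();
        long acc = 0;
        for (int i = 0; i < 200_000; i++) acc += count(sq);
        // A new receiver type breaks the "only Square" assumption:
        // the compiled code is discarded and recompiled polymorphically.
        Shape tri = new Triangle();
        for (int i = 0; i < 200_000; i++) acc += count(tri);
        System.out.println(acc);
    }
}
```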

3. Performance Cheat Sheet

| Category | Flag / Tool | Description |
| --- | --- | --- |
| GC | `-XX:+UseZGC` | Enable the Z Garbage Collector (low latency). |
| GC | `-XX:+ZGenerational` | Enable Generational ZGC (better throughput, JDK 21+). |
| GC | `-Xmx<size>` | Set the maximum heap size (e.g., `-Xmx4G`). |
| JIT | `-XX:+PrintCompilation` | Print a log line each time a method is compiled. |
| JIT | `-XX:CompileThreshold=N` | Number of calls before JIT compilation triggers (only effective when tiered compilation is disabled via `-XX:-TieredCompilation`). |
| Profiling | `jcmd <pid> JFR.start` | Start a JFR recording on a running process. |
| Profiling | `jcmd <pid> JFR.dump` | Dump JFR data to disk. |
| Profiling | `jmap -dump:live,format=b,file=heap.bin <pid>` | Take a heap dump (warning: triggers a stop-the-world pause!). |
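
The JFR rows above combine into a simple capture cycle. A sketch of one such session against a running JVM (replace `<pid>`; the recording name is arbitrary):

```shell
# Start a named recording using the built-in "profile" settings.
jcmd <pid> JFR.start name=review settings=profile

# ... let the workload run for a while ...

# Dump what has been captured so far, then stop the recording.
jcmd <pid> JFR.dump name=review filename=review.jfr
jcmd <pid> JFR.stop name=review
```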

4. Production Readiness Checklist

Before deploying your high-performance Java application, verify these items:

| Check Item | Why? |
| --- | --- |
| JFR Enabled | JFR ships with OpenJDK 11+, but a recording must be started explicitly (e.g., `-XX:StartFlightRecording` or `jcmd`); verify your flags. |
| Heap Sized Correctly | Set `-Xmx` and `-Xms` explicitly. Don't rely on defaults in containers. |
| GC Selected | Explicitly choose G1, ZGC, or Parallel based on your latency/throughput needs. |
| Warm-up Probes | Ensure Kubernetes readiness probes account for JIT warm-up time. |
| OOM Kill Handling | Set `-XX:+ExitOnOutOfMemoryError` so the container is restarted cleanly after an OutOfMemoryError. |
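
Pulled together, the checklist might translate into a launch line like the following sketch (the jar name and sizes are placeholders to tune for your workload):

```shell
# Illustrative production launch line: explicit heap, explicit GC,
# always-on JFR, and clean exit on OutOfMemoryError.
java -Xms4G -Xmx4G \
     -XX:+UseZGC \
     -XX:+ExitOnOutOfMemoryError \
     -XX:StartFlightRecording=disk=true,maxsize=250M,dumponexit=true,filename=startup.jfr \
     -jar my-service.jar
```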

5. Glossary

For definitions of terms like Safepoint, TLAB, and Metaspace, visit the Java Course Glossary.

6. Next Steps

Now that you understand how the JVM executes your code, it’s time to learn how to write code that executes in parallel.

Next Module: Concurrency & Multi-threading