Virtual Threads: The Revolution (Project Loom)

For nearly 30 years, Java’s unit of concurrency has been the Thread, a thin wrapper around an operating system (OS) thread. This model served us well, but it has a fundamental scalability limit. Project Loom (delivered in Java 21) introduces Virtual Threads, a lightweight concurrency model that decouples Java threads from OS threads, allowing us to run millions of threads with minimal overhead.

1. The Problem with Platform Threads

To understand why Virtual Threads are revolutionary, we must first understand the limitations of traditional Platform Threads.

The 1:1 Model

In the traditional model, one Java thread equals one OS thread.

  • Memory Overhead: Each thread reserves ~1MB of stack memory by default (the JVM’s -Xss setting, allocated outside the heap). 1,000 threads = ~1GB of address space just for stacks.
  • Context Switching: Switching between OS threads is expensive (a kernel-mode transition plus register saving/restoring).
  • Scalability Cap: In practice you hit a ceiling at around 5,000–10,000 active threads per machine.

[!WARNING] This model forces the “thread-per-request” style to bottleneck on the number of threads rather than on CPU or network capacity. If your server’s requests spend 99% of their time waiting for a database (I/O), your expensive OS threads just sit idle.
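A rough sketch of that bottleneck, assuming a fixed pool of 200 platform threads (a typical servlet-container default) and requests that spend 100 ms waiting on I/O; the class name and numbers are illustrative:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PlatformThreadCeiling {
    public static void main(String[] args) throws InterruptedException {
        int poolSize = 200;      // a typical servlet-container worker pool
        int requests = 1_000;
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch done = new CountDownLatch(requests);

        long start = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(100);   // simulated I/O wait: the OS thread sits idle
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // 1,000 requests / 200 threads = 5 serialized "waves" of 100 ms each,
        // so this prints roughly 500 ms even though the CPU did almost nothing
        System.out.println("Elapsed: " + elapsedMs + " ms");
    }
}
```

The CPU is idle the whole time; the pool size alone caps throughput.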

2. Hardware Reality: Why is 1:1 Bad?

Let’s look at the physics of the machine.

  1. Kernel Space vs User Space: A Platform Thread is managed by the OS Kernel. Creating, destroying, or switching them requires a System Call (syscall). Syscalls are slow because the CPU must switch privilege levels (Ring 3 to Ring 0) and flush certain caches.
  2. Cache Locality: When the OS switches threads, the CPU cache (L1/L2) is often “cold” for the new thread. It has to fetch data from RAM, which is ~100x slower than L1 cache.
  3. The “Little’s Law” Limit: concurrency = throughput × latency. When latency is high (I/O wait), sustaining a given throughput requires massive concurrency (threads), and platform threads are too heavy to provide it.
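The arithmetic is easy to check. Little’s Law says concurrency = throughput × latency, so in a thread-per-request model the required thread count falls straight out of the target numbers (the figures below are illustrative):

```java
public class LittlesLaw {
    public static void main(String[] args) {
        // Little's Law: L (requests in flight) = lambda (throughput) * W (latency)
        double targetThroughput = 20_000; // requests per second (illustrative)
        double latencySeconds   = 0.5;    // 500 ms average, almost all I/O wait

        // each in-flight request occupies one thread in the thread-per-request model
        int threadsNeeded = (int) (targetThroughput * latencySeconds);
        System.out.println("Threads needed: " + threadsNeeded); // prints 10000

        // that is already past the ~5,000-10,000 platform-thread ceiling,
        // and halving the latency only halves the requirement
    }
}
```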

3. Enter Virtual Threads (M:N Model)

Virtual Threads are user-mode threads managed by the JVM, not the OS. They are mapped M:N onto a small pool of OS threads (called Carrier Threads).

How It Works

  1. Mounting: When a virtual thread needs to run, the JVM “mounts” it onto a carrier thread (an OS thread).
  2. Execution: The code runs on the carrier thread.
  3. Unmounting: When the code performs a blocking I/O operation (e.g., socket.read()), the JVM unmounts the virtual thread.
    • The virtual thread’s stack frames are copied from the carrier’s stack to the heap (as a Continuation object).
    • The carrier thread is now free to run another virtual thread.
  4. Resuming: When the I/O completes, the OS signals the JVM, which puts the virtual thread back into the run queue. It will be mounted again (possibly on a different carrier thread) to continue execution.

[!TIP] This is the same user-mode scheduling idea behind Go’s Goroutines, but in Java it’s completely transparent. You use the same blocking APIs (InputStream, JDBC, etc.), and the JVM handles the unmounting magic for you.
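A minimal sketch of mount/unmount in action: the toString() of a running virtual thread names its current carrier, so printing Thread.currentThread() before and after a blocking call can reveal that the thread resumed on a different carrier (whether it actually moves is up to the scheduler):

```java
public class CarrierDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().name("vt-demo").start(() -> {
            // a mounted virtual thread's toString() names its carrier, e.g.
            // VirtualThread[#21,vt-demo]/runnable@ForkJoinPool-1-worker-1
            System.out.println("Before sleep: " + Thread.currentThread());
            try {
                Thread.sleep(100);   // blocking call: the JVM unmounts this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // after resuming, the "worker-N" suffix may differ
            System.out.println("After sleep:  " + Thread.currentThread());
        });
        vt.join();
    }
}
```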

4. The Loom Scheduler

The scheduler (a dedicated ForkJoinPool, with one carrier thread per CPU core by default) maintains a run queue of ready virtual threads and a wait queue of blocked ones. Carrier threads (e.g. Carrier-1 and Carrier-2, each an OS thread) pull virtual threads from the run queue; when one blocks, it moves to the wait queue and its carrier immediately mounts the next ready virtual thread. When the blocking operation completes, the virtual thread re-enters the run queue.

5. Comparison: Virtual Threads vs Goroutines

Java’s Virtual Threads are conceptually similar to Go’s Goroutines. Both are M:N scheduled user-mode threads.

| Feature       | Java Virtual Threads                      | Go Goroutines                                    |
| ------------- | ----------------------------------------- | ------------------------------------------------ |
| Model         | M:N (user mode)                           | M:N (user mode)                                  |
| Stack Size    | Dynamic (starts small, grows)             | Dynamic (starts ~2KB, grows)                     |
| Scheduling    | Cooperative (yields at blocking points)   | Preemptive (mostly; async preemption since 1.14) |
| Communication | Shared state (objects) + queues           | Channels (CSP)                                   |
| Blocking      | Standard blocking APIs (JDBC, I/O)        | Blocking calls backed by the runtime netpoller   |

Code Comparison

Java (Virtual Threads)

```java
// Create 100k threads
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 100_000; i++) {
        executor.submit(() -> {
            Thread.sleep(Duration.ofSeconds(1)); // Blocks virtually
            return "Done";
        });
    }
}
```

Go (Goroutines)

```go
// Create 100k goroutines
var wg sync.WaitGroup
for i := 0; i < 100000; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        time.Sleep(1 * time.Second) // Blocks goroutine
    }()
}
wg.Wait()
```

Both achieve the same goal: high concurrency with low overhead. The key difference is that Java retrofitted this onto an existing ecosystem of blocking APIs, whereas Go was built with it from day one.

6. Creating Virtual Threads

Java 21 provides several ways to create virtual threads. The API is intentionally familiar.

1. The Thread Builder

You can create a single virtual thread using the builder API:

```java
// Start immediately
Thread.startVirtualThread(() -> {
    System.out.println("Hello from " + Thread.currentThread());
});

// Using the builder
Thread vThread = Thread.ofVirtual()
    .name("my-virtual-thread")
    .start(() -> {
        System.out.println("Running task...");
    });
```

2. The Executor Service (Best Practice)

In most applications, you won’t create threads manually. You’ll use an ExecutorService.

```java
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    // Submit 10,000 tasks
    for (int i = 0; i < 10_000; i++) {
        int index = i;
        executor.submit(() -> {
            // Simulate blocking I/O
            Thread.sleep(Duration.ofSeconds(1));
            return index;
        });
    }
} // executor.close() is called implicitly here, waiting for all tasks
```

[!IMPORTANT] Do NOT pool Virtual Threads. Unlike Platform Threads, they are cheap to create. Creating a new virtual thread for every task is the intended usage pattern. Pooling them defeats the purpose.
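If you previously used a thread pool to throttle access to a scarce resource (say, a database with a small connection limit), the replacement is a Semaphore rather than a pool of virtual threads. A sketch, with illustrative names and limits:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class ThrottledAccess {
    // limit concurrency on a scarce resource (say, 10 DB connections)
    // without pooling threads: every task still gets its own virtual thread
    private static final Semaphore PERMITS = new Semaphore(10);

    static String callDatabase(String query) throws InterruptedException {
        PERMITS.acquire();           // at most 10 callers inside at once
        try {
            Thread.sleep(20);        // simulated blocking DB call
            return "result for " + query;
        } finally {
            PERMITS.release();
        }
    }

    public static void main(String[] args) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100; i++) {
                int id = i;
                executor.submit(() -> callDatabase("q" + id));
            }
        } // implicit close() waits for all 100 tasks
        System.out.println("All queries done");
    }
}
```

The semaphore bounds access to the resource; the threads themselves stay disposable.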

7. Performance: Throughput vs Latency

Virtual Threads improve Throughput (requests per second), not Latency (time per request).

  • If a single request takes 100ms on a Platform Thread, it will still take ~100ms on a Virtual Thread.
  • However, where a platform-thread server might exhaust memory or its thread pool at 5,000 concurrent requests, a virtual-thread server can hold 500,000 concurrent requests on the same hardware.
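A sketch of the throughput claim, assuming Java 21: 10,000 tasks that each block for 200 ms finish in roughly one sleep period of wall-clock time, because the waits overlap (class name and counts are illustrative):

```java
import java.time.Duration;
import java.util.concurrent.Executors;

public class ThroughputDemo {
    public static void main(String[] args) {
        int tasks = 10_000;
        long start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(200)); // simulated I/O per "request"
                    return null;
                });
            }
        } // implicit close() waits for all tasks
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // the 200 ms waits overlap, so 10,000 tasks finish in roughly
        // 200 ms of wall time rather than 10,000 * 200 ms (~33 minutes)
        System.out.println(tasks + " tasks finished in " + elapsedMs + " ms");
    }
}
```

Note that each individual task still takes 200 ms; only the aggregate throughput changes.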

When to use Virtual Threads?

  • Use them for I/O-bound work: database calls, REST API calls, file processing. This covers the vast majority of enterprise Java apps.
  • Avoid them for CPU-bound work: video encoding, complex math, heavy cryptography. These tasks never block, so they hog the carrier thread and starve other virtual threads; a fixed platform-thread pool sized to the core count fits better.

8. Key Takeaways

  1. M:N Scheduling: Many virtual threads map to few carrier threads.
  2. Blocking is Cheap: Blocking operations unmount the virtual thread, freeing the OS thread.
  3. No Pooling: Create virtual threads on-demand; they are disposable.
  4. Hardware Aware: Respects the physical limits of the machine (cache, context switches) by managing concurrency in user space.