
Evaluating Java Virtual Threads: Performance and Practical Insights

Virtual threads were introduced as a preview feature in JDK 19, refined in a second preview in JDK 20, and finalized in JDK 21, as outlined in JEP 444. Historically, Java server applications used a “thread-per-request” model in which each request is handled by a dedicated platform thread backed by an OS thread. However, OS threads are memory-intensive, which limits how far this model can scale.

Advantages of Virtual Threads

Virtual threads keep the simplicity of the thread-per-request model while using fewer resources. They are created as lightweight objects on the Java heap and occupy an OS thread only while they are actually running, which allows millions of threads to exist in a single JVM. Many virtual threads are multiplexed over a small number of OS threads (an M:N mapping rather than one OS thread per Java thread), which makes better use of system resources.
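As a minimal sketch of what creating virtual threads looks like with the JDK 21 API, the example below starts a number of virtual threads through `Thread.ofVirtual()` and counts how many of them report `isVirtual()`. The class name, method name, and thread-name prefix are illustrative choices and do not come from the Liberty study.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadBasics {

    // Starts `count` virtual threads, waits for all of them,
    // and returns how many tasks observed that they ran on a virtual thread.
    static int startVirtualThreads(int count) {
        AtomicInteger ranAsVirtual = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Thread.ofVirtual() returns a builder; start() creates and schedules the thread.
            Thread t = Thread.ofVirtual().name("demo-vt-" + i).start(() -> {
                if (Thread.currentThread().isVirtual()) {
                    ranAsVirtual.incrementAndGet();
                }
            });
            threads.add(t);
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return ranAsVirtual.get();
    }

    public static void main(String[] args) {
        int count = 100_000; // illustrative figure, not a number from the study
        System.out.println("Virtual threads run: " + startVirtualThreads(count));
    }
}
```

No OS thread is reserved per virtual thread here; the JDK schedules them onto a much smaller set of carrier threads.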

Open Liberty’s Autonomic Thread Pool

Open Liberty employs a shared thread pool to reduce the costs associated with OS threads. The Liberty thread pool adapts autonomically, optimizing the number of threads without requiring extensive tuning. This approach ensures efficient CPU resource utilization, especially in containerized environments with limited CPU allocations.

Java virtual threads and Open Liberty’s adaptive thread pool together enhance scalability and performance for Java applications, ensuring efficient resource utilization and simplified thread management.

Performance Tests: Liberty Thread Pool vs. Virtual Threads

We focused on use cases and configurations commonly used by Liberty customers, using benchmark applications to compare the performance of Liberty’s thread pool and virtual threads. These applications use REST and MicroProfile, performing basic business logic during transactions.

Evaluation Focus

Our primary evaluation was on configurations with tens to hundreds of threads, modeling typical Liberty user scenarios. We also tested with thousands of threads to assess the advertised strength of virtual threads.

Test Environment

We conducted tests with both Eclipse Temurin (OpenJDK with HotSpot JVM) and IBM Semeru Runtimes (OpenJDK with OpenJ9 JVM), noting similar performance differentials. Results were primarily from Liberty 23.0.0.10 GA with Temurin 21.0.1_12.

Specific Use Case: Online Banking Simulation

An online banking app simulated requests to a remote system with configurable delays, allowing threads to be blocked on I/O. This scenario tested virtual thread unmount and mount actions, showcasing virtual threads’ ability to share OS threads efficiently.
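The following sketch approximates this kind of scenario under stated assumptions: `callRemoteSystem` is a hypothetical stand-in that simply sleeps for a configurable delay, and each simulated request runs on its own virtual thread. It is not the benchmark application used in the study. While a virtual thread is blocked in `Thread.sleep`, it is unmounted from its carrier thread, so many requests can wait at the same time on a handful of OS threads.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SimulatedRemoteCalls {

    // Hypothetical stand-in for a call to a remote system: it only blocks for `delay`.
    static void callRemoteSystem(Duration delay) {
        try {
            Thread.sleep(delay);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Issues `requests` concurrent simulated calls, one virtual thread each,
    // and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(int requests, Duration delay) {
        long start = System.nanoTime();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                executor.execute(() -> callRemoteSystem(delay));
            }
        } // close() waits for every submitted task to finish
        return Duration.ofNanos(System.nanoTime() - start).toMillis();
    }

    public static void main(String[] args) {
        int requests = 2_000;                    // illustrative values
        Duration delay = Duration.ofMillis(100);
        long elapsed = elapsedMillis(requests, delay);
        System.out.println(requests + " requests with a " + delay.toMillis()
                + " ms delay finished in " + elapsed + " ms");
    }
}
```

Run sequentially, these requests would need roughly the number of requests multiplied by the delay; with one virtual thread per request the total should stay close to a single delay plus scheduling overhead.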

Disclaimer

Our evaluation was focused on whether replacing Liberty’s autonomic thread pool with virtual threads would benefit Liberty users. Results may differ for other application runtimes without a self-tuning thread pool like Liberty’s.

This study provided insights into whether virtual threads could improve on the performance of Liberty’s thread pool, particularly in scenarios involving high thread counts and I/O-bound operations.

Test Case 1: CPU Throughput

Objective

Evaluate CPU throughput to determine performance differences between using virtual threads and Liberty’s thread pool.

Findings

CPU-intensive workloads showed 10-40% lower throughput with virtual threads than with Liberty’s thread pool.

Test Setup

  • Used Apache JMeter to simulate various loads on CPU-intensive apps.
  • Measured transactions per second (TPS) with increasing load levels.
  • Example: Online banking app with a 2 ms delay to test mount/unmount/remount actions.

Results

  • At low loads, virtual threads performed similarly to Liberty’s thread pool.
  • At higher loads, TPS with virtual threads fell below that of Liberty’s thread pool.

Analysis

  • Virtual threads have overheads like mounting/unmounting and garbage collection.
  • Loss of thread-linked context (e.g., ThreadLocal variables) impacts efficiency.
  • Profiling indicated overheads weren’t large enough to fully explain the throughput discrepancy.
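To illustrate the thread-linked context point, the sketch below counts how often a `ThreadLocal` initializer runs when the same tasks execute on a small pool of reused platform threads versus one new virtual thread per task. The cached `StringBuilder`, the pool size of 4, and the task count are assumptions chosen for the example and say nothing about what Liberty itself caches.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalReuse {

    // Runs `tasks` tasks on the given executor and returns how many times
    // the ThreadLocal initializer had to build a fresh cached object.
    static int countInitializations(ExecutorService executor, int tasks) {
        AtomicInteger initializations = new AtomicInteger();
        ThreadLocal<StringBuilder> cache = ThreadLocal.withInitial(() -> {
            initializations.incrementAndGet();
            return new StringBuilder(1024); // stand-in for an expensive per-thread object
        });
        try (executor) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> cache.get().setLength(0));
            }
        } // close() waits for all tasks to complete
        return initializations.get();
    }

    public static void main(String[] args) {
        int tasks = 10_000;
        int pooled = countInitializations(Executors.newFixedThreadPool(4), tasks);
        int virtual = countInitializations(Executors.newVirtualThreadPerTaskExecutor(), tasks);
        System.out.println("Initializations with a 4-thread pool: " + pooled);
        System.out.println("Initializations with virtual threads: " + virtual);
    }
}
```

Pooled threads are reused, so the cached object is built at most once per pool thread. Each virtual thread is new, so the initializer runs once per task, and the discarded objects add to garbage collection work.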

Virtual threads did not provide faster execution for CPU-intensive applications compared to traditional Java platform threads in Liberty’s thread pool, especially on systems with limited CPUs.

Test Case 2: Ramp-Up Time

Objective

Quantify how quickly virtual threads reach full throughput compared to Liberty’s thread pool under heavy load.

Findings

Apps using virtual threads reach maximum throughput significantly faster than those using Liberty’s thread pool when a heavy load is applied.

Test Setup

  • Used the online banking application with high latency to generate several thousand simultaneous transactions.
  • Compared ramp-up times for virtual threads and Liberty’s thread pool.

Results

  • Liberty’s thread pool handled thousands of threads without instability.
  • Throughput was slightly higher (2-3%) with Liberty’s thread pool, which also used 10% less CPU, giving 12-15% more transactions per unit of CPU utilization.
  • Virtual threads reached peak throughput almost immediately.
  • Liberty’s thread pool ramped up more slowly, taking tens of minutes to adjust fully to the load.
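As a rough consistency check on the throughput and CPU figures in the list above, assume that all percentages are relative to the virtual-thread run and that “transactions per CPU utilization” means throughput divided by CPU utilization. Both are assumptions; the study does not spell them out. Writing T_vt and C_vt for throughput and CPU utilization with virtual threads:

```latex
\frac{T_{\text{pool}}}{C_{\text{pool}}}
  = \frac{(1.02 \text{ to } 1.03)\, T_{\text{vt}}}{0.90\, C_{\text{vt}}}
  \approx (1.13 \text{ to } 1.14)\, \frac{T_{\text{vt}}}{C_{\text{vt}}}
```

Under those assumptions the thread pool delivers roughly 13-14% more transactions per unit of CPU, which falls inside the reported 12-15% range.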

Improvements

  • Adjusted Liberty’s thread pool autonomics to grow more aggressively when idle CPU resources are available.
  • Post-adjustment, Liberty’s thread pool reached peak throughput within 20-30 seconds of virtual threads under heavy load.

Virtual threads ramp up faster than Liberty’s thread pool, although the more aggressive scaling adjustments have narrowed the gap significantly.

Test Case 3: Memory Footprint

Objective

Determine how much memory is used by the Java process under constant load for both virtual threads and Liberty’s thread pool.

Findings

Virtual threads have a smaller per-thread memory footprint compared to platform threads, but this advantage may be offset by other memory usage factors in the JVM.

Memory Usage Comparison

Virtual threads use less memory because they don’t need dedicated OS threads, which leads to a smaller Java process size. However, this benefit can be offset by other factors such as DirectByteBuffers (DBBs), which make memory usage more variable.

DirectByteBuffers Impact

DBBs, part of the Java networking infrastructure, can significantly affect memory usage due to their allocation and retention patterns, particularly when their Java reference objects survive to old-gen space and are only collected during global GC cycles.
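The sketch below shows one way to observe direct buffer memory from inside the JVM, using the standard `BufferPoolMXBean` for the pool named "direct". It allocates a single buffer and reports how much the pool grew while the buffer is still referenced. This is an illustration of the mechanism only, not the measurement method used in the study, and the class and method names are hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.ref.Reference;
import java.nio.ByteBuffer;

class DirectBufferUsage {

    // Returns the bytes currently accounted to the JVM's "direct" buffer pool,
    // or -1 if that pool is not exposed by this JVM.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    // Allocates one direct buffer and returns how much the "direct" pool grew
    // while the buffer object is still reachable.
    static long growthAfterAllocating(int bytes) {
        long before = directMemoryUsed();
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes); // native memory, outside the Java heap
        long after = directMemoryUsed();
        // Keep the small on-heap ByteBuffer object reachable until after the measurement;
        // the native memory is released only once that object has been garbage collected.
        Reference.reachabilityFence(buffer);
        return after - before;
    }

    public static void main(String[] args) {
        int size = 16 * 1024 * 1024; // 16 MiB, an arbitrary size for the example
        System.out.println("Direct pool grew by " + growthAfterAllocating(size) + " bytes");
    }
}
```

The on-heap `ByteBuffer` object is tiny compared with the native memory it pins. If that object is promoted to the old generation, the native memory stays allocated until a global GC collects it, which is the retention pattern described above.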

Test Configuration

  • Small minimum heap size and large maximum heap size to highlight heap memory variability.
  • Mixed results observed: virtual threads sometimes used less memory, but in other scenarios they used more because of factors such as DBB retention.

Example Scenario

A 10% increase in workload resulted in a 25% decrease in memory usage for Liberty’s thread pool but a 185% increase for virtual threads. This was due to variability in DBB retention and other JVM factors.

Memory footprint differences between virtual threads and Liberty’s thread pool can vary based on several factors. While virtual threads generally use less native memory, overall memory usage may not always be lower. Results depend on the specific workload and JVM configuration.

Summary and Conclusions

Throughput

Virtual threads performed worse than Liberty’s thread pool across various apps. The size of the shortfall varied with the number of CPUs, task duration, Linux kernel level, and Linux scheduler tuning.

Ramp-up

Virtual threads reached full throughput more quickly during burst workloads requiring many threads, but the advantage was short-lived once Liberty’s thread pool had scaled up.

Memory Footprint

The smaller per-thread memory footprint of virtual threads had a relatively small impact when a few hundred threads were used, and was often outweighed by other JVM memory usage factors.

Unexpected Findings

A significant performance issue with virtual threads was traced to an interaction between the Linux kernel scheduler and Java’s ForkJoinPool thread management. This problem persisted across different kernel versions.

Conclusion

While virtual threads have potential advantages in high concurrency situations, Liberty’s thread pool generally provides better or comparable performance at moderate concurrency levels. Java developers can still use virtual threads in their applications running on Liberty, but for now, the Liberty thread pool remains the preferred choice for most use cases. The insights shared in this study aim to help Java developers make informed decisions about implementing virtual threads in their applications.
