Understanding JMH: Java Microbenchmarking Made Simple
In the world of software development, performance matters. But how do we accurately measure and compare the performance of different implementations? This is where JMH (Java Microbenchmark Harness) comes into play. In this post, we'll explore JMH through a practical example of benchmarking trace ID generation methods.
What is JMH?
JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. It was developed by the OpenJDK team and is used extensively in the JDK itself to perform performance testing.
Setting Up JMH with Gradle
To get started with JMH, you'll need to add the necessary dependencies to your build configuration. Here's how to set it up in a Gradle project:
plugins {
    id 'java'
    id 'me.champeau.jmh' version '0.7.1'
    id 'io.morethan.jmhreport' version '0.9.0'
}

dependencies {
    implementation 'org.openjdk.jmh:jmh-core:1.37'
    implementation 'org.openjdk.jmh:jmh-generator-annprocess:1.37'
}

jmh {
    resultFormat = 'JSON'
    resultsFile = layout.buildDirectory.file('reports/jmh/results.json').get().asFile
    jmhVersion = '1.37'
    timeUnit = 'ns'
    // Thread count defaults to 1; override with -Pjmh.threads=<n>
    threads = project.hasProperty('jmh.threads') ? project.property('jmh.threads').toInteger() : 1
}

Writing JMH Benchmarks
Let's look at a real-world example where we benchmark two different approaches to generating trace IDs: using UUID and using OpenTelemetry's IdGenerator.
import io.opentelemetry.sdk.trace.IdGenerator;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.util.UUID;
import java.util.concurrent.TimeUnit;

@BenchmarkMode({Mode.AverageTime, Mode.Throughput})
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 10, time = 1)
@Fork(2)
public class TraceIdGeneratorBenchmark {

    private IdGenerator otelIdGenerator;

    // Default level is Trial: runs once before the iterations start.
    @Setup
    public void setup() {
        otelIdGenerator = IdGenerator.random();
    }

    @Benchmark
    public void uuidBasedTraceId(Blackhole blackhole) {
        String traceId = UUID.randomUUID().toString();
        blackhole.consume(traceId); // keeps the JIT from eliminating the call
    }

    @Benchmark
    public void openTelemetryTraceId(Blackhole blackhole) {
        String traceId = otelIdGenerator.generateTraceId();
        blackhole.consume(traceId);
    }
}

Understanding JMH Annotations
Let's break down the key JMH annotations:
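- @BenchmarkMode: Specifies what to measure. In our example, we measure both average time per operation and throughput.
- @OutputTimeUnit: Defines the unit for the results (nanoseconds in our case, so throughput is reported in operations per nanosecond).
- @State: Defines the scope of our benchmark state (Thread scope means each benchmark thread gets its own instance of the class).
- @Warmup: Runs 5 warmup iterations of 1 second each so the JIT compiler can reach a steady state. Warmup results are not included in the score.
- @Measurement: Runs 10 measured iterations of 1 second each. These iterations produce the reported score and error.
- @Fork: Runs the whole benchmark in 2 separate, freshly started JVM processes, which helps average out run-to-run differences such as JIT compilation decisions.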
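To make the roles of @Warmup and @Measurement more concrete, here is a hand-rolled sketch of the same two-phase idea using only `System.nanoTime()`. It is a conceptual illustration, not how JMH is implemented. The names `WarmupSketch`, `averageNanosPerOp`, and `sink` are made up for this post. It has no forking, no statistics, and only a crude guard against dead code elimination, which is why a real harness is the better choice.

```java
import java.util.UUID;
import java.util.function.Supplier;

final class WarmupSketch {
    // Written on every call so the result is never unused (a crude stand-in for Blackhole).
    static volatile Object sink;

    /** Runs untimed warmup calls first, then returns the average time per timed call in nanoseconds. */
    static double averageNanosPerOp(Supplier<?> task, int warmupOps, int measuredOps) {
        for (int i = 0; i < warmupOps; i++) {
            sink = task.get();   // warmup phase: nothing is timed
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredOps; i++) {
            sink = task.get();   // measurement phase
        }
        return (System.nanoTime() - start) / (double) measuredOps;
    }

    public static void main(String[] args) {
        double ns = averageNanosPerOp(() -> UUID.randomUUID().toString(), 100_000, 100_000);
        System.out.printf("UUID: %.1f ns/op (naive estimate)%n", ns);
    }
}
```

Numbers from a loop like this tend to drift between runs and machines, so treat them as a rough sanity check rather than a result worth publishing.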
Running the Benchmark Project
Let's walk through setting up and running our trace ID generator benchmark project:
Project Setup
# Clone the repository
git clone https://github.com/GSSwain/benchmark-trace-id-generator.git
cd benchmark-trace-id-generator

# Run the benchmark with a single thread
./gradlew clean jmh

# Run with multiple threads (e.g., 10 threads)
./gradlew clean jmh -Pjmh.threads=10

# Generate the HTML report
./gradlew clean jmhReport

Understanding the Project Structure
The benchmark project includes:
- JMH configuration in build.gradle
- Benchmark implementation in src/jmh/java
- Two trace ID generation methods:
  - UUID-based: Using Java's built-in UUID generator
  - OpenTelemetry: Using OpenTelemetry's RandomIdGenerator
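For readers without the OpenTelemetry SDK on hand, the two approaches listed above can be approximated with the standard library alone. `UUID.randomUUID()` is documented to use a cryptographically strong pseudo-random number generator. The second method is only an assumed stand-in: it supposes that the OpenTelemetry generator draws 128 random bits from a thread-local, non-cryptographic generator and hex-encodes them. That assumption has not been checked against the OpenTelemetry source here. `TraceIdSketch` and its method names are hypothetical.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

final class TraceIdSketch {

    /** UUID-based ID: 36 characters including hyphens. */
    static String uuidTraceId() {
        return UUID.randomUUID().toString();
    }

    /** Assumed stand-in for the OpenTelemetry approach: 128 random bits as 32 lowercase hex characters. */
    static String randomHexTraceId() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        long hi = random.nextLong();
        long lo;
        do {
            lo = random.nextLong();   // retry so the low half is never all zeros
        } while (lo == 0);
        // %016x prints a long as 16 zero-padded hex digits (negative values as two's complement).
        return String.format("%016x%016x", hi, lo);
    }

    public static void main(String[] args) {
        System.out.println("UUID-based: " + uuidTraceId());
        System.out.println("Random hex: " + randomHexTraceId());
    }
}
```

`String.format` is used for readability and is itself slow, so this sketch says nothing about the real generator's speed. If the assumption holds, a per-thread generator would be one plausible reason for the difference in thread scaling, but the benchmark alone does not prove that.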
Analyzing Benchmark Results
Single-Thread Performance (JDK 25)
| Implementation | Average Time (ns) | Throughput (ops/ns) |
|---|---|---|
| OpenTelemetry | 14.321 ±0.130 | 0.069 ±0.001 |
| UUID | 247.141 ±8.476 | 0.004 ±0.001 |
Multi-Thread Performance (10 Threads, JDK 25)
| Implementation | Average Time (ns) | Throughput (ops/ns) |
|---|---|---|
| OpenTelemetry | 24.967 ±1.525 | 0.420 ±0.007 |
| UUID | 3,761.317 ±65.991 | 0.003 ±0.001 |
Interpreting These Results
Let's break down what these numbers tell us:
1. Single-Thread Analysis
   - Average Time:
     - OpenTelemetry: ~14.3 nanoseconds per operation
     - UUID: ~247.1 nanoseconds per operation
     - OpenTelemetry is approximately 17x faster
   - Throughput:
     - OpenTelemetry: 0.069 operations per nanosecond (about 69 million ops/second)
     - UUID: 0.004 operations per nanosecond (about 4 million ops/second)
2. Multi-Thread Analysis (10 Threads)
   - Average Time:
     - OpenTelemetry: Increases only to ~25 nanoseconds (about 1.7x the single-thread time)
     - UUID: Jumps to ~3,761 nanoseconds (about 15x the single-thread time)
     - OpenTelemetry is now roughly 150x faster
   - Throughput:
     - OpenTelemetry: Rises to 0.420 ops/ns across all threads
     - UUID: Drops to 0.003 ops/ns across all threads
3. Key Observations
   - Thread Scaling:
     - OpenTelemetry scales well: about 6x the single-thread throughput with 10 threads, which is below linear scaling but still a large gain
     - UUID scales poorly: total throughput with 10 threads is lower than with one
   - Consistency:
     - In absolute terms, OpenTelemetry has small error margins (±0.130 ns and ±1.525 ns) and UUID has larger ones (±8.476 ns and ±65.991 ns)
     - Relative to the score, the picture is more even: about 0.9% and 6.1% for OpenTelemetry, and about 3.4% and 1.8% for UUID
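The ratios and unit conversions quoted above are simple arithmetic on the table values, and it can help to recompute them when you rerun the benchmark on your own hardware. A small sketch (the class name `ResultMath` and its helper methods are made up for this post; the inputs are copied from the tables):

```java
final class ResultMath {

    /** How many times larger a is than b. */
    static double ratio(double a, double b) {
        return a / b;
    }

    /** Converts a throughput in operations per nanosecond to operations per second. */
    static double opsPerSecond(double opsPerNano) {
        return opsPerNano * 1_000_000_000.0;
    }

    /** The error margin expressed as a percentage of the score. */
    static double relativeErrorPercent(double score, double error) {
        return 100.0 * error / score;
    }

    public static void main(String[] args) {
        System.out.printf("Single thread: OpenTelemetry is %.1fx faster%n", ratio(247.141, 14.321));
        System.out.printf("10 threads:    OpenTelemetry is %.1fx faster%n", ratio(3761.317, 24.967));
        System.out.printf("OpenTelemetry throughput scaling: %.1fx%n", ratio(0.420, 0.069));
        System.out.printf("0.069 ops/ns = %.0f ops/s%n", opsPerSecond(0.069));
        System.out.printf("UUID relative error at 10 threads: %.1f%%%n",
                relativeErrorPercent(3761.317, 65.991));
    }
}
```

Comparing relative rather than absolute error margins makes it easier to judge stability when two scores differ by orders of magnitude.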
Best Practices
When writing JMH benchmarks, keep these points in mind:
- Use Blackhole.consume(), or return the computed value from the benchmark method, to prevent dead code elimination
- Include enough warmup iterations for the JIT compiler to optimize the hot path before measurement begins
- Run multiple forks so that variance between JVM instances is reflected in the reported error
- Consider external factors like garbage collection and JIT compilation
- Document your benchmark environment (JVM version, available processors, etc.)
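For the last point, a few standard system properties cover most of what readers need in order to compare results. A minimal sketch that could be run alongside the benchmark (the class name `BenchmarkEnvironment` is made up for this post; the properties and `Runtime` calls are standard Java):

```java
final class BenchmarkEnvironment {

    /** Collects basic JVM and hardware facts worth recording next to benchmark results. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return String.join(System.lineSeparator(),
                "java.version   = " + System.getProperty("java.version"),
                "java.vm.name   = " + System.getProperty("java.vm.name"),
                "os.name/arch   = " + System.getProperty("os.name") + " / " + System.getProperty("os.arch"),
                "processors     = " + rt.availableProcessors(),
                "max heap (MiB) = " + rt.maxMemory() / (1024 * 1024));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

JMH also prints the JVM version and VM options at the start of each run, so saving the console output next to the JSON results serves a similar purpose.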
Conclusion
JMH is a powerful tool for measuring and comparing code performance on the JVM. While it requires careful setup and interpretation, it provides valuable insights into code performance characteristics. Remember that microbenchmarks should be one of many tools in your performance testing arsenal, alongside profiling and real-world performance testing.
The example used in this post can be found in the benchmark-trace-id-generator repository (https://github.com/GSSwain/benchmark-trace-id-generator).