Understanding JMH: Java Microbenchmarking Made Simple

In the world of software development, performance matters. But how do we accurately measure and compare the performance of different implementations? This is where JMH (Java Microbenchmark Harness) comes into play. In this post, we'll explore JMH through a practical example of benchmarking trace ID generation methods.

What is JMH?

JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. It was developed by the OpenJDK team and is used extensively within the JDK itself for performance testing.

Setting Up JMH with Gradle

To get started with JMH, you'll need to add the necessary dependencies to your build configuration. Here's how to set it up in a Gradle project:

plugins {
    id 'java'
    id 'me.champeau.jmh' version '0.7.1'
    id 'io.morethan.jmhreport' version '0.9.0'
}

dependencies {
    implementation 'org.openjdk.jmh:jmh-core:1.37'
    implementation 'org.openjdk.jmh:jmh-generator-annprocess:1.37'
}

jmh {
    resultFormat = 'JSON'
    resultsFile = layout.buildDirectory.file('reports/jmh/results.json').get().asFile
    jmhVersion = '1.37'
    timeUnit = 'ns'
    // Thread count comes from -Pjmh.threads=<n> on the command line; defaults to 1
    threads = project.hasProperty('jmh.threads') ? project.property('jmh.threads').toInteger() : 1
}

Writing JMH Benchmarks

Let's look at a real-world example where we benchmark two different approaches to generating trace IDs: using UUID and using OpenTelemetry's IdGenerator.

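import io.opentelemetry.sdk.trace.IdGenerator;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode({Mode.AverageTime, Mode.Throughput})
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 10, time = 1)
@Fork(2)
public class TraceIdGeneratorBenchmark {

    // One generator per benchmark thread (Scope.Thread), created outside the measured code
    private IdGenerator otelIdGenerator;

    @Setup
    public void setup() {
        otelIdGenerator = IdGenerator.random();
    }

    @Benchmark
    public void uuidBasedTraceId(Blackhole blackhole) {
        String traceId = UUID.randomUUID().toString();
        // Consuming the result keeps the JIT from eliminating the call as dead code
        blackhole.consume(traceId);
    }

    @Benchmark
    public void openTelemetryTraceId(Blackhole blackhole) {
        String traceId = otelIdGenerator.generateTraceId();
        blackhole.consume(traceId);
    }
}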

Understanding JMH Annotations

Let's break down the key JMH annotations:

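@BenchmarkMode: Specifies what to measure. In our example, we measure both average time per operation and throughput (operations per unit of time).
@OutputTimeUnit: Defines the unit for the results (nanoseconds in our case). In throughput mode this yields ops/ns; multiplying by 1,000 gives the million ops/s figures used in the tables below.
@State: Defines the scope of our benchmark state. With Scope.Thread, each benchmark thread gets its own instance, so threads do not share the otelIdGenerator field.
@Warmup: Specifies the warmup iterations (here, five iterations of one second each) that let the JIT compiler bring the code to a steady state. Their results are discarded.
@Measurement: Defines the measured iterations (here, ten iterations of one second each) that produce the reported scores.
@Fork: Indicates how many fresh JVM processes to run the benchmark in (two here). Each fork repeats the full warmup and measurement cycle, which keeps one JVM's particular JIT decisions from skewing the result.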
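
To make the roles of warmup, measurement, and result consumption more concrete, here is a hand-rolled sketch of the same idea using only the standard library. NaiveHarness and its parameters are hypothetical names invented for this illustration, and this is not how JMH is implemented. It has no forks, no statistics, and only a crude defence against dead code elimination, which is exactly why JMH is the better tool.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hand-rolled illustration of what @Warmup, @Measurement and Blackhole automate.
// NaiveHarness is a hypothetical name; this is a teaching sketch, not a JMH replacement.
final class NaiveHarness {

    // Written on every call so the JIT cannot treat the result as unused.
    static volatile Object sink;

    /** Returns the mean time per operation in nanoseconds over the measured iterations. */
    static double averageNanos(Supplier<?> op, int warmupIterations,
                               int measurementIterations, int opsPerIteration) {
        // Warmup: run the code so the JIT can compile it; these timings are discarded.
        for (int i = 0; i < warmupIterations; i++) {
            runIteration(op, opsPerIteration);
        }
        // Measurement: only these iterations contribute to the reported score.
        long totalNanos = 0;
        for (int i = 0; i < measurementIterations; i++) {
            totalNanos += runIteration(op, opsPerIteration);
        }
        return (double) totalNanos / ((long) measurementIterations * opsPerIteration);
    }

    // Times one batch of operations and returns the elapsed nanoseconds.
    private static long runIteration(Supplier<?> op, int ops) {
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            sink = op.get(); // crude stand-in for Blackhole.consume()
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        double ns = averageNanos(() -> UUID.randomUUID().toString(), 5, 10, 100_000);
        System.out.printf("UUID: %.1f ns/op (single JVM, no forks)%n", ns);
    }
}
```

Numbers from a loop like this depend heavily on the machine and JVM, so treat them as a rough indication only and rely on the JMH run for anything you intend to report.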

Running the Benchmark Project

Let's walk through setting up and running our trace ID generator benchmark project:

Project Setup

# Clone the repository
git clone https://github.com/GSSwain/benchmark-trace-id-generator.git
cd benchmark-trace-id-generator

# Run the benchmark with a single thread
./gradlew clean jmh

# Run with multiple threads (e.g., 10 threads)
./gradlew clean jmh -Pjmh.threads=10

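# Generate the HTML report from the JSON results
# (no clean here, since clean deletes the build directory that holds results.json)
./gradlew jmhReport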

Understanding the Project Structure

The benchmark project includes:

JMH configuration in build.gradle
Benchmark implementation in src/jmh/java
Two trace ID generation methods:
- UUID-based: Using Java's built-in UUID generator
- OpenTelemetry: Using OpenTelemetry's RandomIdGenerator
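
To show what the two kinds of ID look like, here is a small standard-library sketch. TraceIdShapes and randomHexTraceId are hypothetical names written for this illustration; the hex generator is a stand-in and not OpenTelemetry's actual implementation. According to its Javadoc, UUID.randomUUID() uses a cryptographically strong generator, whereas ThreadLocalRandom keeps per-thread state. Whether that difference explains the results later in this post is an assumption, not something measured here.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Shape comparison of the two ID styles. randomHexTraceId is a hypothetical
// stand-in written for illustration; it is not OpenTelemetry's implementation.
final class TraceIdShapes {

    /** UUID-based ID: 36 characters, i.e. 32 hex digits plus four hyphens. */
    static String uuidTraceId() {
        return UUID.randomUUID().toString();
    }

    /** 128-bit ID rendered as 32 lowercase hex characters (the W3C trace-id shape). */
    static String randomHexTraceId() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        long hi = random.nextLong();
        long lo;
        do {
            lo = random.nextLong();
        } while (lo == 0); // guarantees the ID is never all zeros, which is invalid
        // %016x prints each long as 16 zero-padded, unsigned hex digits
        return String.format("%016x%016x", hi, lo);
    }

    public static void main(String[] args) {
        System.out.println("UUID style: " + uuidTraceId());
        System.out.println("Hex style:  " + randomHexTraceId());
    }
}
```

Note that the UUID string is not a valid 32-character trace ID until its hyphens are removed, so the two methods are not drop-in replacements for each other.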

Analyzing Benchmark Results

Single-Thread Performance (JDK 25)

Implementation	Average Time (ns)	Throughput (M ops/s)
OpenTelemetry	14.321 ±0.130	69.8 ±0.6
UUID	247.141 ±8.476	4.0 ±0.1

Multi-Thread Performance (10 Threads, JDK 25)

Implementation	Average Time (ns)	Throughput (M ops/s)
OpenTelemetry	24.967 ±1.525	400.5 ±24.4
UUID	3,761.317 ±65.991	2.7 ±0.1

Interpreting These Results

Let's break down what these numbers tell us:

1. Single-Thread Analysis

Average Time:
- OpenTelemetry: ~14.3 nanoseconds per operation
- UUID: ~247.1 nanoseconds per operation
- OpenTelemetry is approximately 17x faster
Throughput:
- OpenTelemetry: ~69.8 million ops/second
- UUID: ~4.0 million ops/second

2. Multi-Thread Analysis (10 Threads)

Average Time:
- OpenTelemetry: Only increases to ~25 nanoseconds (1.7x increase)
- UUID: Jumps to ~3,761 nanoseconds (15x increase)
- OpenTelemetry is now 150x faster
Throughput:
- OpenTelemetry: Increases to ~400.5 million ops/second (about 5.7x the single-thread figure)
- UUID: Decreases to ~2.7 million ops/second (about two-thirds of the single-thread figure)
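
The two metrics are two views of the same measurement: aggregate throughput should be roughly the thread count divided by the average time per operation. The sketch below cross-checks the table values that way. ThroughputCheck is a hypothetical helper written for this post, not part of JMH.

```java
// Cross-checks reported throughput against reported average time.
// ThroughputCheck is a hypothetical helper; the inputs are the table values above.
final class ThroughputCheck {

    /** Expected aggregate throughput in million ops/s: threads / average time per op. */
    static double expectedMopsPerSecond(double avgNanosPerOp, int threads) {
        double opsPerSecondPerThread = 1_000_000_000.0 / avgNanosPerOp;
        return threads * opsPerSecondPerThread / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.printf("OpenTelemetry, 1 thread:   %.1f%n", expectedMopsPerSecond(14.321, 1));    // 69.8
        System.out.printf("UUID, 1 thread:            %.1f%n", expectedMopsPerSecond(247.141, 1));   // 4.0
        System.out.printf("OpenTelemetry, 10 threads: %.1f%n", expectedMopsPerSecond(24.967, 10));   // 400.5
        System.out.printf("UUID, 10 threads:          %.1f%n", expectedMopsPerSecond(3761.317, 10)); // 2.7
    }
}
```

All four computed values match the reported throughput columns, which is a useful sanity check that the average-time and throughput runs agree with each other.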

3. Key Observations

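Thread Scaling:
- OpenTelemetry scales well but not linearly: throughput rises about 5.7x (from 69.8 to 400.5 million ops/second) with 10 threads
- UUID scales negatively: throughput drops from ~4.0 to ~2.7 million ops/second with 10 threads
Consistency:
- In absolute terms, OpenTelemetry's error margins are small (±0.130 ns and ±1.525 ns) next to UUID's (±8.476 ns and ±65.991 ns)
- Relative to the mean, the picture is more mixed: OpenTelemetry's error is about 0.9% single-threaded and 6.1% with 10 threads, while UUID's is about 3.4% and 1.8%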
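
Error margins and scaling are easier to compare once they are normalised. The sketch below derives two ratios from the table values: the error as a percentage of the score, and the scaling efficiency (multi-thread throughput as a percentage of perfect linear scaling). ResultRatios and its method names are hypothetical helpers for this post, not JMH APIs.

```java
// Normalises the reported scores so the two implementations can be compared fairly.
// ResultRatios is a hypothetical helper; the inputs are the table values above.
final class ResultRatios {

    /** Error margin as a percentage of the score it belongs to. */
    static double relativeErrorPercent(double score, double error) {
        return 100.0 * error / score;
    }

    /** Multi-thread throughput as a percentage of ideal linear scaling. */
    static double scalingEfficiencyPercent(double singleThreadThroughput,
                                           double multiThreadThroughput, int threads) {
        return 100.0 * multiThreadThroughput / (singleThreadThroughput * threads);
    }

    public static void main(String[] args) {
        System.out.printf("OpenTelemetry error, 1 thread:   %.1f%%%n", relativeErrorPercent(14.321, 0.130));
        System.out.printf("OpenTelemetry error, 10 threads: %.1f%%%n", relativeErrorPercent(24.967, 1.525));
        System.out.printf("UUID error, 1 thread:            %.1f%%%n", relativeErrorPercent(247.141, 8.476));
        System.out.printf("UUID error, 10 threads:          %.1f%%%n", relativeErrorPercent(3761.317, 65.991));
        System.out.printf("OpenTelemetry scaling efficiency: %.1f%%%n", scalingEfficiencyPercent(69.8, 400.5, 10));
        System.out.printf("UUID scaling efficiency:          %.1f%%%n", scalingEfficiencyPercent(4.0, 2.7, 10));
    }
}
```

An efficiency of 100% would mean throughput grew in exact proportion to the thread count, so a value well below that indicates contention or hardware limits rather than a measurement problem.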

For a complete breakdown of the results across different JDK versions and a deeper analysis of the real-world impact, please see the follow-up post: Trace ID Generation: A Performance Analysis of UUID vs. OpenTelemetry.

Best Practices

When writing JMH benchmarks, keep these points in mind:

Use Blackhole.consume(), or return the result from the benchmark method, to prevent dead code elimination
Include enough warmup iterations that measurement starts only after JIT compilation has settled
Run multiple forks so that run-to-run JVM variance is reflected in the error margins
Consider external factors like garbage collection, JIT compilation, and other load on the machine
Document your benchmark environment (JVM version, available processors, etc.)
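
For the last point, a few standard system properties cover most of what a reader needs in order to interpret your numbers. The following is a minimal sketch; BenchmarkEnvironment is a hypothetical helper name, and the set of properties is only a suggested starting point.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Collects basic facts about the machine and JVM to publish alongside results.
// BenchmarkEnvironment is a hypothetical helper, not part of JMH.
final class BenchmarkEnvironment {

    /** Returns environment details in a stable, insertion-ordered map. */
    static Map<String, String> describe() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("java.version", System.getProperty("java.version"));
        env.put("java.vm.name", System.getProperty("java.vm.name"));
        env.put("java.vm.version", System.getProperty("java.vm.version"));
        env.put("os.name", System.getProperty("os.name"));
        env.put("os.arch", System.getProperty("os.arch"));
        Runtime runtime = Runtime.getRuntime();
        env.put("availableProcessors", String.valueOf(runtime.availableProcessors()));
        env.put("maxMemoryMiB", String.valueOf(runtime.maxMemory() / (1024 * 1024)));
        return env;
    }

    public static void main(String[] args) {
        describe().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Run it on the same machine and with the same JVM as the benchmark, since values such as the processor count directly affect how multi-threaded results should be read.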

Conclusion

JMH is a powerful tool for measuring and comparing code performance on the JVM. While it requires careful setup and interpretation, it provides valuable insights into code performance characteristics. Remember that microbenchmarks should be one of many tools in your performance testing arsenal, alongside profiling and real-world performance testing.

The example used in this post can be found in the benchmark-trace-id-generator repository.