Thread dumps are excellent snapshots. JFR is better when you need a time-based story.

That is why Java Flight Recorder is so useful for concurrency diagnostics.

It helps answer questions like:

  • where are threads blocking over time
  • which locks are hot
  • what kinds of stalls are recurring
  • when did the slowdown start

Those are hard questions to answer from one static dump alone.


Problem Statement

Many concurrency incidents are not single moments. They are patterns over time:

  • rising lock contention
  • frequent thread parking
  • bursts of blocked monitors
  • starvation around pool saturation

If you only inspect one dump, you may miss the timeline.

JFR adds time, frequency, and event context to the diagnosis.


Mental Model

Think of JFR as low-overhead event recording for JVM behavior.

For concurrency work, it can help surface:

  • monitor contention
  • thread park behavior
  • blocking patterns
  • execution hotspots near contended code paths

It is strongest when you need to correlate:

  • application slowdown
  • JVM thread behavior
  • lock or waiting patterns

over an interval rather than at one instant.


Useful Commands

One common way to start a recording is:

jcmd <pid> JFR.start name=concurrency settings=profile duration=5m filename=concurrency.jfr

You can then inspect the recording in tools that understand JFR data.

The exact tooling matters less than the workflow:

  • capture during or near the incident
  • inspect blocking and contention related events
  • correlate with the time window of bad behavior

What to Look For

Useful concurrency-oriented questions include:

  • which threads are parking frequently
  • whether lock contention is concentrated on a few classes or methods
  • whether the application is spending time blocked rather than computing
  • whether pool threads appear underutilized or stuck behind waiting dependencies

JFR helps you find repeated patterns, not just one dramatic stack trace.


Why JFR Is Often Better Than Guessing

Teams often jump from:

  • “latency is bad”

to:

  • “we need more threads”

or:

  • “the database must be slow”

JFR is useful because it replaces intuition with evidence about:

  • actual blocking
  • actual contention
  • actual waiting behavior

That narrows root-cause search much faster.


Common Mistakes

Capturing a recording without a clear incident window

You still need context from metrics and timestamps.

Looking only at CPU hotspots

Concurrency issues are often about blocked time, not just hot methods.

Using JFR without thread names or pool clarity

Good thread naming makes event interpretation far easier.

Treating JFR as a replacement for thread dumps

It complements dumps. It does not make them obsolete.


Practical Guidance

Use JFR when:

  • the system is slow but not obviously crashed
  • contention is suspected
  • thread dumps alone feel too static
  • you need evidence from a period of time

For the strongest diagnosis, combine:

  • JFR recording
  • thread dumps
  • executor metrics
  • request latency graphs

Concurrency incidents are rarely explained by one signal alone.


A Practical Capture Strategy

JFR is most useful when capture is intentional. Instead of starting recordings randomly after the system is already back to normal, define a simple incident playbook:

  • when latency crosses a threshold, capture a short recording
  • keep thread names, pool names, and deployment version available alongside it
  • align the recording window with application metrics and logs

That turns JFR from an expert-only tool into a repeatable operational step. A good capture strategy is less about one perfect command and more about collecting evidence while the behavior is actually happening.

Correlate JFR with Other Signals

JFR events become much more informative when you line them up with:

  • request latency spikes
  • pool queue growth
  • GC activity
  • database or HTTP client error bursts

This matters because blocked time is often only the visible symptom. The root cause may sit in a downstream dependency or in one overloaded executor. JFR gives the thread-behavior side of the story; the rest of the telemetry tells you why the story unfolded that way.

Second Command Example: Start, Dump, and Stop

Another practical capture shape is to start a named recording, dump it when the incident window is active, and then stop it cleanly.

jcmd <pid> JFR.start name=incident settings=profile
jcmd <pid> JFR.dump name=incident filename=incident.jfr
jcmd <pid> JFR.stop name=incident

This scenario is useful when the incident timing is uncertain and you want more control than a fixed duration gives you.

Key Takeaways

  • JFR adds a time dimension to concurrency diagnosis that thread dumps alone do not provide.
  • It is especially useful for recurring blocking, contention, and parking patterns.
  • JFR works best when paired with metrics, good thread naming, and incident timestamps.
  • Use it to replace guesswork with evidence about how threads actually behaved over the slowdown window.

Next post: Lock Contention Profiling in Java

Comments