Thread dumps are excellent snapshots. Java Flight Recorder (JFR) is better when you need a time-based story, and that is what makes it so useful for concurrency diagnostics.
It helps answer questions like:
- where are threads blocking over time
- which locks are hot
- what kinds of stalls are recurring
- when did the slowdown start
Those are hard questions to answer from one static dump alone.
Problem Statement
Many concurrency incidents are not single moments. They are patterns over time:
- rising lock contention
- frequent thread parking
- bursts of blocked monitors
- starvation around pool saturation
If you only inspect one dump, you may miss the timeline.
JFR adds time, frequency, and event context to the diagnosis.
Mental Model
Think of JFR as low-overhead event recording for JVM behavior.
For concurrency work, it can help surface:
- monitor contention
- thread park behavior
- blocking patterns
- execution hotspots near contended code paths
It is strongest when you need to correlate:
- application slowdown
- JVM thread behavior
- lock or waiting patterns
over an interval rather than at one instant.
Useful Commands
One common way to start a recording is:
jcmd <pid> JFR.start name=concurrency settings=profile duration=5m filename=concurrency.jfr
You can then inspect the recording in tools that understand JFR data, such as JDK Mission Control or the jfr command-line tool bundled with recent JDKs.
The exact tooling matters less than the workflow:
- capture during or near the incident
- inspect blocking- and contention-related events
- correlate with the time window of bad behavior
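The inspection step does not require a GUI. As a minimal sketch, assuming JDK 11 or later and the file name used in the command above, the JDK's own jdk.jfr.consumer API can tally how many events of each type a recording contains. The class name JfrEventCounts is illustrative and not part of any library.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Illustrative helper (hypothetical name): counts events per event type in a .jfr file.
final class JfrEventCounts {

    static Map<String, Long> countByType(Path file) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        // RecordingFile streams events one at a time, so large recordings
        // do not have to fit in memory.
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: defaults to the file name from the jcmd example.
        Path file = Path.of(args.length > 0 ? args[0] : "concurrency.jfr");
        countByType(file).forEach((type, n) -> System.out.println(n + "\t" + type));
    }
}
```

A high count for types such as jdk.JavaMonitorEnter or jdk.ThreadPark is a hint about where to look next, not a diagnosis on its own.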
What to Look For
Useful concurrency-oriented questions include:
- which threads are parking frequently
- whether lock contention is concentrated on a few classes or methods
- whether the application is spending time blocked rather than computing
- whether pool threads appear underutilized or stuck behind waiting dependencies
JFR helps you find repeated patterns, not just one dramatic stack trace.
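One way to turn these questions into numbers is to sum blocked and parked time per thread. The sketch below assumes the built-in event types jdk.JavaMonitorEnter and jdk.ThreadPark are enabled in the recording; the class and method names are hypothetical. Events shorter than the configured threshold are not recorded, so the totals should be read as a lower bound.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedThread;
import jdk.jfr.consumer.RecordingFile;

// Illustrative helper (hypothetical name): total monitor-wait and park time per thread.
final class BlockedTimeByThread {

    // Event types treated as "blocked rather than computing" in this sketch.
    static final Set<String> BLOCKING_EVENTS = Set.of("jdk.JavaMonitorEnter", "jdk.ThreadPark");

    // Adds one observed duration to the running total for a thread.
    static void add(Map<String, Duration> totals, String threadName, Duration duration) {
        totals.merge(threadName, duration, Duration::plus);
    }

    static Map<String, Duration> fromRecording(Path file) throws IOException {
        Map<String, Duration> totals = new TreeMap<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!BLOCKING_EVENTS.contains(event.getEventType().getName())) {
                    continue;
                }
                // getThread() may return null if the thread was not recorded.
                RecordedThread thread = event.getThread();
                String name = (thread == null || thread.getJavaName() == null)
                        ? "<unknown>" : thread.getJavaName();
                add(totals, name, event.getDuration());
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "concurrency.jfr");
        fromRecording(file).forEach((thread, total) ->
                System.out.println(total.toMillis() + " ms\t" + thread));
    }
}
```

Park time needs interpretation: an idle pool worker waiting for work also parks, so large totals matter most on threads that were supposed to be busy during the slowdown.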
Why JFR Is Often Better Than Guessing
Teams often jump from:
- “latency is bad”
to:
- “we need more threads”
or:
- “the database must be slow”
JFR is useful because it replaces intuition with evidence about:
- actual blocking
- actual contention
- actual waiting behavior
That narrows the root-cause search much faster.
Common Mistakes
Capturing a recording without a clear incident window
You still need context from metrics and timestamps.
Looking only at CPU hotspots
Concurrency issues are often about blocked time, not just hot methods.
Using JFR without thread names or pool clarity
Good thread naming makes event interpretation far easier.
Treating JFR as a replacement for thread dumps
It complements dumps. It does not make them obsolete.
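For the thread-naming mistake above, a small ThreadFactory is usually enough to make pool ownership visible in JFR events and thread dumps alike. This is a minimal sketch; the class name NamedThreadFactory and the pool name "order-db" are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative factory (hypothetical name): threads are named "<pool>-<n>".
final class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger nextId = new AtomicInteger(1);

    NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable task) {
        // The name shows up as the thread name on recorded events.
        return new Thread(task, poolName + "-" + nextId.getAndIncrement());
    }

    public static void main(String[] args) {
        // Assumption: "order-db" stands in for whatever the pool actually serves.
        ExecutorService pool = Executors.newFixedThreadPool(4, new NamedThreadFactory("order-db"));
        pool.submit(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

With names like these, a cluster of parked or blocked events can be attributed to a specific pool at a glance instead of to an anonymous pool-3-thread-7.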
Practical Guidance
Use JFR when:
- the system is slow but not obviously crashed
- contention is suspected
- thread dumps alone feel too static
- you need evidence from a period of time
For the strongest diagnosis, combine:
- JFR recording
- thread dumps
- executor metrics
- request latency graphs
Concurrency incidents are rarely explained by one signal alone.
A Practical Capture Strategy
JFR is most useful when capture is intentional. Instead of starting recordings randomly after the system is already back to normal, define a simple incident playbook:
- when latency crosses a threshold, capture a short recording
- keep thread names, pool names, and deployment version available alongside it
- align the recording window with application metrics and logs
That turns JFR from an expert-only tool into a repeatable operational step. A good capture strategy is less about one perfect command and more about collecting evidence while the behavior is actually happening.
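The playbook can also be wired into the application itself. The sketch below is one possible shape under stated assumptions: latency is observed somewhere in application code, the threshold and cooldown values are chosen by the team, and the class name LatencyTriggeredCapture is hypothetical. It uses the JDK's jdk.jfr.Recording API with the built-in "profile" configuration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Illustrative trigger (hypothetical name): decides when a short capture is warranted.
final class LatencyTriggeredCapture {
    private final long thresholdMillis;
    private final long cooldownMillis;
    private Long lastCaptureAtMillis; // null until the first capture

    LatencyTriggeredCapture(long thresholdMillis, long cooldownMillis) {
        this.thresholdMillis = thresholdMillis;
        this.cooldownMillis = cooldownMillis;
    }

    // Returns true at most once per cooldown period while latency is above the threshold.
    synchronized boolean shouldCapture(long observedLatencyMillis, long nowMillis) {
        if (observedLatencyMillis < thresholdMillis) {
            return false;
        }
        if (lastCaptureAtMillis != null && nowMillis - lastCaptureAtMillis < cooldownMillis) {
            return false;
        }
        lastCaptureAtMillis = nowMillis;
        return true;
    }

    // Starts a time-limited recording; JFR writes it to 'destination' when it stops.
    static Recording startShortRecording(Path destination, Duration length)
            throws IOException, ParseException {
        Recording recording = new Recording(Configuration.getConfiguration("profile"));
        recording.setName("latency-incident");
        recording.setDuration(length);
        recording.setDestination(destination);
        recording.start();
        return recording;
    }
}
```

A caller might check shouldCapture(latencyMillis, System.currentTimeMillis()) wherever latency is measured and, on true, call startShortRecording with a file name that embeds the deployment version and timestamp. The cooldown keeps a sustained incident from producing a pile of overlapping recordings.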
Correlate JFR with Other Signals
JFR events become much more informative when you line them up with:
- request latency spikes
- pool queue growth
- GC activity
- database or HTTP client error bursts
This matters because blocked time is often only the visible symptom. The root cause may sit in a downstream dependency or in one overloaded executor. JFR gives the thread-behavior side of the story; the rest of the telemetry tells you why the story unfolded that way.
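Lining signals up is mostly a matter of restricting the recording to the same time window the other telemetry points at. A minimal sketch, assuming the incident window comes from your metrics system as two java.time.Instant values; the class name IncidentWindowFilter is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Illustrative filter (hypothetical name): keeps events that overlap an incident window.
final class IncidentWindowFilter {

    // True when [eventStart, eventEnd] and [windowStart, windowEnd] share any instant.
    static boolean overlaps(Instant eventStart, Instant eventEnd,
                            Instant windowStart, Instant windowEnd) {
        return !eventEnd.isBefore(windowStart) && !eventStart.isAfter(windowEnd);
    }

    static List<RecordedEvent> eventsInWindow(Path file, Instant windowStart, Instant windowEnd)
            throws IOException {
        List<RecordedEvent> hits = new ArrayList<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (overlaps(event.getStartTime(), event.getEndTime(), windowStart, windowEnd)) {
                    hits.add(event);
                }
            }
        }
        return hits;
    }
}
```

Using overlap rather than "started inside the window" matters for long blocking events: a thread that began waiting before the latency spike and was still waiting during it is part of the story.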
Second Command Example: Start, Dump, and Stop
Another practical capture shape is to start a named recording, dump it when the incident window is active, and then stop it cleanly.
jcmd <pid> JFR.start name=incident settings=profile
jcmd <pid> JFR.dump name=incident filename=incident.jfr
jcmd <pid> JFR.stop name=incident
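This approach is useful when the incident timing is uncertain and you want more control than a fixed duration gives you. Note that JFR.stop without a filename discards whatever was recorded after the dump, so dump first if you need the data.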
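The same start, dump, stop shape is available in-process through jdk.jfr.Recording, which can help when jcmd access to the host is awkward. This is a sketch rather than a recommended production setup: the class name ManualIncidentRecording is hypothetical, and the sleep simply stands in for "wait until the incident window is active".

```java
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Illustrative wrapper (hypothetical name) around the start / dump / stop sequence.
final class ManualIncidentRecording {

    static Path captureWindow(Path output, Duration window) throws Exception {
        // Same built-in settings as settings=profile in the jcmd commands.
        try (Recording recording = new Recording(Configuration.getConfiguration("profile"))) {
            recording.setName("incident");
            recording.start();

            // Placeholder for "the incident is happening now".
            Thread.sleep(window.toMillis());

            // Snapshot what has been recorded so far; the recording keeps running.
            recording.dump(output);

            // Stop cleanly; close() in try-with-resources releases the recording.
            recording.stop();
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        Path written = captureWindow(Path.of("incident.jfr"), Duration.ofSeconds(30));
        System.out.println("Wrote " + written.toAbsolutePath());
    }
}
```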
Key Takeaways
- JFR adds a time dimension to concurrency diagnosis that thread dumps alone do not provide.
- It is especially useful for recurring blocking, contention, and parking patterns.
- JFR works best when paired with metrics, good thread naming, and incident timestamps.
- Use it to replace guesswork with evidence about how threads actually behaved over the slowdown window.