When a concurrent system slows down, the real problem is often not “too little CPU.”
It is:
- too much waiting on the same lock
Lock contention profiling is about identifying where that waiting happens and which critical sections are creating it.
This is one of the most valuable performance diagnostics in multithreaded Java systems.
Problem Statement
Suppose throughput is low even though:
- CPUs are not fully saturated
- thread pools are busy
- queues are growing
One likely explanation is that many threads are serialized behind:
- one monitor
- one
ReentrantLock - one overly large critical section
If you do not identify that lock precisely, tuning thread counts usually will not help.
What Contention Usually Means
A contended lock is often a sign of one of these problems:
- too much work inside the critical section
- one shared hotspot protecting too much state
- wrong concurrency primitive for the workload
- accidental lock convoy behavior
The fix is not always “remove the lock.” Sometimes it is:
- shrink the critical section
- partition the state
- change data structures
- reduce blocking while holding the lock
Diagnostic Tools
Useful sources include:
- JFR lock or monitor-related events
- thread dumps showing many threads blocked on the same monitor
- application metrics around throughput and queueing
The goal is to connect:
- observed slowdown
to:
- specific lock ownership and blocked code paths
That is much more actionable than a vague statement that “threads are slow.”
Runnable Example of a Hot Lock Shape
public class ContentionDemo {
private final Object lock = new Object();
private int counter;
public void increment() {
synchronized (lock) {
counter++;
expensiveWorkInsideLock();
}
}
private void expensiveWorkInsideLock() {
try {
Thread.sleep(10);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
The lock here does not just protect counter.
It also serializes the expensive delay.
That is the classic shape of contention amplification.
What to Look For During Profiling
Focus on:
- the most contended monitors or locks
- methods repeatedly appearing in blocked stacks
- code paths that hold locks while performing slow work
Look for patterns such as:
- logging inside synchronized sections
- remote calls while holding locks
- large object graph mutations under one mutex
These are recurring production problems.
Common Mistakes
Assuming contention means locking is always wrong
Some locking is necessary and healthy. The issue is excessive serialization.
Chasing thread count before contention data
More threads can make contention worse.
Optimizing tiny locks instead of the hotspot
Most services have only a few critical contention points that matter materially.
Ignoring lock scope
Sometimes the same lock is correct, but the protected region is too large.
Practical Fix Directions
Once you identify the hotspot, common remedies include:
- move slow work outside the lock
- partition by key instead of using one global lock
- use concurrent collections or atomics where appropriate
- revisit invariants to reduce shared mutable state
The correct fix depends on the underlying ownership model, not on lock aversion.
What Healthy and Unhealthy Contention Look Like
Not all contention is a bug. A healthy system may show short waits on small critical sections while still maintaining good throughput and predictable latency. Unhealthy contention usually looks different:
- many threads pile up behind the same monitor or lock
- the owner spends too long inside the critical section
- queue depth and tail latency rise together
- adding threads stops helping and may even hurt
That distinction matters because the fix is not always “remove the lock.” Sometimes the right answer is to shrink the protected work, split the state, or move blocking I/O out of the critical section.
A Step by Step Investigation Flow
A practical investigation sequence is:
- identify the hottest contended lock or monitor
- inspect what work happens while holding it
- check how often that path executes and under which workload
- decide whether the state can be partitioned, ownership changed, or the critical section shortened
Profiling should lead to a design decision, not just to a screenshot of a hot lock. The important outcome is understanding why that piece of shared state became a bottleneck in the first place.
Key Takeaways
- Lock contention profiling helps find the real serialization points that limit throughput.
- The usual problem is too much work under one lock, not merely the existence of locking.
- JFR, thread dumps, and executor metrics together give a much clearer picture than any one signal alone.
- Fixes usually involve shrinking lock scope, partitioning state, or choosing a better concurrency design.
Next post: Benchmarking Concurrency Correctly with JMH
Comments