Semaphore in Java Deep Dive

Semaphore is the coordination primitive for permit-based access.

It is not primarily about protecting one critical section with ownership rules. It is about allowing up to N concurrent acquisitions of some shared capacity.

That makes it valuable for:

resource pools
bounded concurrency
admission control
controlled access to scarce downstream capacity

Problem Statement

Suppose a service can call an external API safely only a limited number of times in parallel.

If 200 request threads all hit that API at once, you may trigger:

timeouts
connection pool exhaustion
downstream throttling
cascading latency spikes

The system needs a coordination boundary that says:

at most N callers may enter this region at the same time

That is the natural domain of Semaphore.

Mental Model

A semaphore manages a number of permits.

Basic flow:

the semaphore starts with N permits
a thread calls acquire() to take one permit
if a permit is available, the thread proceeds
if not, it waits
when work finishes, the thread calls release()
another waiter may now proceed
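
The flow above can be traced in a single thread by watching the permit counter. This is only a minimal sketch: the class name PermitFlowSketch and the method trace() are made up for illustration, and two permits is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

final class PermitFlowSketch {

    // Records availablePermits() after each step so the counter is visible.
    static List<Integer> trace() throws InterruptedException {
        List<Integer> seen = new ArrayList<>();
        Semaphore permits = new Semaphore(2);   // starts with N = 2 permits
        seen.add(permits.availablePermits());

        permits.acquire();                      // a permit is free, so no blocking
        seen.add(permits.availablePermits());

        permits.acquire();                      // takes the last permit
        seen.add(permits.availablePermits());

        // A third acquire() here would block until some thread calls release().
        permits.release();                      // hands one permit back
        seen.add(permits.availablePermits());

        permits.release();                      // back to full capacity
        seen.add(permits.availablePermits());
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(trace());            // prints [2, 1, 0, 1, 2]
    }
}
```

With real contention the waiting happens inside acquire(); the counter itself behaves exactly as in this single-threaded trace.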

This is not the same as a lock:

a lock typically protects exclusive ownership
a semaphore models bounded capacity

That difference matters because it shapes both the API and the use cases.

Core API

Important methods:

new Semaphore(permits): non-fair semaphore
new Semaphore(permits, true): fair semaphore
acquire(): wait indefinitely for a permit
tryAcquire(): attempt without waiting
tryAcquire(timeout, unit): bounded wait
release(): return a permit
availablePermits(): inspect remaining permits

There are also bulk forms such as:

acquire(n)
release(n)

Those are useful when one task consumes more than one unit of shared capacity.
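
To make the method list concrete, here is a small single-threaded walk-through of the blocking, non-blocking, timed, and bulk forms. It is a sketch under assumptions: CoreApiSketch and demo() are illustrative names, and the permit count and the 50 ms timeout are arbitrary.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class CoreApiSketch {

    // Walks through the main calls and returns what each one reported.
    static String demo() throws InterruptedException {
        Semaphore permits = new Semaphore(4);

        permits.acquire(3);                       // bulk: one task takes three units
        int afterBulk = permits.availablePermits();

        boolean gotLast = permits.tryAcquire();   // one permit left, so this succeeds
        boolean immediate = permits.tryAcquire(); // none left: returns false at once
        boolean timed = permits.tryAcquire(50, TimeUnit.MILLISECONDS); // gives up after about 50 ms

        permits.release(3);                       // return the bulk permits
        if (gotLast) {
            permits.release();                    // return the single permit
        }
        return afterBulk + " " + gotLast + " " + immediate + " " + timed
                + " " + permits.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());               // prints: 1 true false false 4
    }
}
```

Note that the caller is responsible for releasing the same number of permits it took; the semaphore does not track that for you.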

Runnable Example

This example limits concurrent downstream calls to three at a time even if many client requests arrive together.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class SemaphoreDemo {

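    public static void main(String[] args) throws Exception {
        // Three permits: at most three downstream calls run at once.
        DownstreamClient client = new DownstreamClient(3);
        // Eight worker threads, so up to five of them wait inside acquire().
        ExecutorService executor = Executors.newFixedThreadPool(8);
        try {
            List<Future<String>> futures = new ArrayList<>();

            for (int i = 1; i <= 8; i++) {
                final int requestId = i;
                futures.add(executor.submit(() -> client.fetch("request-" + requestId)));
            }

            // get() blocks until the call finishes and surfaces any failure
            // as an ExecutionException.
            for (Future<String> future : futures) {
                System.out.println(future.get());
            }
        } finally {
            // Shut down even if a call failed, so the JVM can exit.
            executor.shutdown();
        }
    }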

    static final class DownstreamClient {
        private final Semaphore permits;

        DownstreamClient(int maxConcurrentCalls) {
            this.permits = new Semaphore(maxConcurrentCalls);
        }

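        String fetch(String requestId) throws InterruptedException {
            // Blocks until a permit is free. Acquire before the try block so
            // that finally releases only a permit that was actually taken.
            permits.acquire();
            try {
                // availablePermits() is a snapshot and may already be stale.
                System.out.println("Calling downstream for " + requestId
                        + ", permits left=" + permits.availablePermits());
                TimeUnit.MILLISECONDS.sleep(300); // simulated downstream latency
                return "response-for-" + requestId;
            } finally {
                permits.release();
            }
        }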
    }
}

The invariant here is simple:

at most three calls are in flight at once

This is not a shared-state mutation problem. It is a bounded-access problem.
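
One way to check that invariant rather than trust it is to count how many tasks are inside the guarded region and record the peak. The sketch below assumes hypothetical names (InFlightCheck, maxInFlight) and a 20 ms sleep standing in for the downstream call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class InFlightCheck {

    // Runs `tasks` jobs on `threads` workers behind a semaphore and returns
    // the highest number of jobs observed inside the guarded region.
    static int maxInFlight(int limit, int threads, int tasks) throws InterruptedException {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    permits.acquire();
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        TimeUnit.MILLISECONDS.sleep(20);   // simulated downstream call
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                    return null;   // Callable form, so acquire() and sleep() may throw
                });
            }
        } finally {
            executor.shutdown();   // no new tasks; queued ones still run
        }
        if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // With a limit of 3 the reported peak can never exceed 3.
        System.out.println("peak in flight = " + maxInFlight(3, 8, 16));
    }
}
```

The peak is guaranteed to be at most the permit count. It will usually equal it when there are more workers than permits, but that part depends on scheduling.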

Fair vs Non-Fair Semaphores

Like locks, semaphores can be fair or non-fair.

Non-fair semaphores:

usually provide better throughput
may allow barging by newly arriving threads

Fair semaphores:

serve waiters in a more ordered way
may reduce starvation risk
often cost more in throughput

The same practical rule applies here as with fair locks:

choose fairness only when the workload and latency behavior justify it

Do not enable it automatically because it sounds morally cleaner.
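
One fairness detail is easy to miss: the untimed tryAcquire() takes a free permit even on a fair semaphore with waiters queued, while tryAcquire(0, TimeUnit.SECONDS) respects the queue. The sketch below tries to show that deterministically by parking a waiter that needs two permits when only one exists. FairnessSketch and probeWhileQueued() are illustrative names, not part of any library.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class FairnessSketch {

    // Returns {tryAcquire(0, SECONDS), tryAcquire()} as observed on a fair
    // semaphore while another thread is already queued.
    static boolean[] probeWhileQueued() throws InterruptedException {
        Semaphore fair = new Semaphore(1, true);

        Thread waiter = new Thread(() -> {
            try {
                fair.acquire(2);      // needs two permits, only one exists: queues
                fair.release(2);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        while (!fair.hasQueuedThreads()) {
            Thread.onSpinWait();      // wait until the waiter is in the queue
        }

        boolean timedZero = fair.tryAcquire(0, TimeUnit.SECONDS); // honors the queue
        boolean untimed = fair.tryAcquire();                      // barges past it
        if (untimed) {
            fair.release();
        }
        waiter.interrupt();           // end the demo without granting two permits
        waiter.join();
        return new boolean[] {timedZero, untimed};
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(new Semaphore(3).isFair() + " " + new Semaphore(3, true).isFair());
        boolean[] r = probeWhileQueued();
        System.out.println("timed zero-wait=" + r[0] + ", untimed=" + r[1]);
    }
}
```

Here the zero-timeout call should report false and the untimed call true. If a code path relies on fairness, the untimed tryAcquire() is the one to avoid.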

Production-Style Use Cases

Strong fits include:

limiting concurrent HTTP calls to one dependency
protecting a finite number of device or socket slots
allowing only a small number of expensive report jobs at once
capping image transcoding or PDF rendering concurrency

Examples:

only 20 concurrent invoice exports per node
only 5 concurrent full-cache rebuilds
only 50 active uploads through a memory-heavy path

These all represent capacity management rather than mutual exclusion.

Common Mistakes

Forgetting `release()` in `finally`

This is the semaphore version of forgetting to unlock.

If a permit is leaked, capacity shrinks over time until callers block forever or time out.
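
The leak is easy to reproduce. In this sketch (PermitLeakSketch, leaky(), and safe() are hypothetical names), the same failing task runs twice through each variant and the remaining capacity is compared.

```java
import java.util.concurrent.Semaphore;

final class PermitLeakSketch {

    // Leaky: if work throws, release() is skipped and the permit is lost.
    static void leaky(Semaphore permits, Runnable work) throws InterruptedException {
        permits.acquire();
        work.run();
        permits.release();
    }

    // Safe: acquire() sits before the try, so release() runs exactly when a
    // permit was really obtained, whether or not work throws.
    static void safe(Semaphore permits, Runnable work) throws InterruptedException {
        permits.acquire();
        try {
            work.run();
        } finally {
            permits.release();
        }
    }

    // Runs a failing task twice and reports how many permits are left.
    static int permitsAfterFailures(boolean useSafe) throws InterruptedException {
        Semaphore permits = new Semaphore(3);
        Runnable failing = () -> {
            throw new IllegalStateException("downstream failed");
        };
        for (int i = 0; i < 2; i++) {
            try {
                if (useSafe) {
                    safe(permits, failing);
                } else {
                    leaky(permits, failing);
                }
            } catch (IllegalStateException expected) {
                // the caller sees the failure either way
            }
        }
        return permits.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leaky: " + permitsAfterFailures(false)); // prints leaky: 1
        System.out.println("safe:  " + permitsAfterFailures(true));  // prints safe:  3
    }
}
```

Two failures cost the leaky variant two of its three permits. One more and every later caller would block.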

Treating it like a lock

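A semaphore does not communicate exclusive ownership the way a mutex does. Permits are not tied to threads: any thread may call release(), including one that never called acquire().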

If your requirement is “only one thread may mutate this invariant at a time,” a lock is usually the clearer tool.
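
The lack of ownership is visible in a few lines. The sketch below (NoOwnershipSketch is an illustrative name) releases a permit that was never acquired, then tries the equivalent on a ReentrantLock for contrast.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

final class NoOwnershipSketch {

    // release() without a matching acquire() is accepted and simply adds a
    // permit, so capacity silently grows past the initial value.
    static int releaseWithoutAcquire() {
        Semaphore permits = new Semaphore(1);
        permits.release();
        return permits.availablePermits();
    }

    // A lock tracks its owner: unlocking a lock you do not hold fails.
    static boolean lockRejectsForeignUnlock() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(releaseWithoutAcquire());     // prints 2
        System.out.println(lockRejectsForeignUnlock());  // prints true
    }
}
```

This is why a stray release() on a semaphore is a quiet bug: nothing throws, and the bound you thought you had is no longer enforced.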

Using indefinite waits everywhere

In service code, tryAcquire(timeout, unit) is often safer than acquire() because it lets overload fail fast instead of silently piling up waiters.
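
A bounded wait might look like the sketch below. BoundedWaitClient, its constructor parameters, and the choice of RejectedExecutionException as the overload signal are all assumptions made for illustration; a real service would pick its own exception type and timeout.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class BoundedWaitClient {
    private final Semaphore permits;
    private final long maxWaitMillis;

    BoundedWaitClient(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Waits at most maxWaitMillis for a permit, then fails instead of queueing.
    String call(Supplier<String> downstream) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // keep the interrupt visible to callers
            throw new RejectedExecutionException("interrupted while waiting for a permit", e);
        }
        if (!acquired) {
            throw new RejectedExecutionException("no permit within " + maxWaitMillis + " ms");
        }
        try {
            return downstream.get();
        } finally {
            permits.release();
        }
    }

    // The outer call holds the only permit, so the nested call cannot get one
    // and is rejected after the timeout.
    static String demo() {
        BoundedWaitClient client = new BoundedWaitClient(1, 20);
        return client.call(() -> {
            try {
                return client.call(() -> "inner ran");
            } catch (RejectedExecutionException e) {
                return "inner rejected";
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints inner rejected
    }
}
```

The nested call is only a compact way to simulate a full semaphore in one thread. In real code, acquiring the same semaphore twice on one call path is itself a common source of self-inflicted stalls.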

Confusing concurrency limiting with rate limiting

A semaphore naturally limits how many operations run at once. It does not by itself guarantee “only X requests per second.”
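
A quick way to see the difference: even with a single permit, operations pass through as fast as they finish. The sketch below counts how many trivial operations complete inside a short window; NotARateLimit and completedWithin() are illustrative names, and the exact count depends on the machine.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class NotARateLimit {

    // Counts operations completed behind a one-permit semaphore in a window.
    static int completedWithin(long windowMillis) {
        Semaphore permits = new Semaphore(1);     // only one operation in flight
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(windowMillis);
        int completed = 0;
        while (deadline - System.nanoTime() > 0) {
            permits.acquireUninterruptibly();
            try {
                completed++;                      // a very fast "operation"
            } finally {
                permits.release();
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        // Typically a very large number: concurrency was capped, throughput was not.
        System.out.println("completed in 50 ms: " + completedWithin(50));
    }
}
```

Concurrency never exceeded one, yet nothing bounded the number of operations per second. A time-based limit needs a different mechanism.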

That distinction is important enough that the next semaphore post expands it separately.

Testing and Debugging Notes

Good semaphore diagnostics include:

current queueing behavior (getQueueLength() gives an estimate of waiting threads)
permit leaks
time spent waiting to acquire
paths that time out frequently
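
A thin wrapper is one way to collect these signals. The sketch below is an assumption-heavy illustration: InstrumentedSemaphore and its snapshot() format are invented here, and a real service would export the numbers to its metrics system rather than build a string.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.LongAdder;

final class InstrumentedSemaphore {
    private final Semaphore permits;
    private final int initialPermits;
    private final LongAdder totalWaitNanos = new LongAdder();
    private final LongAdder acquisitions = new LongAdder();

    InstrumentedSemaphore(int initialPermits) {
        this.permits = new Semaphore(initialPermits);
        this.initialPermits = initialPermits;
    }

    // Same contract as Semaphore.acquire(), plus wait-time accounting.
    void acquire() throws InterruptedException {
        long start = System.nanoTime();
        permits.acquire();
        totalWaitNanos.add(System.nanoTime() - start);
        acquisitions.increment();
    }

    void release() {
        permits.release();
    }

    // Permits currently held. A value that stays above zero while the
    // system is idle suggests a leak.
    int inUse() {
        return initialPermits - permits.availablePermits();
    }

    String snapshot() {
        return "available=" + permits.availablePermits()
                + " queued~=" + permits.getQueueLength()   // estimate only
                + " acquisitions=" + acquisitions.sum()
                + " totalWaitNanos=" + totalWaitNanos.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        InstrumentedSemaphore s = new InstrumentedSemaphore(2);
        s.acquire();
        System.out.println(s.snapshot());
        s.release();
        System.out.println("in use after release: " + s.inUse());
    }
}
```

Average wait per acquisition is totalWaitNanos divided by acquisitions; a rising average usually shows saturation before timeouts do.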

In tests, semaphores are useful for deliberately controlling concurrency:

allow only one or two operations in
block the next caller
assert that overload handling behaves correctly
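
Those three steps can be sketched as a small deterministic scenario. Everything here is hypothetical: Limiter stands in for the code under test, and a zero-permit semaphore named gate is used by the test itself to hold a worker inside the limited region.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

final class OverloadTestSketch {

    // Hypothetical system under test: runs the task only if a slot is free.
    static final class Limiter {
        private final Semaphore slots;

        Limiter(int maxConcurrent) {
            this.slots = new Semaphore(maxConcurrent);
        }

        boolean tryRun(Runnable task) {
            if (!slots.tryAcquire()) {
                return false;
            }
            try {
                task.run();
                return true;
            } finally {
                slots.release();
            }
        }
    }

    // Returns {accepted while full, accepted after the slot was freed}.
    static boolean[] scenario() throws InterruptedException {
        Limiter limiter = new Limiter(1);
        Semaphore gate = new Semaphore(0);              // test-controlled hold
        CountDownLatch entered = new CountDownLatch(1);

        Thread worker = new Thread(() -> limiter.tryRun(() -> {
            entered.countDown();
            gate.acquireUninterruptibly();              // stay in the slot until told to go
        }));
        worker.start();
        entered.await();                                // the only slot is now occupied

        boolean whileFull = limiter.tryRun(() -> { });  // overload path
        gate.release();                                 // let the worker finish
        worker.join();
        boolean afterFree = limiter.tryRun(() -> { });  // normal path again
        return new boolean[] {whileFull, afterFree};
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = scenario();
        System.out.println("while full=" + r[0] + ", after free=" + r[1]);
        // prints while full=false, after free=true
    }
}
```

Because the test decides when the slot is occupied and when it is freed, there are no sleeps and no timing assumptions.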

If a system appears stuck around a semaphore, investigate:

missing release()
too few initial permits
permit acquisition on code paths that should not block
timeouts being swallowed and retried badly

Performance and Trade-Offs

Semaphores are powerful because they give you a direct way to bound concurrency.

But bounding concurrency is not automatically enough.

You still need to ask:

what should happen when permits are exhausted: wait, fail fast, queue elsewhere, or degrade gracefully?

That policy question is often more important than the semaphore itself.

Semaphores control admission. They do not decide overload strategy for you.
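
As one illustration of keeping the policy separate from the semaphore, the sketch below passes the degraded path in as a parameter. OverloadPolicySketch, fetchOrDegrade(), and the "cached" and "live" strings are invented for the example; other policies such as a bounded wait or a queue would replace the first branch.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class OverloadPolicySketch {

    // Policy shown here: never wait. If no permit is free, serve the degraded
    // result instead of the live one.
    static String fetchOrDegrade(Semaphore permits,
                                 Supplier<String> live,
                                 Supplier<String> degraded) {
        if (!permits.tryAcquire()) {
            return degraded.get();
        }
        try {
            return live.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        Semaphore exhausted = new Semaphore(0);   // simulates a saturated dependency
        Semaphore free = new Semaphore(1);
        System.out.println(fetchOrDegrade(exhausted, () -> "live", () -> "cached")); // prints cached
        System.out.println(fetchOrDegrade(free, () -> "live", () -> "cached"));      // prints live
    }
}
```

The semaphore only answers whether capacity is available. What the caller does with a "no" is a separate design decision.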

Decision Guide

Use Semaphore when:

the core concept is permits or slots
several callers may proceed, but only up to a bounded number
you need concurrency limiting rather than one-thread ownership

Do not use it when:

the real problem is a shared invariant that needs exclusive mutation
you need repeated round synchronization among peers
you actually want task completion or future composition

Choose:

a lock for exclusive critical sections
CountDownLatch for one-shot group coordination, CyclicBarrier or Phaser for repeated rounds
CompletableFuture for async result flows

Key Takeaways

Semaphore is the permit-based tool for bounding concurrent access to scarce capacity.
It is excellent for resource slots, limiting concurrent downstream calls, and admission control.
Always pair acquire() with release() in finally.
It limits concurrency naturally, but by itself it does not implement precise time-based rate limiting.

Next post: Binary Semaphore vs Counting Semaphore in Java

Sandeep Bhardwaj
