Knowing that interruption exists is not enough. Long-running tasks need to be designed so interruption actually works in practice.

This post takes the previous interruption article one step further: how to structure loops, retries, and blocking work so cancellation is real rather than decorative.


Problem Statement

A background task may run for:

  • minutes
  • hours
  • the entire lifetime of the service

Examples:

  • polling loops
  • batch processors
  • retry workers
  • queue consumers

If those tasks ignore interruption or respond too slowly, shutdown becomes messy and resource cleanup becomes unreliable.


Naive Version

Here is a bad long-running task:

class BadWorker implements Runnable {
    @Override
    public void run() {
        while (true) {
            doWork();
        }
    }
}

Problems:

  • no exit condition
  • no interruption checks
  • no coordination with shutdown

This is not a manageable production task.


Correct Mental Model

A long-running concurrent task should make these choices explicit:

  1. where can cancellation be observed?
  2. what blocking calls may be interrupted?
  3. what cleanup is required before exit?
  4. what work may be abandoned and what must complete?

Interruption-aware design is not only about syntax. It is about defining a safe stop policy.


Runnable Example

import java.util.concurrent.TimeUnit;

public class InterruptionAwareWorkerDemo {

    public static void main(String[] args) throws Exception {
        Thread worker = new Thread(new BatchWorker(), "batch-worker");
        worker.start();

        TimeUnit.SECONDS.sleep(3);
        worker.interrupt();
        worker.join();
    }

    static final class BatchWorker implements Runnable {
        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    fetchBatch();
                    processBatch();
                    waitBeforeNextPoll();
                }
            } finally {
                cleanup();
            }
        }

        void fetchBatch() {
            System.out.println("Fetching batch on " + Thread.currentThread().getName());
        }

        void processBatch() {
            for (int i = 0; i < 5; i++) {
                if (Thread.currentThread().isInterrupted()) {
                    System.out.println("Interrupted during processing");
                    return;
                }
                busyCpu(120);
            }
        }

        void waitBeforeNextPoll() {
            try {
                TimeUnit.MILLISECONDS.sleep(700);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        void cleanup() {
            System.out.println("Cleaning up worker resources");
        }
    }

    static void busyCpu(long millis) {
        long end = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        while (System.nanoTime() < end) {
            // spin
        }
    }
}

This example shows several important ideas:

  • loop checks interruption at boundary
  • CPU-heavy work checks interruption explicitly
  • blocking wait restores interrupted status
  • cleanup happens in finally

That is much closer to real production shutdown behavior.


Production-Style Example

Imagine a queue consumer responsible for reconciliation jobs. Its shutdown policy may be:

  • finish current item if it is near completion
  • stop taking new work
  • flush any lightweight local metrics
  • exit promptly so deployment can continue

That policy is a design decision. Interruption is only the transport mechanism for the decision.

This is why cancellation and task design cannot be separated.


Failure Modes

Bad long-running task design includes:

  • infinite loops with no interruption check
  • blocking calls that swallow InterruptedException
  • expensive cleanup that never completes
  • doing network or database work in finally without bounded policy

A task that is “correct when uninterrupted” but impossible to stop cleanly is still a poor concurrent design.


When to Exit Immediately vs Gracefully

Not every task should react the same way.

Examples:

  • telemetry poller can often exit immediately
  • durable ledger writer may need to finish a critical section first
  • queue consumer may stop after current item

So the right question is not “should we handle interruption?” The right question is:

  • what is the safe interruption contract for this task?

That contract should be deliberate.


Testing and Debugging Notes

Useful tests:

  1. interrupt the task while idle
  2. interrupt it during blocking wait
  3. interrupt it during active processing
  4. verify cleanup runs
  5. verify shutdown latency stays bounded

If a long-running task has no interruption tests, it is very easy to overestimate how shutdown-ready it actually is.


Decision Guide

For long-running tasks:

  • check interruption at loop boundaries
  • react correctly to InterruptedException
  • decide whether current work must finish or can be abandoned
  • keep cleanup bounded and explicit

Interruption-aware design is really shutdown-aware design.


Key Takeaways

  • interruption only works when long-running tasks are designed to cooperate
  • loops, blocking waits, and cleanup all need a stop policy
  • a task is not production-ready if it cannot be stopped predictably

Next Post

synchronized Methods and Blocks in Java