Kotlin coroutines: structured concurrency
Core principle
A well-structured coroutine is a self-contained unit of asynchronous work — single entry, single exit, scoped to a lifecycle known at the call site.
Scopes should usually be tied to the caller's lifecycle, not stored as a property on the callee. A stored
is a strong review signal: the class must prove it owns cancellation, error reporting, restart behavior, and lifecycle. Most repositories, managers, use cases, and data sources cannot prove that, so they should expose
APIs instead.
The fix is almost always the same:
make the API and let the caller own the scope.
When to use this skill
You're writing or reviewing Kotlin code and you see any of these:
- A class with
private val scope: CoroutineScope
(constructor param stored as a property)
- An
init { scope.launch { ... } }
block
- A non-suspending public function whose body is
- in suspend-capable application code, or in tests where should apply
runCatching { suspendCall() }
or a on / around a call without rethrowing
- A
catch (e: CancellationException)
(or equivalent) around suspension that does not rethrow
The silent-cancellation bug
The reason an unowned
property is so dangerous:
once a scope is cancelled, every future on it silently completes as cancelled — no exception, no log, nothing. The work just doesn't happen. This is one of the hardest coroutine bugs to diagnose, and it appears when a class holds a long-lived reference to a lifecycle it does not own.
If APIs are
, this can't happen: the caller's scope is either alive (work runs) or the call site cancels (the caller knows).
Anti-patterns and fixes
1. CoroutineScope stored as a property
kotlin
// ❌ BAD
@Inject
class UserRepository(
private val scope: CoroutineScope,
private val api: UserApi,
) {
fun refresh() {
scope.launch { _state.value = api.fetchUser() }
}
}
// ✅ GOOD
@Inject
class UserRepository(
private val api: UserApi,
) {
suspend fun refresh(): User = api.fetchUser()
}
The repository no longer needs to know about coroutines at all. The caller (a ViewModel, a use case) decides on what scope, with what error handling, with what cancellation semantics.
2. init-block launches
kotlin
// ❌ BAD: construction-time side effect, unbounded work
class UserSession(private val scope: CoroutineScope, private val api: Api) {
init { scope.launch { _user.value = api.load() } }
}
The constructor returns immediately. The caller can't
the load, can't see errors, can't cancel. The class is "alive" but its state is undefined.
kotlin
// ✅ GOOD: explicit bootstrap, caller owns the suspension
class UserSession(private val api: Api) {
private var _user: User? = null
val user: User get() = checkNotNull(_user) { "Call init() first" }
suspend fun init() { _user = api.load() }
}
3. Fire-and-forget from non-UI classes
A non-suspending public function on a non-UI class (repository, manager, use case, data source) that launches into a class-owned scope. The caller gets no result, no error, no cancellation, and no guarantee the work ever ran.
kotlin
// ❌ BAD — repository with stored scope and fire-and-forget public API
class AnalyticsClient(private val scope: CoroutineScope, private val api: Api) {
fun track(event: Event) {
scope.launch { api.send(event) } // caller has no idea what happens
}
fun signOut() {
scope.launch { api.signOut() } // silent failure if scope cancelled
}
}
kotlin
// ✅ GOOD
class AnalyticsClient(private val api: Api) {
suspend fun track(event: Event) = api.send(event)
suspend fun signOut() = api.signOut()
}
Carve-out: the UI ↔ state-holder boundary
UI frameworks are non-suspending. A Composable's
, a Fragment's
, an Activity's
— none can
. The state holder (ViewModel, Decompose Component, feature model, etc. — anything whose role is to absorb UI events and hold UI state)
is the boundary that translates one-shot UI events into asynchronous work bound to the UI lifecycle. That's its job.
kotlin
// ✅ GOOD — state holder absorbs a non-suspending UI event onto its scope
class FavouritesViewModel(private val repo: FavouritesRepository) : ViewModel() {
fun onToggleFavourite(item: Item) {
viewModelScope.launch { repo.toggleFavourite(item) }
}
}
// in Compose:
ListItem(onClick = { viewModel.onToggleFavourite(item) })
This is not the fire-and-forget anti-pattern. All three conditions must hold:
- State holder for a UI surface — a ViewModel, Decompose Component, feature model, or equivalent UI state holder. Not a repository, manager, use case, or data source.
- Lifecycle-bound scope — , a Component's that's cancelled on destroy, a Composable's . Not , not an injected long-lived scope, not an ad-hoc .
- Caller really is a UI event — Composable callback, key handler, lifecycle hook. Not another business-logic class calling through the state holder.
The repository / use case / data source layers underneath still expose
APIs. The state holder is the
only layer where the non-suspending → suspending translation belongs.
"It feels like a state holder" isn't enough. The question is "does the UI directly bind to this?" If no, the carve-out doesn't apply.
4. Stored scopes that aren't injected
The same anti-pattern, without an injected scope:
kotlin
// ❌ BAD — same problem, scope is constructed in-class instead of injected
class FooManager {
private val scope = MainScope()
private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
}
Lifecycle is now owned by nothing and lives forever. Replace with
APIs.
The same is true if the instantiation is nested inside a function body —
fun foo() { CoroutineScope(...).launch { … } }
is just a stored scope with extra steps. Each call leaks a new uncancellable scope; bundling it into a
property doesn't fix the underlying issue (the scope shouldn't exist at all).
5. DI-bound singletons / initializers that launch
A specific pattern that is hard to spot: a DI-bound class (
,
, an
) launches a coroutine from its constructor /
block /
. The launched work then has:
- A non-deterministic start time — whenever the graph realizes the binding. Cold-start ordering is invisible.
- No observable lifecycle. Nothing else in the codebase can see whether it's running or has crashed.
- No / restart path. If upstream enters a bad state, the loop is uncancellable.
- No calling code to grep for. Readers can't find "who starts this and when".
§1 says scopes should be tied to the caller's lifecycle. The DI-bound variant violates this indirectly: the scope may be injected, but the launch is hidden inside construction — same effect, harder to see.
kotlin
// ❌ BAD — singleton boots work as a side effect of being constructed
@SingleIn(AppScope::class)
@Inject
class TokenRefresher(
@ForScope(AppScope::class) private val scope: CoroutineScope,
private val auth: AuthService,
) {
init {
scope.launch {
while (isActive) {
delay(5.minutes)
auth.refreshIfNeeded()
}
}
}
}
// ❌ ALSO BAD — Initializer.initialize() that *launches*, not just registers
class TokenInvalidatorInitializer @Inject constructor(
@ForScope(AppScope::class) private val scope: CoroutineScope,
private val store: AuthStore,
private val invalidator: TokenInvalidator,
) : Initializer {
override fun initialize() {
scope.launch { store.tokenChanges.collect { invalidator.invalidate() } }
}
}
Both look like "application-scoped singletons", but the
When NOT to apply carve-out is
not permission to launch from
/
. It's permission for a singleton to own a scope when its API is suspending.
First ask: does this background-loop class need to exist at all?
Most background-loop classes exist only because no one inverted the observation. Three answers, in order of preference:
Pattern 1 — invert into the consumer. The class observes state forever to react when it changes. But someone mutates the state — sign-out flow, profile switch, flag-update handler. That mutation site is already in a coroutine context and is the natural place to do the work directly.
kotlin
// ✅ GOOD — no background loop, no scope, no class. The mutation site does the work.
class Authenticator(
private val authStore: AuthStore,
private val tokenInvalidator: TokenInvalidator,
) {
suspend fun signOut() {
authStore.clearTokens()
tokenInvalidator.invalidate() // direct call at the mutation site
}
}
The background-loop class is deleted. The work happens where the state changes.
When this applies: the consumer of the state has a clear lifecycle (a use case, an Authenticator, a service handler) and can perform the reaction inline.
Pattern 2 — scheduled work. Genuinely periodic or deferred. Use WorkManager / BGTaskScheduler. The enqueue is one-shot; make it suspending and call it once from an orchestrator that already runs at startup.
Pattern 3 — explicit named launch site. Sometimes the consumer is a synchronous API with no observable lifecycle (e.g., OpenTelemetry's
Sampler.shouldSample(...)
, an AIDL stub fanout, a broadcast receiver bridge). The observation has to live somewhere coroutine-aware, but it must live at an
explicit named call site — not in the class's own
.
kotlin
// ✅ GOOD — work is named; an explicit call site owns the launch
@SingleIn(AppScope::class)
class OtelConfigurableSampler(...) : Sampler {
@Volatile private var delegate: Sampler = ...
suspend fun observeRate(featureFlags: FeatureFlags) {
featureFlags.observe(OTEL_SAMPLING_RATE).collect { rate ->
delegate = Sampler.traceIdRatioBased(rate.coerceIn(0.0, 1.0))
}
}
override fun shouldSample(...) = delegate.shouldSample(...)
}
// wired explicitly at the OTel SDK init module:
applicationScope.launch { otelSampler.observeRate(featureFlags) }
When this applies: the consumer is a synchronous API that calls into you with no observable lifecycle. The launch can't be invertible, but it must still be visible at a named call site.
Test for which pattern fits
"Is the consumer's lifecycle observable to me?"
- Yes, and they're already in a coroutine context → Pattern 1. Push the subscription into them; delete the background-loop class.
- The work is periodic / deferred → Pattern 2. Suspend enqueue called once.
- No, they're a synchronous API with no observable lifecycle → Pattern 3. Explicit launch site, not .
If a fourth answer seems to fit — e.g., "I want a
interface that launches everything for me" — that's the same anti-pattern with an extra layer of abstraction. The whole point is that launches be
visible; auto-discovery by interface defeats it.
Initializers are still fine — if they only register
The
pattern is correct when
registers a listener or hook. The bug is when
launches a coroutine.
kotlin
// ✅ GOOD Initializer — registers a contributor, doesn't launch
class FavouritesContributorInitializer @Inject constructor(
private val registry: ContributorRegistry,
private val favouritesContributor: FavouritesContributor,
) : Initializer {
override fun initialize() {
registry.register(favouritesContributor)
}
}
must not a coroutine. If yours does, it's a Pattern 1/2/3 candidate.
Diagnostic for review
- Where is the start moment defined? If "wherever DI realizes me", bad.
- Who can observe whether the work is running? If "no one", bad.
- Who can stop or restart it? If "no one", bad.
- Can a reader grep for the launch site? If no, bad.
If the answers are "the consumer / the orchestrator / the named call site" — you're good.
6. Swallowing
A
clause around a
call that matches
— directly, or through
/
— and doesn't rethrow usually turns cancellation into silent success. The parent coroutine thinks the child finished; the child keeps running (or its side effects do); the cancellation contract is broken.
Same failure shape as §1's stored-scope bug, viewed from the other end: §1 hides the work from the caller's lifecycle; this hides cancellation from the work.
kotlin
// ❌ BAD — catches CancellationException, never rethrows
suspend fun fetch() {
try {
api.load()
} catch (e: Exception) { // matches CancellationException too
logger.warn("load failed", e)
}
}
// ❌ ALSO BAD — runCatching has the same problem
suspend fun fetch() {
runCatching { api.load() }
.onFailure { logger.warn("load failed", it) }
}
The acceptable shapes:
kotlin
// ✅ Separate catch first
try { api.load() }
catch (e: CancellationException) { throw e }
catch (e: Exception) { logger.warn("load failed", e) }
// ✅ Conditional rethrow inside the broad catch
try { api.load() }
catch (e: Exception) {
if (e is CancellationException) throw e
logger.warn("load failed", e)
}
// ✅ ensureActive() — good when the catch handles ordinary failures and you only need
// to rethrow if the current coroutine is cancelled
try { api.load() }
catch (e: Exception) {
currentCoroutineContext().ensureActive()
logger.warn("load failed", e)
}
// ✅ runCatching with explicit guard
runCatching { api.load() }
.onFailure {
if (it is CancellationException) throw it
logger.warn("load failed", it)
}
// ✅ runCatching terminated with getOrThrow (cancellation flows back out)
runCatching { api.load() }.getOrThrow()
The trigger is "a suspend call inside the
", not "the enclosing function is declared
". This applies inside any suspending body —
, a
lambda, a Flow
, etc.
The common carve-out is an intentionally local timeout: catching
TimeoutCancellationException
from your own
and converting it to a domain result can be correct. Keep that catch narrow and close to the timeout. Do not use it as permission to swallow arbitrary cancellation.
Catching a non-cancellation subtype (
, your own exception types) is fine — they don't extend
.
7.
parks the current thread until the lambda finishes. Inside suspend-capable or lifecycle-scoped application paths it is wrong: a thread that meant to be async is now blocked, structured concurrency is broken, and any cancellation upstream has no effect. It is the "callee makes a structural decision for the caller" anti-pattern at its most direct.
kotlin
// ❌ BAD — bridging to suspend by blocking the calling thread
fun saveUser(user: User) {
runBlocking { repository.save(user) }
}
Three fixes, by context:
Suspend-capable application code — make the function
:
kotlin
// ✅ GOOD
suspend fun saveUser(user: User) = repository.save(user)
If the immediate caller can't suspend either (a non-suspending UI callback, a
hook), use the existing lifecycle-bound scope at the boundary — see §3's UI ↔ state-holder carve-out. The fix is at the boundary, not inside
.
Legitimate blocking boundaries exist:
in a CLI tool, Java interop APIs that must return synchronously, framework callbacks with no suspending alternative, and migration shims. Keep
at that outer boundary, keep the body small, and call suspending code immediately.
kotlin
// ❌ BAD — real time, slow tests, no virtual delay
@Test fun loadsUser() = runBlocking {
assertThat(repository.load().name).isEqualTo("Alice")
}
// ✅ GOOD
@Test fun loadsUser() = runTest {
assertThat(repository.load().name).isEqualTo("Alice")
}
gives you virtual time (
returns immediately),
integration, and proper coroutine cleanup. Real-time
in tests makes them slow and flaky.
carve-out — Android's
methods (
,
,
,
,
,
) are synchronous from outside the process. There is no way to suspend them. Inside
member functions of a
subclass (direct or indirect — not companion objects),
is the unavoidable bridge. Keep the body as short as possible and call into suspending code immediately:
kotlin
// ✅ Acceptable in ContentProvider members only
class MyProvider : ContentProvider() {
override fun query(...): Cursor? = runBlocking { dao.query(...) }
}
This carve-out is for
android.content.ContentProvider
subclasses
only. "It's like a
" doesn't apply, and a
in a
's companion object is still a regular violation — the helper isn't part of the framework's synchronous surface.
Quick reference
| Symptom | Anti-pattern | Fix |
|---|
Class has private val scope: CoroutineScope
| Stored scope on the callee | Remove. Make public APIs . |
init { scope.launch { ... } }
| Construction-time launch | Move to / |
fun foo() { scope.launch { ... } }
on a repository/manager/use case | Fire-and-forget from non-UI class | , let UI state holder pick the scope |
fun onClick() { viewModelScope.launch { ... } }
on a state holder, called from UI | UI ↔ state-holder boundary — fine | Keep as-is (see §3 carve-out) |
private val scope = MainScope()
| Internally-constructed stored scope | Same — remove, make APIs |
@SingleIn(AppScope) class X(scope) { init { scope.launch { … } } }
| DI-bound opaque launch (§5) | Expose , launch from startup orchestrator |
class Y : Initializer { override fun initialize() { scope.launch { … } } }
| Initializer that launches, not registers (§5) | Same — , orchestrator owns lifecycle |
try { suspendCall() } catch (e: Exception|Throwable|CancellationException) { … }
with no rethrow | Swallowed cancellation (§6) | Prefer catch (e: CancellationException) { throw e }
; use only when that matches the intent |
runCatching { suspendCall() }.onFailure { … }
with no cancellation guard | Same shape as above (§6) | Add if (it is CancellationException) throw it
, or terminate with |
| inside suspend-capable app code | Thread-blocking bridge (§7) | Make caller ; or use a lifecycle scope at the boundary |
| in a test | Same — real-time bridging (§7) | Use |
| inside a //… member | Carve-out (§7) | Acceptable; keep the body minimal |
Refactoring guidance
Removing an existing offender:
- Start at the leaf. Pick the class farthest from any UI — usually a repository or data source. Its public surface should be the easiest to convert.
- Convert public functions to one at a time. The compiler will surface every caller.
- At each caller, choose the scope deliberately: , , , or an explicit job. This is the choice that was missing before.
- Delete the constructor parameter once nothing uses it. Remove the injection binding.
Don't try to fix every class in one MR. Removing an anti-pattern is incremental work.
When NOT to apply
- UI state holders absorbing UI events. A ViewModel/Component/feature model with
fun onClick(...) { viewModelScope.launch { ... } }
is correct — that's the boundary the framework needs. See §3 carve-out.
- Lifecycle owners with explicit cancellation and error policy. Actors/services, app infrastructure, or application-scoped singletons may own a scope when they expose clear //restart behavior or otherwise map directly to an application lifecycle. Inject
Application.applicationScope
explicitly rather than creating one ad-hoc. This is not permission to launch from / — see §5.
- Already-suspending APIs don't need any of this work.
- Tests sometimes use as a deliberate ambient scope — that's a different pattern with explicit virtual-time control.
Red flags during review
These thoughts mean the anti-pattern is back:
| Thought | Reality |
|---|
"I'll just add a CoroutineExceptionHandler
to the scope" | The problem isn't error handling. The problem is the scope shouldn't exist. |
| "I need to launch from so the data's ready when consumers arrive" | Consumers reading state that isn't ready is the bug. Use phasing. |
| "The caller doesn't want to deal with " | Then the caller chooses fire-and-forget at their scope. Don't decide for them. |
| "It's just a small fire-and-forget call" | Silent cancellation makes every fire-and-forget a potential silent failure. |
| "We caught and logged the exception, so we're fine" | Did the catch rethrow ? If no, the coroutine is silently un-cancelled. (§6) |
| "It's just one , in a non-critical path" | Every asserts the caller has no async option. If they do, it's the wrong primitive. (§7) |
| "Tests are simpler with " | They run in real time, can't fast-forward , and lose semantics. Use . (§7) |
Related
kotlin-flow-state-event-modeling
— , , , , one-shot events, and related modeling.