The context Package: How Go Manages Cancellation Across Goroutines


A goroutine is cheap to start and easy to forget about. That second property is the dangerous one. Every goroutine you launch without a way to stop it is a potential goroutine leak. Once leaked, it holds memory, it holds references, and it keeps running until the process dies. In a long-lived server handling thousands of requests per second, leaked goroutines accumulate until your service falls over.

The standard fix is the done channel pattern. You pass a channel to the goroutine and close it when you want the goroutine to stop. The goroutine selects on that channel alongside whatever work it's doing.

The done channel pattern

Consider a goroutine that generates integers forever and sends them on a channel:

package main
 
import (
	"fmt"
	"time"
)
 
func generate(done <-chan struct{}, nums chan<- int) {
	i := 0
	for {
		select {
		case <-done:
			return
		case nums <- i:
			i++
		}
	}
}
 
func main() {
	done := make(chan struct{})
	nums := make(chan int)
 
	go generate(done, nums)
 
	for i := 0; i < 5; i++ {
		fmt.Println(<-nums)
	}
 
	close(done)
	time.Sleep(10 * time.Millisecond) // give goroutine time to exit
	fmt.Println("done")
}

This works. The calling code closes done, the goroutine sees the signal and returns. No leak. Cox-Buday covers this pattern extensively in chapter 4 of "Concurrency in Go" under "Preventing Goroutine Leaks," and it is the foundation everything else builds on.

But the done channel has three limitations that become obvious as soon as your call graph gets deeper.

First, there is no cancellation reason. When a goroutine receives on the done channel, it knows it should stop, but it doesn't know why. Was the parent canceled? Did a deadline expire? Was there an error? The goroutine has no way to distinguish these cases.

Second, there is no deadline support. If you want to say "stop after 5 seconds," you need to wire up a time.After channel yourself, merge it with the done channel, and handle the coordination manually. Every function that needs deadline awareness reimplements this.

Third, there is no standard way to pass request-scoped data through the call stack. A request ID, an auth token, a trace span — these need to travel through every function in the chain. Without a standard carrier, you either add parameters to every function signature or use package-level globals. Neither scales.

The done channel solves goroutine leaks. It does not solve cancellation as a system-level concern.
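To make the first two limitations concrete, here is a rough sketch of what deadline support looks like when all you have is a done channel. The helper name generateWithTimeout and its hand-rolled reason strings are assumptions made for illustration; nothing in the standard library provides them.

```go
package main

import (
	"fmt"
	"time"
)

// generateWithTimeout is a hypothetical helper: it sends integers on nums
// until done is closed or the timeout elapses, and reports which one stopped
// it. The reason string is hand-rolled because a bare done channel carries none.
func generateWithTimeout(done <-chan struct{}, nums chan<- int, timeout time.Duration) string {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	i := 0
	for {
		select {
		case <-done:
			return "canceled"
		case <-timer.C:
			return "deadline exceeded"
		case nums <- i:
			i++
		}
	}
}

func main() {
	done := make(chan struct{})
	nums := make(chan int)
	reason := make(chan string, 1)

	go func() { reason <- generateWithTimeout(done, nums, 50*time.Millisecond) }()

	for i := 0; i < 3; i++ {
		fmt.Println(<-nums)
	}
	// Nobody closes done here, so the timer is what stops the generator.
	fmt.Println("stopped:", <-reason)
}
```

Every function that wants the same behavior has to repeat the timer, the extra select case, and the reason bookkeeping.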

What the context package provides

The context package was introduced to solve exactly these gaps. Cox-Buday argues that it serves two distinct purposes: it provides an API for canceling branches of a call-graph, and it acts as a data-bag for transporting request-scoped data through the call stack.

Cancellation itself breaks down into three dimensions. A parent goroutine needs to cancel its children — this is the most common case, triggered when a user disconnects or a handler returns. A child goroutine needs to decide it should stop on its own — perhaps it hit an unrecoverable error. And blocking operations need to become preemptible — a database query or HTTP call needs to respect the cancellation signal rather than blocking indefinitely.

The context package addresses all three with a single abstraction. Every function that might block or spawn goroutines accepts a context.Context as its first parameter. That context carries a done signal, an optional deadline, an optional error explaining why it was canceled, and an optional bag of key-value pairs. One parameter replaces the done channel, the deadline timer, and the request-scoped data arguments.

The Context interface

The context.Context interface has four methods:

  • Deadline() (deadline time.Time, ok bool) — returns the time at which this context will be canceled, if a deadline was set. The ok return value is false when no deadline exists.
  • Done() <-chan struct{} — returns a channel that is closed when the context is canceled. This is the done channel pattern, wrapped in an interface.
  • Err() error — returns nil while the context is still active. After Done() is closed, it returns either context.Canceled or context.DeadlineExceeded, depending on what caused the cancellation.
  • Value(key any) any — returns the value associated with a key, or nil if no value is set for that key.

The design of these methods is intentional. Done() returns a receive-only channel because that is the primitive Go already uses for goroutine cancellation signaling. It composes naturally with select statements, just like a done channel does. The Err() method is only non-nil after Done() is closed — this is a deliberate invariant. If you call Err() and get a non-nil result, you are guaranteed that the Done() channel is already closed. You never have to check both and wonder about ordering.

Deadline() lets downstream code make intelligent decisions. If a function knows it has 50 milliseconds left, it can decide not to start a database query that typically takes 200 milliseconds. Without this, the function would start the query, block, and then get canceled mid-flight — wasting resources.

Value() is the most controversial method. It's an any-to-any map, which sounds like a type-safety nightmare. The constraints that make it safe are social, not mechanical, and we'll come back to them.

Constructor functions and the immutable tree

You never construct a Context directly. The package provides factory functions that build a tree of contexts, each derived from a parent.

context.Background() returns an empty context with no deadline, no cancellation, and no values. It is the root of any context tree, typically created in main(), during initialization, in tests, or as the top-level context for incoming requests.

context.TODO() also returns an empty context. It exists as a signal to the reader: "I know I should be threading a context through here, but I haven't plumbed it yet." It's a marker for future work.

context.WithCancel(parent) returns a new context and a cancel function. When you call cancel(), the new context's Done() channel closes. The parent is unaffected.

context.WithDeadline(parent, deadline) returns a new context that will be automatically canceled at the given time, or when the parent is canceled, whichever comes first. It also returns a cancel function you should still call to release resources early if the work finishes before the deadline.

context.WithTimeout(parent, duration) is shorthand for context.WithDeadline(parent, time.Now().Add(duration)).

context.WithValue(parent, key, value) returns a new context carrying the given key-value pair. It does not modify the parent.

The critical design property is immutability. Every With* function creates a new child context. It never mutates the parent. This is what makes context safe to pass across goroutines without synchronization. Multiple goroutines can hold references to the same parent context, each can derive children from it, and none of them can interfere with each other. There are no locks, no atomic operations needed at the user level. The tree structure and immutability guarantee safety.

How cancellation propagates: under the hood

Most developers picture cancellation as the parent pushing a signal down to its children. The real mechanism is more interesting than that. To understand it you need to look at the actual data structures in the standard library.

The cancelCtx struct

Every context created by WithCancel, WithDeadline, or WithTimeout is backed by a cancelCtx. This is what it looks like in the Go 1.21 source (the field layout can differ between releases):

type cancelCtx struct {
    Context
 
    mu       sync.Mutex            // protects following fields
    done     atomic.Value          // of chan struct{}, created lazily, closed by first cancel call
    children map[canceler]struct{} // set to nil by the first cancel call
    err      error                 // set to non-nil by the first cancel call
    cause    error                 // set to non-nil by the first cancel call
}

The embedded Context field points upward to the parent. That is how the tree maintains its shape. The sync.Mutex protects the mutable state: children, err, and cause. The done field is an atomic.Value holding a chan struct{} that gets created lazily the first time someone calls Done(). If nobody ever selects on a context's done channel, that channel is never allocated.

In this version of the source, the err field is a plain error protected by the mutex, not an atomic. When Err() is called, it acquires the lock, reads err, and releases the lock. This is a deliberate choice: Err() is not on the hot path the way Done() is. The done channel is what goroutines block on in select statements, so it needs to be lock-free for reads. Err() is typically called only after the goroutine already knows it has been canceled.
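The cause field in the struct is what the public functions context.WithCancelCause and context.Cause (both added in Go 1.20) read and write. A brief sketch, with errUpstream and cancelWith as hypothetical names, shows how err and cause can differ:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errUpstream is a hypothetical error used as the cancellation cause.
var errUpstream = errors.New("upstream unavailable")

// cancelWith cancels a fresh context with the given cause and returns what
// Err and Cause report afterwards.
func cancelWith(cause error) (err, got error) {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(cause)
	return ctx.Err(), context.Cause(ctx)
}

func main() {
	err, cause := cancelWith(errUpstream)
	fmt.Println(err)   // context canceled
	fmt.Println(cause) // upstream unavailable

	// With a nil cause, Cause falls back to the same error as Err.
	err, cause = cancelWith(nil)
	fmt.Println(err == cause) // true
}
```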

The propagateCancel function

When you call WithCancel(parent), the internal withCancel function creates a new cancelCtx and immediately calls propagateCancel. This function is the heart of the cancellation machinery. It decides how the new child hooks itself to its parent, and it has four distinct code paths.

Path one: parent never cancels. If parent.Done() returns nil, the function returns immediately. There is nothing to hook into. A parent whose Done() is nil will never be canceled, so there is no signal to propagate. This is the path taken by context.Background() and context.TODO(), both of which return nil from Done(). It means that the first WithCancel in a chain derived from Background pays zero overhead for propagation toward the root.

Path two: parent is a cancelCtx. The function calls parentCancelCtx, which tries to unwrap the parent to find the innermost *cancelCtx. This unwrapping walks through layers of valueCtx and other standard wrappers using the internal Value mechanism with a package-private sentinel key &cancelCtxKey. When a cancelCtx receives a Value call with that key, it returns itself. The function then verifies that the Done() channel of the parent matches the one from the unwrapped cancelCtx, to guard against custom implementations that override Done().

If the unwrapping succeeds, the child is added directly to the parent's children map under the parent's mutex. No goroutine is created. This is the fast path and the one taken by the vast majority of contexts in real programs:

if p, ok := parentCancelCtx(parent); ok {
    p.mu.Lock()
    if p.err != nil {
        // parent has already been canceled
        child.cancel(false, p.err, p.cause)
    } else {
        if p.children == nil {
            p.children = make(map[canceler]struct{})
        }
        p.children[child] = struct{}{}
    }
    p.mu.Unlock()
    return
}

Note that parentCancelCtx can traverse through valueCtx wrappers transparently. If you create a context with WithCancel, wrap it with WithValue, and then call WithCancel on the result, the inner cancelCtx is found and the new child hooks directly into it. No goroutine bridge. This is why wrapping contexts with WithValue is cheap: it does not break the fast path for subsequent cancellation hooks.

Path three: parent implements AfterFunc. Starting in Go 1.21, propagateCancel checks whether the parent implements the afterFuncer interface, which has a single method AfterFunc(func()) func() bool. If it does, the function registers a callback with the parent instead of spawning a goroutine. The callback cancels the child when the parent is done. The parent is then wrapped in a stopCtx that holds the stop function so the callback can be deregistered when the child is canceled or removed. This path exists specifically for third-party context implementations that know how to schedule callbacks efficiently without goroutines.

Path four: the goroutine bridge. If none of the above paths apply, propagateCancel falls back to spawning a goroutine that selects on both the parent's and the child's Done() channels:

go func() {
    select {
    case <-parent.Done():
        child.cancel(false, parent.Err(), Cause(parent))
    case <-child.Done():
    }
}()

This is the expensive path. A live goroutine sits in memory for the entire lifetime of the child context, doing nothing but waiting. This is why custom context implementations that do not expose a cancelCtx internally and do not implement AfterFunc carry a real cost. In hot paths where you create thousands of derived contexts per second, this goroutine bridge adds up. The standard library goes to significant lengths to avoid it.
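To see what kind of parent ends up on this path, consider a hand-written context that owns its done channel and has no AfterFunc method. The type manualCtx and the function bridged are hypothetical, written for this sketch. It prints the change in goroutine count around WithCancel; the exact number depends on the Go version and on whatever else the program is running, so treat it as an observation rather than a guarantee.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// manualCtx is a hypothetical hand-written Context. It owns its done channel,
// so the context package cannot unwrap it to a *cancelCtx, and it does not
// implement AfterFunc.
type manualCtx struct {
	done chan struct{}
	err  error
}

func (m *manualCtx) Deadline() (time.Time, bool) { return time.Time{}, false }
func (m *manualCtx) Done() <-chan struct{}       { return m.done }
func (m *manualCtx) Value(key any) any           { return nil }

// Err returns nil until done is closed; stop writes err before closing.
func (m *manualCtx) Err() error {
	select {
	case <-m.done:
		return m.err
	default:
		return nil
	}
}

func (m *manualCtx) stop(err error) {
	m.err = err
	close(m.done)
}

// bridged derives a child from a manualCtx parent, stops the parent, and
// returns the child's error once the cancellation has propagated.
func bridged() error {
	parent := &manualCtx{done: make(chan struct{})}

	before := runtime.NumGoroutine()
	child, cancel := context.WithCancel(parent)
	defer cancel()
	fmt.Println("goroutines added by WithCancel:", runtime.NumGoroutine()-before)

	parent.stop(context.Canceled)
	<-child.Done()
	return child.Err()
}

func main() {
	fmt.Println(bridged())
}
```

Cancellation still reaches the child, so correctness is preserved; what changes is the cost of getting it there.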

The cancel function

When cancellation actually fires, the cancel method on cancelCtx executes a precise sequence under the mutex.

First, it checks if err is already non-nil. If so, this context was already canceled and the function returns immediately. Cancellation is idempotent.

Then it sets err and cause. After that, it handles the done channel: if the channel was never created (nobody ever called Done() on this context), it stores the package-level closedchan, a pre-closed channel that is reused across all contexts. If the channel exists, it closes it. This is the broadcast mechanism. Closing a channel in Go unblocks every goroutine that is receiving on it. The context code does not keep a list of listeners or send to each one. It issues a single close, and the runtime wakes every goroutine blocked in a select on <-ctx.Done(), whether there are ten or ten thousand of them.

func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
    if err == nil {
        panic("context: internal error: missing cancel error")
    }
    if cause == nil {
        cause = err
    }
    c.mu.Lock()
    if c.err != nil {
        c.mu.Unlock()
        return // already canceled
    }
    c.err = err
    c.cause = cause
    d, _ := c.done.Load().(chan struct{})
    if d == nil {
        c.done.Store(closedchan)
    } else {
        close(d)
    }
    for child := range c.children {
        child.cancel(false, err, cause)
    }
    c.children = nil
    c.mu.Unlock()
 
    if removeFromParent {
        removeChild(c.Context, c)
    }
}

After closing the channel, the function iterates over the children map and calls cancel on each child recursively. Note that the child's lock is acquired while the parent's lock is still held. This is safe because locks are always acquired in parent-to-child order, never the reverse. The children map is then set to nil, releasing all references to child contexts and allowing the GC to collect them.

Finally, if removeFromParent is true, the context removes itself from its parent's children map. This is the cleanup step that prevents the parent from holding a reference to a child that is already done.
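The broadcast and the idempotence are both easy to observe from user code. In the sketch below, wakeAll is a hypothetical helper that starts n waiters and cancels once.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// wakeAll starts n goroutines that wait on ctx.Done(), cancels once, and
// returns how many of them observed the cancellation.
func wakeAll(n int) int {
	ctx, cancel := context.WithCancel(context.Background())

	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		woken int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done()
			mu.Lock()
			woken++
			mu.Unlock()
		}()
	}

	cancel() // one call closes the channel for every waiter
	cancel() // calling it again is a no-op
	wg.Wait()
	return woken
}

func main() {
	fmt.Println(wakeAll(1000)) // 1000
}
```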

The mental model

Pointers go upward through the embedded Context field, keeping the chain alive and enabling Value lookups to walk toward the root. The children map points downward, enabling cancellation to cascade from parent to child. The mutex protects mutations to the tree structure. And the closed channel is the broadcast signal: a single close that wakes every listener without the context code tracking who is listening.

There is an asymmetry that matters: canceling a child does not cancel the parent. If you have a parent context with three child branches, canceling one branch leaves the other two and the parent completely unaffected. This lets you model independent sub-tasks that can fail or be canceled individually.

package main
 
import (
	"context"
	"fmt"
	"time"
)
 
func worker(ctx context.Context, name string) {
	for {
		select {
		case <-ctx.Done():
			fmt.Printf("%s stopped: %v\n", name, ctx.Err())
			return
		default:
			fmt.Printf("%s working\n", name)
			time.Sleep(200 * time.Millisecond)
		}
	}
}
 
func main() {
	parent, cancelParent := context.WithCancel(context.Background())
 
	childA, cancelA := context.WithCancel(parent)

	childB, cancelB := context.WithCancel(parent)
	defer cancelB() // always keep the cancel function; go vet flags a discarded one
 
	go worker(childA, "A")
	go worker(childB, "B")
 
	time.Sleep(500 * time.Millisecond)
 
	// Cancel only child A. B keeps running.
	cancelA()
	time.Sleep(300 * time.Millisecond)
 
	// Cancel parent. B stops because it was derived from parent.
	cancelParent()
	time.Sleep(300 * time.Millisecond)
}

Running this prints A and B both working, then A stops while B continues, then B stops when the parent is canceled. The tree structure determines the blast radius of cancellation.

The progression: from done channel to full context

To see how context replaces and extends the done channel, consider refactoring the integer generator from earlier in three steps.

Step one replaces the done channel with a context:

package main
 
import (
	"context"
	"fmt"
)
 
func generate(ctx context.Context, nums chan<- int) {
	i := 0
	for {
		select {
		case <-ctx.Done():
			return
		case nums <- i:
			i++
		}
	}
}
 
func main() {
	ctx, cancel := context.WithCancel(context.Background())
	nums := make(chan int)
 
	go generate(ctx, nums)
 
	for i := 0; i < 5; i++ {
		fmt.Println(<-nums)
	}
 
	cancel()
}

The shape of the code is identical. The select statement looks the same. The only change is that the done channel is now ctx.Done() and closing is replaced by calling cancel(). But we have gained something: the goroutine can now call ctx.Err() to understand why it was stopped.

Step two adds a deadline. Suppose we want the generator to stop after 1 second regardless:

package main
 
import (
	"context"
	"fmt"
	"time"
)
 
func generate(ctx context.Context, nums chan<- int) {
	defer close(nums) // ends the range loop in main once the generator returns
	i := 0
	for {
		select {
		case <-ctx.Done():
			fmt.Printf("generator stopped: %v\n", ctx.Err())
			return
		case nums <- i:
			i++
			time.Sleep(300 * time.Millisecond)
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancel()

	nums := make(chan int)
	go generate(ctx, nums)

	for n := range nums {
		fmt.Println(n)
	}
}

Inside the generator, the select loop kept its shape. The additions are a log line, a sleep to slow the output down, and a deferred close(nums) so that the range loop in main() can end once the generator returns; without that close, main() would block forever on a channel nobody writes to. The change that matters is in main(), which swaps WithCancel for WithTimeout. The generator stops shortly after the one-second deadline passes and reports context.DeadlineExceeded as the reason. With a raw done channel, you would have needed to wire up a separate timer, merge signals, and track the reason yourself.

Step three adds request-scoped data. Suppose each generation run has a request ID for tracing:

package main
 
import (
	"context"
	"fmt"
	"time"
)
 
type ctxKey struct{}
 
func generate(ctx context.Context, nums chan<- int) {
	defer close(nums) // ends the range loop in main once the generator returns
	reqID, _ := ctx.Value(ctxKey{}).(string)
	i := 0
	for {
		select {
		case <-ctx.Done():
			fmt.Printf("[%s] generator stopped: %v\n", reqID, ctx.Err())
			return
		case nums <- i:
			i++
			time.Sleep(300 * time.Millisecond)
		}
	}
}
 
func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancel()
 
	ctx = context.WithValue(ctx, ctxKey{}, "req-abc-123")
 
	nums := make(chan int)
	go generate(ctx, nums)
 
	for n := range nums {
		fmt.Println(n)
	}
}

Again, the shape of the generator is barely changed. A single line extracts the request ID from the context. The caller added it with WithValue. No new function parameters. No globals. The value rides along the same context that carries the cancellation signal and the deadline.

This progression illustrates why context works so well: each capability layers on without altering the structure of the functions that consume the context.

Usage rules

The Go team and Cox-Buday converge on a set of rules that have become idiomatic.

Context should always be the first parameter to a function, and it should be named ctx. This is more than a style preference; it is a convention that the entire ecosystem relies on. Linters check for it (revive's context-as-argument rule, for example). When you see ctx context.Context as the first parameter, you immediately know this function is cancellation-aware.

Never store a context in a struct. A context is tied to the lifetime of a specific operation — a request, a task, a pipeline stage. Storing it in a struct implies the struct's lifetime and the context's lifetime are the same, which they almost never are. Pass it explicitly through function calls.

Never pass a nil context. If you don't have a context and aren't sure which one to use, pass context.TODO(). It works identically to context.Background() at runtime, but it signals to future maintainers that the context plumbing is incomplete.

These rules exist because context is infrastructure. Like error handling, it needs to be consistent across the entire codebase to provide its guarantees. A single function that ignores its context breaks the cancellation chain for everything downstream.

The WithValue typing gotcha

The Value() method accepts and returns any. Without discipline, this becomes a stringly-typed mess where two packages accidentally use the same key and overwrite each other's values.

The standard defense is defining an unexported type for your keys:

package middleware

import "context"

// ctxKey is unexported so that no other package can construct this key.
type ctxKey struct{}
 
var requestIDKey = ctxKey{}
 
func WithRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey, id)
}
 
func RequestID(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey).(string)
	return id
}

Because ctxKey is unexported, no other package can create a value of that type. Key collisions become structurally impossible, not just unlikely. Each package that stores values in context defines its own key type and provides exported accessor functions. The users of the package never interact with the raw Value() method.

Cox-Buday is clear about what belongs in context values and what doesn't. Request IDs, authentication tokens, trace spans, correlation identifiers — these are data that crosses API and process boundaries, that is inherently request-scoped, and that every layer of the stack might need. These belong in context.

Optional function parameters do not belong in context. If a function behaves differently based on a value, that value should be an explicit parameter with a real type. Hiding behavioral configuration inside context.Value makes the function's contract invisible — you can't see it in the signature, you can't enforce it at compile time, and you can't discover it without reading the implementation.

The values stored in a context should also be safe for concurrent use by multiple goroutines. Since the context itself is passed across goroutine boundaries, anything it carries must be immutable or otherwise thread-safe. In practice, this means strings, integers, and small immutable structs. Not pointers to mutable state.

Context in Go's concurrency architecture

Cox-Buday's chapter on concurrency patterns builds a progression: goroutines and channels provide the primitives, the done channel prevents leaks, pipelines structure data flow, and fan-in/fan-out provides parallelism. Context is the piece that makes all of these patterns manageable in production.

Without context, a pipeline of five stages where each fans out to ten goroutines means threading a done channel through fifty goroutines by hand, with manually wired deadline logic and no standard way to pass trace data. With context, you create one context at the pipeline's entry point, derive children at each stage, and every goroutine in the entire graph responds to a single cancellation signal. The deadline propagates automatically. The request ID is available everywhere without threading it through every function signature.

This is why the context-aware entry points in the standard library's I/O packages take context.Context as their first parameter: QueryContext in database/sql, NewRequestWithContext in net/http, CommandContext in os/exec, DialContext on net.Dialer. It is not because those packages need the value-bag functionality. It is because any operation that blocks needs to be preemptible, and context provides the universal mechanism for preemption.

The deeper insight from the book is that context transforms cancellation from a local concern into a systemic one. A done channel cancels only the goroutines you remembered to hand it to. A context cancels an entire branch of the call graph: every goroutine spawned by a handler, every database query, every outbound HTTP call, every pipeline stage. When a client disconnects from your HTTP server, r.Context() is canceled, and that cancellation flows through every layer of your application that accepted the context. As long as each layer honors the signal, no goroutines leak, no database connections sit idle waiting for results nobody will read, and no outbound RPCs continue burning upstream resources.

That systemic property — not the convenience of having four methods instead of a raw channel — is what makes context essential to Go's concurrency model. The done channel was the insight. The context package was the engineering that made it scale.