
Introduction

If you’ve done any React work, you might have come across bulletproof-react. It’s an opinionated guide that gives teams a shared understanding of how to structure applications, which patterns to reach for, and where different kinds of logic belong. It doesn’t invent new ideas so much as curate the ones that have proven themselves in production, then organize them into something coherent.

I wanted the same thing for Rust and Axum. So here we are.

Axum is deliberately minimal. You get a router, extractors, and deep integration with Tower, and then it gets out of the way. That restraint is a genuine strength, but it also means two Axum projects built by two different teams can look completely different. One team puts SQL directly in their handlers. Another builds an elaborate hexagonal architecture with five workspace crates. Neither is wrong, exactly, but the inconsistency makes it harder to onboard new people, harder to maintain things over time, and harder to figure out where a particular piece of logic should live. I’ve seen this play out enough times to know it’s a real problem.

What I’ve tried to do here is provide a curated set of answers to those structural questions. The thinking is informed by hexagonal and clean architecture literature, and by the practical experience I’ve picked up from dozens of blog posts, conference talks, and open-source projects across the Rust web community.

What this guide covers

The book is organized into six sections. Let me walk you through them quickly so you know what’s where.

Architecture lays the foundation. We cover how to organize your files and modules, how to think about layers and dependency direction, and how to model your domain using Rust’s type system so that invalid states become unrepresentable. This is the stuff I wish someone had shown me when I started my first serious Axum project.

Core Patterns tackles the problems every web application has to solve: error handling, database access, application state, and configuration management. You’ll find concrete patterns with real code that you can adapt for your own projects.

HTTP Layer is where we get into the Axum-specific parts of the stack: routing, handlers, middleware composition, request validation, authentication, and API design conventions like pagination and versioning.

Production covers what you need to actually ship and operate the thing: structured logging and tracing, security hardening, testing strategies, performance considerations, and deployment with Docker and graceful shutdown. Because writing code is only half the job, right?

Advanced goes beyond the single-server HTTP model into patterns that show up in production systems with real complexity. We dig into the sharp edges of async Rust (cancellation safety, select! semantics), inter-component communication with channels, coordinated shutdown of multi-subsystem applications, background job management, adding a gRPC surface with Tonic, and compile-time state machines using the typestate pattern. Some of this stuff took me a while to figure out on my own, so I’m hoping it saves you some pain.

Reference gives you a quick-lookup crate table, a catalog of common anti-patterns to watch out for, and a curated list of books, articles, and example repositories if you want to go deeper.

Guiding principles

A few values run through everything I recommend in this book. They’re not original ideas (none of this is), but they’re the ones I keep coming back to.

Keep handlers thin. A handler’s job is to pull data out of the request, call into a service, and turn the result into a response. That’s it. If your handler is doing validation, database queries, and business logic all in one function, it’s doing too much. I’ve seen enough 200-line handlers to know where that road ends. Push that work into the appropriate layer.

Let the type system do the work. Rust’s type system is unusually powerful for a systems language, and in my experience it really shines in web applications. Newtypes that enforce invariants at construction time, Result types that make error paths explicit, trait-based abstractions that decouple your layers from each other. These catch bugs at compile time that would otherwise show up in production at 2am. Worth the upfront effort every single time.
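To make the newtype idea concrete, here is a minimal sketch: an `Email` that can only be constructed through a validating `parse` function, so any `Email` value in the program is known to be well-formed. The validation rule here is deliberately simplistic and the names are illustrative — a real application would use a proper validator.

```rust
/// A validated email address. The inner `String` is private, so the
/// only way to obtain an `Email` is through `Email::parse`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Email(String);

#[derive(Debug, PartialEq, Eq)]
pub struct EmailError(String);

impl Email {
    /// Validate and normalize a raw string into an `Email`.
    pub fn parse(raw: &str) -> Result<Self, EmailError> {
        let trimmed = raw.trim();
        // Hypothetical minimal check: one '@' with a non-empty local
        // part and a dotted domain.
        match trimmed.split_once('@') {
            Some((local, domain)) if !local.is_empty() && domain.contains('.') => {
                Ok(Email(trimmed.to_lowercase()))
            }
            _ => Err(EmailError(format!("invalid email: {raw}"))),
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Any function that accepts an `Email` parameter can now skip re-validating it; the invariant was established once, at the boundary.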

Separate what changes for different reasons. Your HTTP routing might change because you’re adding a new endpoint. Your database queries might change because you’re optimizing a slow path. Your business rules might change because the product requirements evolved. When these concerns live in different modules with clear boundaries, you can change one without worrying about the others. It sounds obvious when you say it out loud, but it’s surprisingly easy to let things bleed together.

Start simple, evolve deliberately. Not every project needs a hexagonal architecture with five workspace crates. A small API with a handful of endpoints can get by with a flat module structure and direct SQLx calls. What matters is understanding the principles behind the more elaborate architectures so you can move toward them incrementally when the complexity of your project actually demands it, rather than over-engineering from the start or scrambling to refactor when things get messy.

Optimize for the reader. Code is read far more often than it’s written. We all know this, but it’s easy to forget in the moment. The patterns in this guide prioritize clarity and predictability over cleverness. When someone new joins your team and opens the codebase for the first time, they should be able to find what they’re looking for and understand why it’s structured the way it is.

How to use this guide

You don’t need to read this cover to cover. If you’re starting a new project, the Architecture section will help you set up a solid foundation. If you’re working on an existing project and want to improve a specific area, just jump straight to the relevant chapter. Each chapter is self-contained enough to be useful on its own, though they reference each other where topics overlap.

I’ve tried to make the code examples realistic. They use the actual crate APIs you’d use in production, with proper error handling and real type signatures. No todo!() placeholders where the hard parts go. Where there are meaningful trade-offs between approaches, I explain both options and give you enough context to make the right call for your situation.

One more thing: this is a living document. The Rust web ecosystem is still evolving fast, and what we consider best practice today will keep getting refined. The principles tend to be more stable than the specific crates or APIs, though, and that’s where I think most of the lasting value here lies. Let’s get into it.

Project Structure

If you’ve ever opened a project and had no idea where to put your new file, you know how much folder structure matters. It sounds like a small thing, but I’ve watched teams burn hours arguing over where code should live, and I’ve seen projects where everything ended up in one giant src/ directory because nobody made a decision early on. Not fun.

There’s no single correct way to structure a Rust web application, but there are patterns that work well in practice. In this chapter we’ll look at two: a single-crate layout for small-to-medium projects, and a workspace layout for when things get bigger. They follow the same underlying principles, just at different scales.

Principles

Before we look at specific directory trees, let me walk you through the ideas that drive them.

Group by architectural layer first, then by concern within each layer. Our primary division is between api, domain, and infra, which enforces the dependency rule we talked about earlier. Within each layer, we organize modules by concern (handlers, models, repositories). Now, I’ll be honest: this still means jumping between api/handlers/users.rs, domain/models/user.rs, and infra/repositories/user_repo.rs when you’re working on a user feature. That’s a real tradeoff. As a codebase grows, some teams find it helpful to organize by feature slice within the layers (like domain/users/model.rs, domain/users/service.rs) rather than by role. Either approach works. What matters most is that the layer boundaries stay clear.

Enforce dependency direction. This one is the big one. Dependencies should flow inward, from the HTTP layer toward the domain. Your domain module should never import anything from api or infra. That’s what keeps your business logic portable and testable in isolation. In a single crate, you enforce this by convention and code review (which, yes, means trusting your team). In a workspace, Cargo enforces it for you through the crate dependency graph, which is much nicer.

Keep main.rs thin. Your entry point should do exactly three things: load configuration, construct the application (wiring up dependencies), and start the server. That’s it. If your main.rs is getting long, that’s usually a sign that setup logic needs to move into its own modules.

Use visibility to create boundaries. Rust’s module visibility system (pub, pub(crate), pub(super), and the default private) is your tool for controlling what code can reach what. Expose only what needs to be public through your mod.rs files, and keep implementation details private. If you’ve come from JavaScript, think of it as the barrel-export pattern, but enforced by the compiler instead of by good intentions.
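As a small sketch of visibility as a boundary, shown with inline modules for brevity (in the real layout these would be separate files; the names are illustrative):

```rust
mod domain {
    pub mod services {
        /// Public: the rest of the crate may call this.
        pub fn register_user(name: &str) -> Result<String, String> {
            let name = normalize(name)?;
            Ok(name)
        }

        /// Private: an implementation detail. Code outside `services`
        /// cannot call it, and the compiler enforces that.
        fn normalize(name: &str) -> Result<String, String> {
            let trimmed = name.trim();
            if trimmed.is_empty() {
                Err("name must not be empty".into())
            } else {
                Ok(trimmed.to_string())
            }
        }
    }
}
```

If a handler ever tries to call `normalize` directly, the build fails — the boundary is not a convention, it’s a compile error.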

Single-crate layout

This is what I reach for on most projects. It gives you clean separation between layers without the overhead of managing multiple crates, and you can always evolve it into a workspace later if the project grows. Let’s take a look.

my-app/
├── Cargo.toml
├── .env.example
├── migrations/
│   ├── 20240101_create_users.sql
│   └── 20240102_create_posts.sql
├── src/
│   ├── main.rs
│   ├── lib.rs
│   ├── config.rs
│   ├── error.rs
│   │
│   ├── domain/
│   │   ├── mod.rs
│   │   ├── models/
│   │   │   ├── mod.rs
│   │   │   ├── user.rs
│   │   │   └── post.rs
│   │   ├── errors.rs
│   │   ├── ports/          # Trait definitions (optional, for trait-based abstractions)
│   │   │   └── user_repository.rs
│   │   └── services/
│   │       ├── mod.rs
│   │       ├── user_service.rs
│   │       └── post_service.rs
│   │
│   ├── infra/
│   │   ├── mod.rs
│   │   ├── db.rs
│   │   └── repositories/
│   │       ├── mod.rs
│   │       ├── user_repo.rs
│   │       └── post_repo.rs
│   │
│   └── api/
│       ├── mod.rs
│       ├── routes.rs
│       ├── state.rs
│       ├── handlers/
│       │   ├── mod.rs
│       │   ├── health.rs
│       │   ├── users.rs
│       │   └── posts.rs
│       ├── extractors/
│       │   ├── mod.rs
│       │   ├── validated_json.rs
│       │   └── auth_user.rs
│       ├── middleware/
│       │   ├── mod.rs
│       │   └── auth.rs
│       └── dtos/
│           ├── mod.rs
│           ├── user_dto.rs
│           └── post_dto.rs
│
├── tests/
│   ├── common/
│   │   ├── mod.rs
│   │   └── test_app.rs
│   └── api/
│       ├── health_test.rs
│       └── users_test.rs

Let’s walk through what each piece does.

main.rs and lib.rs

The main.rs file is our entry point. It loads configuration, initializes tracing, builds the application, and starts the server. That’s all it does, and I want to keep it that way.

use my_app::{config::Config, create_app, observability};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let config = Config::from_env()?;
    observability::init_tracing(&config.log_level);

    let app = create_app(config.clone()).await?;

    let addr = format!("0.0.0.0:{}", config.port);
    let listener = tokio::net::TcpListener::bind(&addr).await?;
    tracing::info!("listening on {}", listener.local_addr()?);
    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal()) // defined in the Deployment chapter
        .await?;

    Ok(())
}

The lib.rs file declares our modules and provides the create_app function. This is also what your integration tests will call, which means your tests exercise the exact same wiring as production. I really like this pattern because it means there’s no “test-only” setup path that can silently diverge from the real thing.

pub mod api;
pub mod config;
pub mod domain;
pub mod error;
pub mod infra;
pub mod observability;

use api::state::AppState;
use axum::Router;
use config::Config;

pub async fn create_app(config: Config) -> anyhow::Result<Router> {
    let pool = infra::db::create_pool(
        config.database_url.expose_secret()
    ).await?;
    sqlx::migrate!("./migrations").run(&pool).await?;

    let state = AppState::new(config, pool);
    let router = api::routes::router(state);
    Ok(router)
}

The domain module

This is the heart of your application, and it should have zero dependencies on Axum, SQLx, or any other framework crate. The only external crates it typically needs are serde (for serialization), thiserror (for error types), and basic utility crates like uuid and chrono. That might seem limiting, but it’s exactly the constraint that keeps this layer honest.

The models/ directory contains your domain entities and value objects, the types that represent the core concepts your application works with. The services/ directory is where the business logic lives, the code that operates on those models. And errors.rs defines domain-specific error types.

Now, what if your domain needs to talk to the outside world? A database, an email service, some third-party API? It defines trait contracts in a ports/ directory. The actual implementations of those traits live over in infra. We’ll see more of this pattern later.

The infra module

This is where your database code lives. db.rs handles connection pool creation, and repositories/ contains the concrete implementations of your domain’s repository traits.

Each repository struct holds a reference to the connection pool and implements the corresponding trait from the domain layer. Its job is translating between the database’s row types and the domain’s model types. You might think of it as a translator that speaks both SQL and Rust domain language.
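Here is a sketch of that translation with hypothetical types: `UserRow` mirrors the database schema, `User` is the domain model, and a `From` impl is the bridge between them. Real row types would carry database-specific types (UUIDs, timestamps); plain strings are used here to keep the sketch self-contained.

```rust
/// Mirrors the `users` table as the database hands it back.
pub struct UserRow {
    pub id: String,
    pub email: String,
    pub name: String,
}

/// The domain model — what the rest of the application works with.
#[derive(Debug, PartialEq)]
pub struct User {
    pub id: String,
    pub email: String,
    pub name: String,
}

/// The repository calls `row.into()` after every query, so row types
/// never leak past the infra layer.
impl From<UserRow> for User {
    fn from(row: UserRow) -> Self {
        User {
            id: row.id,
            email: row.email,
            name: row.name,
        }
    }
}
```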

The api module

This is the Axum-specific layer. It knows about HTTP concepts like status codes, JSON bodies, headers, and middleware. It does not contain business logic. If you find yourself writing an if statement about business rules in a handler, that’s your cue to move it into a service.

The handlers/ are thin functions that extract data from the request, call a service method, and return a response. I like to keep them really short, ideally under 20 lines. The dtos/ define the shapes of request and response bodies, kept separate from the domain models so that your API contract can evolve independently of your internal representation. The extractors/ contain custom Axum extractors like ValidatedJson or AuthUser (we’ll build those in later chapters). And middleware/ holds any custom Tower middleware.

Finally, routes.rs assembles the router, composing the individual route groups and applying middleware layers.

Workspace layout

At some point your project might grow large enough that compile times start to hurt, or maybe you have multiple teams working on different parts of the system. That’s when I’d consider splitting into a Cargo workspace. The workspace enforces the dependency rules at the crate level: if the domain crate doesn’t list sqlx in its Cargo.toml, no one can accidentally import it there. The compiler has your back.

my-app/
├── Cargo.toml                # [workspace] definition
├── .env.example
├── migrations/
│
├── crates/
│   ├── domain/
│   │   ├── Cargo.toml        # deps: serde, thiserror, uuid, chrono
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── models/
│   │       ├── services/
│   │       ├── ports/        # trait definitions
│   │       └── errors.rs
│   │
│   ├── infra/
│   │   ├── Cargo.toml        # deps: domain, sqlx, reqwest, etc.
│   │   └── src/
│   │       ├── lib.rs
│   │       ├── postgres/
│   │       ├── redis/
│   │       └── email/
│   │
│   ├── api/
│   │   ├── Cargo.toml        # deps: domain, infra, axum, tower, tower-http
│   │   └── src/
│   │       ├── main.rs
│   │       ├── lib.rs
│   │       ├── routes/
│   │       ├── handlers/
│   │       ├── middleware/
│   │       ├── extractors/
│   │       └── dtos/
│   │
│   └── shared/
│       ├── Cargo.toml
│       └── src/
│           ├── lib.rs
│           ├── config.rs
│           └── observability.rs

The root Cargo.toml defines the workspace and, just as importantly, the shared dependency versions. This is one of my favorite Cargo features because it means you don’t end up with three different versions of serde across your crates:

[workspace]
members = ["crates/*"]
resolver = "2"

[workspace.dependencies]
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.8", features = ["runtime-tokio-rustls", "postgres", "uuid", "chrono", "migrate"] }
axum = { version = "0.8", features = ["macros"] }
tower = "0.5"
tower-http = { version = "0.6", features = ["cors", "compression-gzip", "trace", "timeout", "request-id", "limit"] }
thiserror = "2"
anyhow = "1"
uuid = { version = "1", features = ["v4", "serde"] }
chrono = { version = "0.4", features = ["serde"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }

Individual crates then inherit from these shared versions with .workspace = true:

# crates/domain/Cargo.toml
[package]
name = "domain"
version = "0.1.0"
edition = "2021"

[dependencies]
serde.workspace = true
thiserror.workspace = true
uuid.workspace = true
chrono.workspace = true
# Note: no axum, no sqlx

The key benefit here is that Cargo itself prevents architectural violations. If someone on your team tries to add a SQLx import in the domain crate, the build just fails because sqlx isn’t in its dependency list. I’ve seen enough bugs from “oops, I accidentally coupled the domain to the database” that I really appreciate having the compiler catch this for me. It’s a much stronger guarantee than relying on code review alone.

When to choose which

My advice: start with the single-crate layout. Seriously. It gives you all the structural benefits of clean separation without the overhead of managing a workspace. You can always split into a workspace later when the need actually arises, and it’s not a particularly difficult refactor because the module boundaries are already in place.

So when would you actually move to a workspace? In my experience, it’s when one of these starts to bite:

  • Compile times are getting painful and you want incremental builds to only recompile what changed
  • Multiple teams are working on different parts of the system and you want hard boundaries between their areas
  • You want to publish some crates independently (for example, a shared domain library used by multiple services)
  • The project has grown to the point where a single src/ directory feels unwieldy

The workspace layout also works well for monorepo setups where you have multiple services (an API server, a background worker, a CLI tool) that share the same domain and infrastructure code. We’ll touch on that kind of setup in later chapters when we talk about background jobs.

Architectural Patterns

If you’ve ever worked on a codebase where the database query logic was mixed into the HTTP handler, and the business rules lived in three different places, you know how painful that gets over time. I’ve been there more than once, and it always ends the same way: you’re afraid to change anything because you can’t tell what depends on what.

This chapter is about the patterns that prevent that kind of mess. We’ll look at how to structure a Rust web application so the pieces stay clean and independent. I’m not going to prescribe one true way, because the right level of structure depends on your project. But I do want you to understand the options well enough to make that call yourself.

The common thread across all of these patterns is the dependency rule: dependencies point inward, from infrastructure toward the domain. Your business logic should never know about HTTP status codes, SQL queries, or which web framework you’re using. When you get this right, your domain becomes portable, testable, and surprisingly easy to evolve as things change around it.

The layered model

The simplest architecture that actually holds up in production divides your code into three layers. You’ve probably seen some version of this before, but let’s walk through what each layer does and, more importantly, what it should not be doing.

┌──────────────────────────────────────────────┐
│   API Layer                                  │
│   Handlers, routes, middleware, DTOs         │
│   Knows about: Axum, HTTP, JSON              │
├──────────────────────────────────────────────┤
│   Domain Layer                               │
│   Models, services, business rules, ports    │
│   Knows about: nothing external              │
├──────────────────────────────────────────────┤
│   Infrastructure Layer                       │
│   Repositories, database, external APIs      │
│   Knows about: SQLx, Diesel, reqwest, etc.   │
└──────────────────────────────────────────────┘

The API layer is the translation layer between HTTP and your domain. It deserializes request bodies into domain types, calls service methods, and serializes the results back into HTTP responses. Keep it thin. If you find yourself writing business logic in a handler, stop. That logic belongs in the domain layer. I’ve seen enough handlers turn into 200-line monsters to know this is worth being strict about early.

The domain layer is the heart of the whole thing. This is where your business entities live, your value objects, your service functions, and the trait definitions (ports) that describe what the domain needs from the outside world, things like “save a user” or “send a notification.” The key part: this layer has no idea how those capabilities are actually implemented. It doesn’t import sqlx or axum or any other framework crate. It’s just pure business logic.

The infrastructure layer provides the concrete implementations of those domain traits. It knows about specific databases, external HTTP APIs, email providers, message queues, all of that. It depends on the domain layer (because it implements the domain’s traits), but the domain layer never depends on it. That one-way dependency is what makes the whole thing work.

Why this ordering matters

You might look at this and think “okay, that’s a nice diagram, but does it actually matter in practice?” It really does. Let me give you the concrete payoffs.

When your domain doesn’t depend on your database, you can test your business logic by plugging in a simple in-memory implementation of the repository trait. No running PostgreSQL instance needed just to verify that your pricing calculation is correct. Your test suite runs in milliseconds, and you can run it on a plane. I love that.

When your domain doesn’t depend on Axum, you can reuse it in a CLI tool, a background worker, or a gRPC service without dragging the entire HTTP layer along for the ride.

And when your database implementation changes (say you migrate from PostgreSQL to CockroachDB, or you swap SQLx for Diesel), the domain layer doesn’t need to change at all. You rewrite the infrastructure layer and the rest of the application doesn’t even notice. I’ve actually done this migration on a real project, and it went way smoother than I expected precisely because we had this separation in place.

Trait-based abstractions (ports and adapters)

So how do we actually enforce this separation in Rust? Traits. The domain layer defines traits that describe the operations it needs. The infrastructure layer provides structs that implement those traits. And the application layer wires everything together at startup. If you’ve worked with interfaces in other languages, this will feel familiar, but Rust’s trait system gives us some really nice compile-time guarantees on top.

Here is what a repository port looks like in the domain layer:

// domain/ports/user_repository.rs

use std::future::Future;
use crate::domain::models::{CreateUserRequest, Email, User, UserId};
use crate::domain::errors::CreateUserError;

pub trait UserRepository: Send + Sync + 'static {
    fn create(
        &self,
        req: &CreateUserRequest,
    ) -> impl Future<Output = Result<User, CreateUserError>> + Send;

    fn find_by_id(
        &self,
        id: &UserId,
    ) -> impl Future<Output = Result<Option<User>, anyhow::Error>> + Send;

    fn find_by_email(
        &self,
        email: &Email,
    ) -> impl Future<Output = Result<Option<User>, anyhow::Error>> + Send;
}

A few things to note about the trait bounds. The Send + Sync + 'static bounds are there because Axum needs to share state across async tasks that might run on different threads. If you’ve ever fought a “future is not Send” compiler error, this is why those bounds matter. Since Rust 1.75, async fn in traits and return-position impl Trait are stable, so you can write these signatures without the #[async_trait] macro. That was a really welcome change.

There are two limitations to be aware of, though. First, traits that use -> impl Future or async fn are not object-safe, which means you can’t use them as trait objects (dyn Trait). Concretely, Arc<dyn UserRepository> won’t compile. If you need dynamic dispatch (we’ll cover that below), you still need either the async_trait crate or manually boxed futures. The pattern shown here works with static dispatch (generics), which is what I’d recommend as your default.

Second, for public library traits (as opposed to internal application ports), bare async fn is problematic because callers can’t add bounds like Send to the returned future without a breaking API change. If you’re writing a trait that will be consumed by code outside your crate, consider using the trait_variant crate (from the Rust async working group) to generate a Send variant automatically:

#[trait_variant::make(UserRepositorySend: Send)]
pub trait UserRepository { ... }

For internal application ports (which is what most Axum services need), native async traits work great.

One more small thing: notice that Clone is left off the trait definition. Cloneability is a property of the holder (the Arc wrapper, the AppState struct), not a requirement of the domain contract itself. Your concrete repository implementations can still derive Clone if they need to. Keeping Clone out of the trait keeps the contract focused on what the domain actually cares about.

Now let’s look at the concrete implementation in the infrastructure layer:

// infra/repositories/postgres_user_repo.rs

use sqlx::PgPool;
use uuid::Uuid;

use domain::errors::CreateUserError;
use domain::models::{CreateUserRequest, User};
use domain::ports::UserRepository;

// UserRow (the sqlx row type) is defined alongside this repository.

#[derive(Clone)]
pub struct PostgresUserRepo {
    pool: PgPool,
}

impl PostgresUserRepo {
    pub fn new(pool: PgPool) -> Self {
        Self { pool }
    }
}

impl UserRepository for PostgresUserRepo {
    async fn create(&self, req: &CreateUserRequest) -> Result<User, CreateUserError> {
        let row = sqlx::query_as!(
            UserRow,
            r#"INSERT INTO users (id, email, name, password_hash)
               VALUES ($1, $2, $3, $4)
               RETURNING *"#,
            Uuid::new_v4(),
            req.email.as_str(),
            req.name.as_str(),
            req.password_hash,
        )
        .fetch_one(&self.pool)
        .await
        .map_err(|e| match e {
            sqlx::Error::Database(ref db_err) if db_err.is_unique_violation() => {
                CreateUserError::Duplicate { email: req.email.clone() }
            }
            other => CreateUserError::Unknown(other.into()),
        })?;

        Ok(row.into())
    }

    // ... other methods
}

Notice how the infrastructure layer translates database-specific errors (like unique constraint violations) into domain-specific errors. The domain error type knows about “duplicate user,” not about SQL error codes. This is a small thing that makes a big difference when you’re debugging at 2 AM, because your error messages actually tell you what went wrong in business terms.
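For reference, here is a sketch of what that domain error type might look like. The book’s stack would derive the `Display` and `Error` impls with `thiserror`; the hand-written impls below are the std-only equivalent, and the `String` payloads (rather than `Email` / `anyhow::Error`) are a simplification to keep the sketch self-contained.

```rust
use std::fmt;

/// Domain-level failure modes for creating a user. Note there is no
/// SQL error code in sight — the infra layer translated it away.
#[derive(Debug)]
pub enum CreateUserError {
    Duplicate { email: String },
    Unknown(String),
}

impl fmt::Display for CreateUserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CreateUserError::Duplicate { email } => {
                write!(f, "a user with email {email} already exists")
            }
            CreateUserError::Unknown(msg) => write!(f, "unknown error: {msg}"),
        }
    }
}

impl std::error::Error for CreateUserError {}
```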

The service layer

Services sit inside the domain layer and orchestrate the business logic. They take the domain’s ports as generic parameters (or trait objects, if you go the dynamic dispatch route), call them in the right order, and enforce business rules. This is where the interesting stuff lives. If you want to understand what the application actually does, you should be able to read the service layer and get the full picture without wading through HTTP concerns or SQL queries.

// domain/services/user_service.rs

use crate::domain::errors::CreateUserError;
use crate::domain::models::{CreateUserRequest, Email, User, UserId, UserName};
use crate::domain::ports::UserRepository;

#[derive(Clone)]
pub struct UserService<R: UserRepository> {
    repo: R,
}

impl<R: UserRepository> UserService<R> {
    pub fn new(repo: R) -> Self {
        Self { repo }
    }

    pub async fn register(
        &self,
        name: UserName,
        email: Email,
        password: &str,
    ) -> Result<User, CreateUserError> {
        let password_hash = hash_password(password)?;

        let req = CreateUserRequest {
            name,
            email,
            password_hash,
        };

        self.repo.create(&req).await
    }

    pub async fn get_by_id(&self, id: &UserId) -> Result<Option<User>, anyhow::Error> {
        self.repo.find_by_id(id).await
    }
}

The service has no idea that R is a PostgreSQL repository. It could be an in-memory implementation used in tests, or a caching wrapper that checks Redis before hitting the database. The service only knows that whatever R is, it satisfies the UserRepository contract. That’s the whole magic of this pattern, and it’s what makes writing tests for your business logic genuinely pleasant.
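Here is a compact sketch of that substitution in action. Async is omitted for brevity — the real trait returns futures — but the mechanics are identical: a service generic over a trait, with an in-memory implementation plugged in. All names are illustrative.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// A simplified, synchronous stand-in for the repository port.
pub trait UserStore {
    fn insert(&self, id: u64, name: &str) -> Result<(), String>;
    fn get(&self, id: u64) -> Option<String>;
}

/// In-memory implementation — no database required. Ideal for tests.
#[derive(Default)]
pub struct InMemoryUserStore {
    users: Mutex<HashMap<u64, String>>,
}

impl UserStore for InMemoryUserStore {
    fn insert(&self, id: u64, name: &str) -> Result<(), String> {
        let mut users = self.users.lock().unwrap();
        if users.contains_key(&id) {
            return Err(format!("duplicate id {id}"));
        }
        users.insert(id, name.to_string());
        Ok(())
    }

    fn get(&self, id: u64) -> Option<String> {
        self.users.lock().unwrap().get(&id).cloned()
    }
}

/// Same shape as `UserService<R>`: generic over the store contract.
pub struct Service<S: UserStore> {
    store: S,
}

impl<S: UserStore> Service<S> {
    pub fn new(store: S) -> Self {
        Self { store }
    }

    pub fn register(&self, id: u64, name: &str) -> Result<(), String> {
        self.store.insert(id, name)
    }

    pub fn lookup(&self, id: u64) -> Option<String> {
        self.store.get(id)
    }
}
```

Swapping in a Postgres-backed store later would change nothing in `Service` — only the type parameter at the wiring site.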

Static vs. dynamic dispatch

Once you’re using trait-based abstractions, you’ll run into this question: should I use generics (static dispatch) or trait objects (dynamic dispatch)? It’s worth understanding the tradeoff.

Static dispatch uses generics. The compiler monomorphizes the code for each concrete type, which means there’s no runtime overhead from virtual dispatch. This is the approach we used above with UserService<R: UserRepository>.

// Static dispatch: zero runtime cost, resolved at compile time
pub struct UserService<R: UserRepository> {
    repo: R,
}

Dynamic dispatch uses trait objects behind Arc. The method calls go through a vtable at runtime, which adds a small overhead. In practice this is negligible for web applications where your database round-trips dominate the latency budget anyway. The benefit is simpler type signatures and easier interchangeability at runtime.

Here’s the catch, though: traits that use -> impl Future or async fn are not object-safe in Rust, so you can’t use them with dyn Trait directly. If you need dynamic dispatch, you have two options. The async_trait crate rewrites async methods into boxed futures, making the trait object-safe:

use std::sync::Arc;

use async_trait::async_trait;

#[async_trait]
pub trait UserRepository: Send + Sync + 'static {
    async fn create(&self, req: &CreateUserRequest) -> Result<User, CreateUserError>;
    async fn find_by_id(&self, id: UserId) -> anyhow::Result<Option<User>>;
}

// Now this works:
pub struct UserService {
    repo: Arc<dyn UserRepository>,
}

You can also write the boxed futures by hand, but honestly the async_trait macro does the same thing with less boilerplate and it’s not worth the extra typing.

My recommendation: start with static dispatch. The compile-time guarantees are stronger, the error messages are clearer, and you sidestep the whole object-safety question. Reach for dynamic dispatch when you genuinely need runtime flexibility, like swapping implementations based on configuration, or when the generic type parameters start making your signatures unwieldy across many layers. You’ll know when you hit that point because you’ll be staring at a type signature that looks like alphabet soup.

Hexagonal architecture and onion architecture

If you’ve spent any time reading about software architecture, you’ve probably bumped into these terms. They sound fancy, but they’re really just variations on the same core idea we’ve been talking about: put the domain at the center, push framework-specific code to the edges.

Hexagonal architecture (also called ports and adapters) emphasizes the symmetry between inbound adapters (things that call into your domain, like HTTP handlers or gRPC servers) and outbound adapters (things your domain calls out to, like databases or email services). The “ports” are the trait definitions, and the “adapters” are the concrete implementations. We’ve already been doing this.

Onion architecture visualizes the same idea as concentric rings. The innermost ring is the domain model. The next ring out is the application/service layer. Then comes the infrastructure layer. And the outermost ring is the presentation layer. Dependencies always point inward.

In practice, these are different names for the same fundamental pattern. The three-layer model we walked through at the beginning of this chapter is a pragmatic implementation of these ideas. You don’t need to adopt any particular naming convention or follow the formal structure to the letter. What I care about (and what you should care about) is that the dependency rule is understood and enforced. The labels are just labels.

Choosing the right level of architecture

I am a pragmatist at heart, and I want to be honest with you: not every project needs a full hexagonal architecture with trait-based ports, generic services, and a five-crate workspace. Over-engineering is a real cost. It adds boilerplate, it increases cognitive overhead, and it slows down development for benefits that may never materialize. I’ve over-engineered plenty of things myself and regretted it every time.

Here’s a rough guide based on what I’ve found works well:

Flat modules are great for prototypes, small APIs, and projects with a handful of endpoints. Put your handlers, models, and queries in separate modules, but don’t bother with traits or service layers. You can always refactor later, and for a small service, the simpler approach is genuinely better.

Layered architecture (single crate) is the sweet spot for most production applications. Separate your domain from your infrastructure, use services to encapsulate business logic, and keep handlers thin. This gives you testability and maintainability without excessive ceremony. If you’re building something that will run in production and be maintained by a team, this is probably where you want to start.

Hexagonal/onion architecture (workspace) makes sense when you have a complex domain with significant business logic, multiple teams working in parallel, or a need to support multiple inbound channels (HTTP, gRPC, CLI, message consumers) against the same domain. The upfront investment is real, but it pays off over the lifetime of a long-lived codebase. I’ve worked on projects like this where having the clean separation saved us weeks of refactoring pain later.

The important thing is to start with clear separation between your layers, even in the simplest structure. The specific patterns you use to enforce that separation can evolve as the project grows. Moving from direct function calls to trait-based abstractions is a pretty straightforward refactor when the module boundaries are already clean. So don’t stress about getting the architecture “perfect” from day one. Get the layers right, and the rest will follow.

Domain Modeling

If you’ve worked on a codebase of any real size, you’ve probably run into this: a function takes three String parameters, someone swaps two of them, and you don’t find out until a user gets a very confusing email. I’ve seen this bug in production more than once, and it’s the kind of thing that makes you wish the compiler could just catch it for you.

Good news: in Rust, it can. Rust’s type system is expressive enough that we can encode our domain rules directly into types, so the compiler rejects bad data before our code ever runs. This chapter walks through the patterns that make that work.

The core idea comes from the functional programming world: parse, don’t validate. Instead of accepting raw strings and then sprinkling boolean checks all over the place, we define types that can only be constructed from valid data. Once you have a value of that type, you know it’s good. No re-checking needed.

The newtype pattern

A newtype is just a struct that wraps a single value. At runtime there’s zero overhead: the compiler treats it the same as the inner value. But at compile time it’s a completely different type, which means the compiler will reject any attempt to use one where the other is expected. That’s a pretty great deal.

Let me show you what I mean. Say your application works with email addresses. You could represent them as String everywhere, but that tells the compiler (and the reader) nothing about what kind of string it is. A function that accepts (String, String) for name and email can be called with the arguments swapped, and nobody will notice until a user gets an email addressed to “alice@example.com” with the subject line “Dear Alice Johnson”. Ask me how I know.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Email(String);

impl Email {
    /// WARNING: This is a simplified validator for illustration only.
    /// In production, use a proper email validation library or at minimum
    /// a well-tested regex. The `validator` crate's `#[validate(email)]`
    /// attribute handles this at the DTO layer; this constructor is for
    /// the domain layer where you want the type itself to guarantee validity.
    pub fn parse(raw: &str) -> Result<Self, DomainError> {
        let trimmed = raw.trim().to_lowercase();
        if trimmed.contains('@') && trimmed.len() >= 3 {
            Ok(Self(trimmed))
        } else {
            Err(DomainError::InvalidEmail(raw.to_string()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl AsRef<str> for Email {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

// `Display` lets the email appear in error messages; `CreateUserError`
// relies on it for its `{email}` interpolation below.
impl std::fmt::Display for Email {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}
}

A few things worth pointing out here.

The inner String field is private. There’s no way to construct an Email except through parse, which enforces our invariant. And there’s no way to get a mutable reference to the inner string, so the invariant can’t be violated after construction. That’s the whole trick.

The parse method returns a Result, and this is the “parse” in “parse, don’t validate”. It either gives you back a valid Email or an error explaining why the input was rejected. Once you have an Email, you know it’s good. You never need to re-check it when you pass it to another function, which is a really nice property to have.

The AsRef<str> implementation makes the type easy to use in contexts that accept string references, like formatting or database queries, without exposing the ability to modify the inner value.
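Here’s a quick usage sketch of the parse flow. The DomainError stand-in is trimmed down so the example is self-contained, and the welcome_subject helper is hypothetical:

```rust
#[derive(Debug, PartialEq)]
enum DomainError {
    InvalidEmail(String),
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Email(String);

impl Email {
    // Same simplified rule as above: trim, lowercase, require an '@'.
    fn parse(raw: &str) -> Result<Self, DomainError> {
        let trimmed = raw.trim().to_lowercase();
        if trimmed.contains('@') && trimmed.len() >= 3 {
            Ok(Self(trimmed))
        } else {
            Err(DomainError::InvalidEmail(raw.to_string()))
        }
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}

// Any function that receives an `Email` can trust it without re-checking.
fn welcome_subject(email: &Email) -> String {
    format!("Welcome, {}", email.as_str())
}
```

Note that parsing normalizes as well as validates: the stored value is always trimmed and lowercased, so equality and hashing behave consistently downstream.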

Building a vocabulary of domain types

Once you get the hang of the newtype pattern, I’d encourage you to apply it to every meaningful concept in your domain. It might feel like a lot of ceremony at first, but it pays off fast. Here’s what a set of domain types might look like for a user management system:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct UserId(Uuid);

impl UserId {
    pub fn new() -> Self {
        Self(Uuid::new_v4())
    }

    pub fn from_uuid(id: Uuid) -> Self {
        Self(id)
    }

    pub fn as_uuid(&self) -> &Uuid {
        &self.0
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UserName(String);

impl UserName {
    pub fn parse(raw: &str) -> Result<Self, DomainError> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return Err(DomainError::EmptyName);
        }
        if trimmed.len() > 100 {
            return Err(DomainError::NameTooLong);
        }
        // Reject characters that could cause problems in downstream systems
        let forbidden = ['/', '(', ')', '"', '<', '>', '\\', '{', '}'];
        if trimmed.chars().any(|c| forbidden.contains(&c)) {
            return Err(DomainError::NameContainsForbiddenCharacters);
        }
        Ok(Self(trimmed.to_string()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
}

Now look at what happens to our function signatures:

#![allow(unused)]
fn main() {
// Before: what is the first String? The second? Who knows.
fn create_user(name: String, email: String) -> Result<User, Error> { ... }

// After: the types make it self-documenting and impossible to mix up.
fn create_user(name: UserName, email: Email) -> Result<User, CreateUserError> { ... }
}

The second version isn’t just clearer to read. It’s physically impossible to call it with the arguments in the wrong order. The compiler will reject it. I love this about Rust: you can make whole categories of bugs unrepresentable. Not “caught by a test,” not “caught in code review,” but literally impossible to write.

Domain entities

An entity is a domain object with an identity (usually an ID) and a lifecycle. Think of the things that persist in your system: a user, an order, a blog post. These are your entities, and they tend to be the nouns your product team already talks about.

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub struct User {
    id: UserId,
    name: UserName,
    email: Email,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

impl User {
    /// Used by the repository layer when hydrating from the database.
    pub fn hydrate(
        id: UserId,
        name: UserName,
        email: Email,
        created_at: DateTime<Utc>,
        updated_at: DateTime<Utc>,
    ) -> Self {
        Self { id, name, email, created_at, updated_at }
    }

    pub fn id(&self) -> &UserId { &self.id }
    pub fn name(&self) -> &UserName { &self.name }
    pub fn email(&self) -> &Email { &self.email }
    pub fn created_at(&self) -> DateTime<Utc> { self.created_at }
    pub fn updated_at(&self) -> DateTime<Utc> { self.updated_at }
}
}

Notice that the fields are private and the struct only exposes getter methods that return references. This is intentional. If we need to update a user’s name, that should go through a method on the entity (or a service) that can enforce whatever rules apply to name changes. We don’t want random code reaching in and mutating fields directly. I’ve been bitten by that enough times in other languages to be a little paranoid about it.
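When you do need mutation, here’s a sketch of how it can stay safe. The rename method is hypothetical, and the types are trimmed down to keep the example self-contained:

```rust
#[derive(Debug, Clone, PartialEq)]
struct UserName(String);

impl UserName {
    fn parse(raw: &str) -> Result<Self, String> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return Err("name cannot be empty".into());
        }
        Ok(Self(trimmed.to_string()))
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}

struct User {
    name: UserName,
}

impl User {
    fn new(name: UserName) -> Self {
        Self { name }
    }

    // Every name change funnels through this one method, so any future
    // rule (audit logging, rate limiting, uniqueness checks) has exactly
    // one place to live. Callers can't reach `self.name` directly.
    fn rename(&mut self, new_name: UserName) {
        self.name = new_name;
    }

    fn name(&self) -> &UserName {
        &self.name
    }
}
```

Because rename only accepts an already-parsed UserName, the entity can’t end up holding an invalid name no matter who calls it.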

Request types vs. entity types

Here’s a pattern that might seem obvious once you see it, but I’ve watched teams skip it and regret it later. The data you need to create something is not the same as the thing itself. A CreateUserRequest contains a name, an email, and a password. A User has an ID, timestamps, and no password (it stores a hash instead). These are different concepts, so they should be different types.

#![allow(unused)]
fn main() {
/// The data needed to register a new user.
/// All fields have already been parsed into domain types.
pub struct CreateUserRequest {
    pub name: UserName,
    pub email: Email,
    pub password_hash: String,
}
}

This type lives in the domain layer. It’s separate from the HTTP request DTO (which lives in the API layer) and separate from the database row (which lives in the infrastructure layer). Each layer gets its own types, and the explicit conversions between them keep the boundaries clean. Yes, it’s more types to maintain. But you’ll thank yourself the first time you need to change the API response without touching the database schema.

Conversions between layers

So we have different types in each layer, which means we need to convert between them at the boundaries. The From and TryFrom traits are the idiomatic way to do this in Rust, and they work really well for this.

#![allow(unused)]
fn main() {
// Database row type (lives in infra layer)
#[derive(sqlx::FromRow)]
pub struct UserRow {
    pub id: Uuid,
    pub name: String,
    pub email: String,
    pub password_hash: String,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}

// Convert from database row to domain entity.
// Using TryFrom rather than From because database values may not satisfy
// current domain invariants: legacy rows, manual SQL fixes, schema migrations,
// and previous bugs can all produce data that would fail validation today.
impl TryFrom<UserRow> for User {
    type Error = anyhow::Error;

    fn try_from(row: UserRow) -> Result<Self, Self::Error> {
        Ok(User::hydrate(
            UserId::from_uuid(row.id),
            UserName::parse(&row.name)
                .map_err(|e| anyhow::anyhow!("corrupt user name in row {}: {}", row.id, e))?,
            Email::parse(&row.email)
                .map_err(|e| anyhow::anyhow!("corrupt email in row {}: {}", row.id, e))?,
            row.created_at,
            row.updated_at,
        ))
    }
}

// API response type (lives in api layer)
#[derive(Serialize)]
pub struct UserResponse {
    pub id: Uuid,
    pub name: String,
    pub email: String,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}

// Convert from domain entity to API response
impl From<User> for UserResponse {
    fn from(user: User) -> Self {
        Self {
            id: *user.id().as_uuid(),
            name: user.name().as_str().to_string(),
            email: user.email().as_str().to_string(),
            created_at: user.created_at(),
            updated_at: user.updated_at(),
        }
    }
}
}

Notice that the response type excludes the password hash. This is one of my favorite things about having separate types for each boundary: you literally cannot leak sensitive data into the API response because the response type doesn’t have that field. No code review required to catch it, no linter rule to forget about. The struct just doesn’t have it.

Domain errors

Our domain errors should describe business-level failures, not infrastructure-level ones. The domain knows about “user already exists” and “invalid email format.” It doesn’t know (and shouldn’t know) about “SQL unique constraint violation” or “HTTP 409 Conflict.” If you find yourself importing sqlx or axum types in your domain error enum, something has gone sideways.

#![allow(unused)]
fn main() {
use crate::domain::models::Email;

#[derive(Debug, thiserror::Error)]
pub enum DomainError {
    #[error("invalid email address: {0}")]
    InvalidEmail(String),

    #[error("name cannot be empty")]
    EmptyName,

    #[error("name exceeds maximum length")]
    NameTooLong,

    #[error("name contains forbidden characters")]
    NameContainsForbiddenCharacters,
}

#[derive(Debug, thiserror::Error)]
pub enum CreateUserError {
    #[error("a user with email {email} already exists")]
    Duplicate { email: Email },

    #[error(transparent)]
    Unknown(#[from] anyhow::Error),
}
}

The CreateUserError::Unknown variant uses anyhow::Error as a catch-all for unexpected failures. This lets the infrastructure layer convert arbitrary errors (like database connection timeouts) into domain errors without the domain having to know about every possible failure mode. The API layer then maps Unknown to a generic 500 response while logging the full details server-side. You might wonder if this is too loose. In my experience, it strikes a good balance: your expected errors are typed, and the unexpected ones still get handled gracefully.
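The mapping the API layer performs can be sketched like this, using std-only stand-ins for the error types so the example is self-contained (the real Unknown variant wraps anyhow::Error, and the real mapping lives in an IntoResponse impl):

```rust
// Std-only stand-in for the domain error described above.
#[derive(Debug)]
enum CreateUserError {
    Duplicate { email: String },
    Unknown(String), // the real variant wraps anyhow::Error
}

// The API layer decides which business error becomes which status code.
// Unknown deliberately maps to an opaque 500: the details belong in the
// server-side logs, not in the response body.
fn status_for(err: &CreateUserError) -> (u16, String) {
    match err {
        CreateUserError::Duplicate { email } => {
            (409, format!("a user with email {email} already exists"))
        }
        CreateUserError::Unknown(_) => (500, "internal server error".to_string()),
    }
}
```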

Putting it all together

Let’s trace the complete flow from HTTP request to database and back, so you can see how the types change at each boundary:

HTTP POST /api/users
  { "name": "Alice", "email": "alice@example.com", "password": "hunter2" }

  ↓ Axum deserializes into CreateUserDto (api/dtos)
  ↓ Handler parses fields into domain types: UserName, Email
  ↓ Handler calls user_service.register(name, email, &password)

  ↓ Service hashes password, builds CreateUserRequest (domain)
  ↓ Service calls repo.create(&request)

  ↓ Repository converts to SQL, executes INSERT
  ↓ Repository gets UserRow back from the database
  ↓ Repository converts UserRow into User (domain entity)

  ↑ Service returns User to handler
  ↑ Handler converts User into UserResponse (api/dtos)
  ↑ Handler returns (StatusCode::CREATED, Json(response))

HTTP 201 Created
  { "id": "...", "name": "Alice", "email": "alice@example.com", ... }

At every boundary, the data gets converted into the type that belongs to that layer. The API layer works with DTOs. The domain layer works with domain types. The infrastructure layer works with database row types. The conversions between them are explicit, auditable, and enforced by the compiler.

I know this might feel like a lot of ceremony compared to just passing a single struct all the way through. I felt the same way when I first started doing this. But here’s what I’ve found in practice: when you need to add a field to the API response, you change the DTO. When you need to add a column to the database, you change the row type. When you need to add a business rule, you change the domain type. Each change stays in the layer it belongs to, and the From implementations tell you exactly how the layers connect. Once you’ve worked this way on a project that’s grown past a few thousand lines, it’s hard to go back.

Putting It All Together

If you’ve been reading the earlier chapters, you’ve absorbed a lot of concepts: layered architecture, domain modeling, keeping your layers properly separated. That’s all great in theory. But what does it actually look like when you sit down and build something? That’s what this chapter is for. We’re going to walk through a single feature from start to finish, every file, every type, every connection between the layers. By the end you’ll have seen one complete vertical slice through the entire application, from the HTTP request arriving at the handler, through the domain service and repository port, down to the database, and back up to the API response.

The feature we’re building is user registration: a POST /api/v1/users endpoint that accepts a name, email, and password, validates them, hashes the password, persists the user, and returns the created user. Nothing exotic, but it touches every layer, which is exactly why I picked it.

The files we will create

Before we dive in, here’s where each piece lives in our project structure:

src/
├── domain/
│   ├── models/
│   │   └── user.rs          # Entity, value objects (Email, UserName, UserId)
│   ├── errors.rs             # CreateUserError
│   ├── ports/
│   │   └── user_repository.rs  # The trait (port)
│   └── services/
│       └── user_service.rs   # Business logic
│
├── infra/
│   └── repositories/
│       └── user_repo.rs      # PostgresUserRepo (implements the port)
│
├── api/
│   ├── dtos/
│   │   └── user_dto.rs       # CreateUserDto, UserResponse
│   ├── handlers/
│   │   └── users.rs          # The HTTP handler
│   ├── routes.rs              # Route registration
│   └── state.rs               # AppState
│
├── error.rs                   # AppError (HTTP error mapping)
└── main.rs                    # Wiring everything together

We’ll build each piece starting from the inside (domain) and working our way outward. I find this order works best: by the time you reach the handler, all the types it depends on already exist.

Step 1: Domain models and value objects

These types live in src/domain/models/user.rs. The important thing is that they have zero dependencies on Axum, SQLx, or any framework crate. They enforce their own invariants through private fields and validated constructors, which means you can’t accidentally create a bogus Email or an empty UserName.

#![allow(unused)]
fn main() {
use uuid::Uuid;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct UserId(Uuid);

impl UserId {
    pub fn new() -> Self { Self(Uuid::new_v4()) }
    pub fn from_uuid(id: Uuid) -> Self { Self(id) }
    pub fn as_uuid(&self) -> &Uuid { &self.0 }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Email(String);

impl Email {
    pub fn parse(raw: &str) -> Result<Self, String> {
        let trimmed = raw.trim().to_lowercase();
        if trimmed.contains('@') && trimmed.len() >= 3 {
            Ok(Self(trimmed))
        } else {
            Err(format!("'{}' is not a valid email", raw))
        }
    }
    pub fn as_str(&self) -> &str { &self.0 }
}

// `CreateUserError`'s `{email}` message interpolation needs `Display`.
impl std::fmt::Display for Email {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UserName(String);

impl UserName {
    pub fn parse(raw: &str) -> Result<Self, String> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return Err("name cannot be empty".into());
        }
        if trimmed.len() > 100 {
            return Err("name cannot exceed 100 characters".into());
        }
        Ok(Self(trimmed.to_string()))
    }
    pub fn as_str(&self) -> &str { &self.0 }
}

/// The domain entity. Fields are private; access is through getters.
#[derive(Debug, Clone)]
pub struct User {
    id: UserId,
    name: UserName,
    email: Email,
    created_at: chrono::DateTime<chrono::Utc>,
}

impl User {
    /// Used by the repository when hydrating from the database.
    pub fn hydrate(
        id: UserId,
        name: UserName,
        email: Email,
        created_at: chrono::DateTime<chrono::Utc>,
    ) -> Self {
        Self { id, name, email, created_at }
    }

    pub fn id(&self) -> &UserId { &self.id }
    pub fn name(&self) -> &UserName { &self.name }
    pub fn email(&self) -> &Email { &self.email }
    pub fn created_at(&self) -> chrono::DateTime<chrono::Utc> { self.created_at }
}

/// The data needed to create a new user.
/// By the time this struct exists, all fields have been validated
/// and the password has been hashed.
pub struct NewUser {
    pub name: UserName,
    pub email: Email,
    pub password_hash: String,
}
}

Notice that User has no password_hash field. That’s intentional. The entity represents the public view of a user, the thing we’re comfortable handing back through the API. NewUser is a separate struct that carries the data needed to create one, including the hash. You might wonder why we don’t just slap an Option<String> on User for the password. In my experience, that leads to exactly the kind of bugs where someone accidentally serializes a hash into a JSON response. Two types, two purposes, no accidents.

Step 2: Domain errors

These live in src/domain/errors.rs. They describe business-level failures, the kind of things your product manager would understand, not infrastructure details like “the TCP connection timed out.”

#![allow(unused)]
fn main() {
use crate::domain::models::Email;

#[derive(Debug, thiserror::Error)]
pub enum CreateUserError {
    #[error("a user with email {email} already exists")]
    Duplicate { email: Email },

    #[error(transparent)]
    Unknown(#[from] anyhow::Error),
}
}

CreateUserError::Duplicate is a business rule violation: someone tried to register with an email that’s already taken. CreateUserError::Unknown is our escape hatch for unexpected infrastructure failures (database timeouts, connection errors, that sort of thing). The domain doesn’t know or care what specific infrastructure failure happened. It just knows something went sideways.

Step 3: The port (repository trait)

This is the concept that makes the whole layered approach actually work. If you take away one thing from this chapter, let it be this part. A port is a trait defined in the domain layer that describes an operation the domain needs from the outside world. The domain says “I need something that can save users and find them,” but it doesn’t say how. The trait lives in src/domain/ports/user_repository.rs.

#![allow(unused)]
fn main() {
use std::future::Future;
use crate::domain::models::{User, UserId, Email, NewUser};
use crate::domain::errors::CreateUserError;

pub trait UserRepository: Send + Sync + 'static {
    fn create(
        &self,
        new_user: &NewUser,
    ) -> impl Future<Output = Result<User, CreateUserError>> + Send;

    fn find_by_id(
        &self,
        id: &UserId,
    ) -> impl Future<Output = Result<Option<User>, anyhow::Error>> + Send;

    fn find_by_email(
        &self,
        email: &Email,
    ) -> impl Future<Output = Result<Option<User>, anyhow::Error>> + Send;
}
}

Look at the imports. This trait imports only domain types. No PgPool, no sqlx::query, no axum::State. That’s what makes it a port: it defines a boundary between the domain and the infrastructure without being coupled to either side.

The Send + Sync + 'static bounds are there because Axum shares state across async tasks on multiple threads. The trait uses impl Future<...> + Send as the return type, which works with Rust’s native async-in-traits (stable since 1.75) for static dispatch. If you later need dynamic dispatch (Arc<dyn UserRepository>), you’d switch to the async_trait crate, as I explain in the Architectural Patterns chapter.

Why bother with a port at all? I think of it as having two payoffs. First, it lets you test the service layer without a database. You write a simple in-memory implementation of UserRepository for your tests, and the service doesn’t know the difference. Second, it enforces the dependency rule at the module level: the domain can’t accidentally import SQLx because the trait doesn’t reference it. If someone tries to add a PgPool parameter to this file, they’ll quickly realize it doesn’t belong here.

When you don’t need a port: If your application is small, mostly CRUD, and unlikely to have a second implementation of the repository, you can skip the trait and have the service call a concrete repository struct directly. I’ve done this plenty of times for side projects and internal tools. The Anti-Patterns chapter discusses when that simplification makes sense. The port starts earning its keep when you have real business logic to test, when you want to reuse the domain from multiple entry points, or when the team is large enough that architectural guardrails matter.
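To make the first payoff concrete, here’s what an in-memory test double can look like. This sketch uses a synchronous, string-based simplification of the port (with hypothetical UserStore names) so it stays self-contained; the real trait is async and works with domain types:

```rust
use std::sync::Mutex;

// Synchronous simplification of the port, for illustration only.
trait UserStore {
    fn create(&self, email: &str) -> Result<(), String>;
    fn find_by_email(&self, email: &str) -> Option<String>;
}

// The test double: a Mutex-guarded Vec instead of Postgres.
// The service under test can't tell the difference.
struct InMemoryUserStore {
    emails: Mutex<Vec<String>>,
}

impl InMemoryUserStore {
    fn new() -> Self {
        Self { emails: Mutex::new(Vec::new()) }
    }
}

impl UserStore for InMemoryUserStore {
    fn create(&self, email: &str) -> Result<(), String> {
        let mut emails = self.emails.lock().unwrap();
        // Mirror the duplicate-email business rule the real repo enforces
        // via a unique constraint.
        if emails.iter().any(|e| e == email) {
            return Err(format!("a user with email {email} already exists"));
        }
        emails.push(email.to_string());
        Ok(())
    }

    fn find_by_email(&self, email: &str) -> Option<String> {
        self.emails
            .lock()
            .unwrap()
            .iter()
            .find(|e| e.as_str() == email)
            .cloned()
    }
}
```

A service generic over this trait can now be exercised in plain unit tests with no database container, no migrations, and no network.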

Step 4: The service

The service lives in src/domain/services/user_service.rs. This is where the actual business logic happens. It calls the repository through the port trait we just defined.

#![allow(unused)]
fn main() {
use crate::domain::models::{User, UserId, Email, UserName, NewUser};
use crate::domain::errors::CreateUserError;
use crate::domain::ports::UserRepository;

#[derive(Clone)]
pub struct UserService<R: UserRepository> {
    repo: R,
}

impl<R: UserRepository> UserService<R> {
    pub fn new(repo: R) -> Self {
        Self { repo }
    }

    pub async fn register(
        &self,
        name: UserName,
        email: Email,
        password: &str,
    ) -> Result<User, CreateUserError> {
        // Hash the password on a blocking thread so we do not tie up the
        // async runtime with CPU-intensive work. (`hash_password` is a
        // helper wrapping your hashing crate of choice, e.g. argon2.)
        let password = password.to_string();
        let password_hash = tokio::task::spawn_blocking(move || {
            hash_password(&password)
        })
        .await
        .map_err(|e| CreateUserError::Unknown(e.into()))? // the task panicked
        .map_err(|e| CreateUserError::Unknown(e.into()))?; // hashing itself failed

        let new_user = NewUser { name, email, password_hash };
        self.repo.create(&new_user).await
    }

    pub async fn get_by_id(&self, id: &UserId) -> Result<Option<User>, anyhow::Error> {
        self.repo.find_by_id(id).await
    }
}
}

The service is generic over R: UserRepository. It has no idea whether R is a real PostgreSQL repository or a test double. It just calls the trait methods and goes about its business. The register method is where the interesting stuff happens: it hashes the password on a blocking thread (you really don’t want bcrypt tying up your async runtime) and constructs the NewUser struct that the repository will persist.

Also notice what the service doesn’t know about. It has no concept of HTTP status codes, JSON serialization, or request/response types. It accepts domain types (UserName, Email) and returns domain types (User, CreateUserError). Translating between HTTP and domain concepts? That’s the handler’s job, and we’ll get to it shortly.

Step 5: The repository implementation

Now we cross the boundary into infrastructure. This file lives in src/infra/repositories/user_repo.rs and implements the port trait using SQLx. This is where the SQL actually lives.

#![allow(unused)]
fn main() {
use anyhow::Context;
use sqlx::PgPool;

use crate::domain::models::{User, UserId, Email, UserName, NewUser};
use crate::domain::errors::CreateUserError;
use crate::domain::ports::UserRepository;

/// The database row type. This is separate from the domain entity
/// because it maps to the database schema, which may differ from
/// the domain's representation.
#[derive(sqlx::FromRow)]
struct UserRow {
    id: uuid::Uuid,
    name: String,
    email: String,
    password_hash: String,
    created_at: chrono::DateTime<chrono::Utc>,
}

/// Convert a database row to a domain entity.
/// Uses TryFrom because stored data may not satisfy current
/// domain invariants (legacy rows, manual SQL fixes, migrations).
impl TryFrom<UserRow> for User {
    type Error = anyhow::Error;

    fn try_from(row: UserRow) -> Result<Self, Self::Error> {
        Ok(User::hydrate(
            UserId::from_uuid(row.id),
            UserName::parse(&row.name)
                .map_err(|e| anyhow::anyhow!("corrupt name in row {}: {}", row.id, e))?,
            Email::parse(&row.email)
                .map_err(|e| anyhow::anyhow!("corrupt email in row {}: {}", row.id, e))?,
            row.created_at,
        ))
    }
}

#[derive(Clone)]
pub struct PostgresUserRepo {
    pool: PgPool,
}

impl PostgresUserRepo {
    pub fn new(pool: PgPool) -> Self {
        Self { pool }
    }
}

impl UserRepository for PostgresUserRepo {
    async fn create(&self, new_user: &NewUser) -> Result<User, CreateUserError> {
        let row = sqlx::query_as!(
            UserRow,
            r#"
            INSERT INTO users (id, email, name, password_hash)
            VALUES ($1, $2, $3, $4)
            RETURNING id, email, name, password_hash, created_at
            "#,
            uuid::Uuid::new_v4(),
            new_user.email.as_str(),
            new_user.name.as_str(),
            &new_user.password_hash,
        )
        .fetch_one(&self.pool)
        .await
        .map_err(|e| match e {
            sqlx::Error::Database(ref db_err) if db_err.is_unique_violation() => {
                CreateUserError::Duplicate { email: new_user.email.clone() }
            }
            other => CreateUserError::Unknown(
                anyhow::anyhow!(other).context("failed to insert user")
            ),
        })?;

        row.try_into()
            .map_err(|e: anyhow::Error| CreateUserError::Unknown(e))
    }

    async fn find_by_id(&self, id: &UserId) -> Result<Option<User>, anyhow::Error> {
        let row = sqlx::query_as!(
            UserRow,
            "SELECT id, email, name, password_hash, created_at FROM users WHERE id = $1",
            id.as_uuid(),
        )
        .fetch_optional(&self.pool)
        .await
        .context("failed to fetch user by id")?;

        row.map(TryInto::try_into).transpose()
    }

    async fn find_by_email(&self, email: &Email) -> Result<Option<User>, anyhow::Error> {
        let row = sqlx::query_as!(
            UserRow,
            "SELECT id, email, name, password_hash, created_at FROM users WHERE email = $1",
            email.as_str(),
        )
        .fetch_optional(&self.pool)
        .await
        .context("failed to fetch user by email")?;

        row.map(TryInto::try_into).transpose()
    }
}
}

This file and the wiring in main.rs (which uses PgPool and sqlx::migrate!) are the only places in our entire application that import sqlx. I want to emphasize that because it’s the whole point: the domain and API layers never see SQLx types. The handler never touches a UserRow or a sqlx::Error. Those details stay locked inside the infrastructure layer where they belong. If you ever decide to swap Postgres for something else (unlikely, but it happens), you’d change this file and the domain wouldn’t even blink.

Step 6: DTOs and the AppError

The request and response types live in src/api/dtos/user_dto.rs. These are the shapes that the outside world sees, and they’re deliberately separate from our domain types:

#![allow(unused)]
fn main() {
use serde::{Deserialize, Serialize};
use validator::Validate;

/// What the client sends. Raw strings, validated by the validator crate.
#[derive(Debug, Deserialize, Validate)]
pub struct CreateUserDto {
    #[validate(length(min = 1, max = 100))]
    pub name: String,

    #[validate(email)]
    pub email: String,

    #[validate(length(min = 8))]
    pub password: String,
}

/// What the client receives. No password_hash, no internal IDs.
#[derive(Debug, Serialize)]
pub struct UserResponse {
    pub id: uuid::Uuid,
    pub name: String,
    pub email: String,
    pub created_at: chrono::DateTime<chrono::Utc>,
}

impl From<crate::domain::models::User> for UserResponse {
    fn from(user: crate::domain::models::User) -> Self {
        Self {
            id: *user.id().as_uuid(),
            name: user.name().as_str().to_string(),
            email: user.email().as_str().to_string(),
            created_at: user.created_at(),
        }
    }
}
}

Then we have the AppError in src/error.rs, which is the glue between our domain errors and HTTP. This might look like boilerplate, and honestly it kind of is, but it gives you a single place where you decide “this business error becomes that HTTP status code”:

#![allow(unused)]
fn main() {
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use axum::Json;

#[derive(Debug, thiserror::Error)]
pub enum AppError {
    #[error("not found")]
    NotFound,
    #[error("{0}")]
    Validation(String),
    #[error("unauthorized")]
    Unauthorized,
    #[error("{0}")]
    Conflict(String),
    #[error(transparent)]
    Internal(#[from] anyhow::Error),
}

pub type AppResult<T> = Result<T, AppError>;

impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        let (status, message) = match &self {
            Self::NotFound => (StatusCode::NOT_FOUND, self.to_string()),
            Self::Validation(msg) => (StatusCode::BAD_REQUEST, msg.clone()),
            Self::Unauthorized => (StatusCode::UNAUTHORIZED, self.to_string()),
            Self::Conflict(msg) => (StatusCode::CONFLICT, msg.clone()),
            Self::Internal(e) => {
                tracing::error!(error = ?e, "internal error");
                (StatusCode::INTERNAL_SERVER_ERROR, "internal error".into())
            }
        };
        let body = serde_json::json!({ "error": { "message": message } });
        (status, Json(body)).into_response()
    }
}

// Domain error -> AppError conversion
impl From<crate::domain::errors::CreateUserError> for AppError {
    fn from(err: crate::domain::errors::CreateUserError) -> Self {
        use crate::domain::errors::CreateUserError;
        match err {
            CreateUserError::Duplicate { email } => {
                AppError::Conflict(format!("user with email {} already exists", email.as_str()))
            }
            CreateUserError::Unknown(e) => AppError::Internal(e),
        }
    }
}
}

The handler returns AppResult<T>, and when it contains an Err, Axum calls into_response() to produce the HTTP error. What I like about this pattern is that the handler itself never has to think about status codes for error cases. It just uses ? and the From impls take care of the rest.

Step 7: The handler

The handler lives in src/api/handlers/users.rs. If you’ve been reading carefully, you might expect this to be the simplest file so far. And it is. The pattern is: extract, parse, delegate, respond.

#![allow(unused)]
fn main() {
use axum::{extract::State, http::StatusCode, Json};

use crate::api::dtos::user_dto::{CreateUserDto, UserResponse};
use crate::api::extractors::ValidatedJson;
use crate::api::state::AppState;
use crate::domain::models::{UserName, Email};
use crate::error::{AppError, AppResult};

pub async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    // Parse raw strings into domain types.
    // If parsing fails, it becomes a validation error.
    let name = UserName::parse(&payload.name)
        .map_err(AppError::Validation)?;
    let email = Email::parse(&payload.email)
        .map_err(AppError::Validation)?;

    // Delegate to the service. The ? operator converts
    // CreateUserError -> AppError automatically via the From impl.
    let user = state.user_service
        .register(name, email, &payload.password)
        .await?;

    // Convert the domain entity to a response DTO.
    Ok((StatusCode::CREATED, Json(user.into())))
}
}

Fourteen lines of actual logic. That’s it. No SQL, no password hashing, no business rules. It extracts the request, parses the fields into domain types, calls the service, and returns the response. If anything fails, the ? operator propagates the error through the From chain until it becomes an HTTP response. I find that when handlers stay this thin, they’re almost impossible to get wrong, and the code review basically writes itself.

Step 8: AppState and wiring

The AppState in src/api/state.rs holds all the shared dependencies. Think of it as the bag of stuff our handlers need to do their work:

#![allow(unused)]
fn main() {
use axum::extract::FromRef;
use sqlx::PgPool;
use std::sync::Arc;

use crate::config::Config;
use crate::domain::services::UserService;
use crate::infra::repositories::PostgresUserRepo;

#[derive(Clone, FromRef)]
pub struct AppState {
    pub config: Arc<Config>,
    pub db: PgPool,
    pub user_service: UserService<PostgresUserRepo>,
}
}

And then main.rs wires everything together. This is the moment where all our abstractions meet the real world:

use std::sync::Arc;

use secrecy::ExposeSecret;

use crate::api::state::AppState;
use crate::config::Config;
use crate::infra::repositories::PostgresUserRepo;
use crate::domain::services::UserService;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let config = Config::from_env()?;
    init_tracing(&config.log_level);

    // Infrastructure: create the database pool.
    let pool = create_pool(config.database_url.expose_secret()).await?;
    sqlx::migrate!("./migrations").run(&pool).await?;

    // Infrastructure: create the repository (implements the port).
    let user_repo = PostgresUserRepo::new(pool.clone());

    // Domain: create the service, injecting the repository.
    let user_service = UserService::new(user_repo);

    // API: assemble the state and router.
    let state = AppState {
        config: Arc::new(config.clone()),
        db: pool,
        user_service,
    };

    let app = api::routes::router(state);

    let addr = format!("0.0.0.0:{}", config.port);
    let listener = tokio::net::TcpListener::bind(&addr).await?;
    tracing::info!("listening on {}", listener.local_addr()?);
    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal())
        .await?;

    Ok(())
}

This is our composition root: the one place in the application where we choose concrete types and wire them together. The domain service receives a PostgresUserRepo, but it only knows it as “something that implements UserRepository.” If you ever wanted to swap in a different database (or even a different backing store entirely), you’d change this file and the infra/ module. The domain and API layers wouldn’t need to change at all, which is a really nice property to have when your application starts growing.

The complete data flow

Let’s trace through the whole thing end to end. Here’s what happens when a client sends POST /api/v1/users:

Client sends:  { "name": "Alice", "email": "alice@example.com", "password": "secret123" }

  1. Axum deserializes into CreateUserDto (api/dtos)
  2. ValidatedJson checks: name length, email format, password length
  3. Handler parses fields into domain types: UserName, Email
  4. Handler calls user_service.register(name, email, &password)
  5. Service hashes password on blocking thread
  6. Service builds NewUser and calls repo.create(&new_user)
  7. Repository executes INSERT via sqlx::query_as!
  8. If email is duplicate: sqlx returns unique violation
     → Repository maps to CreateUserError::Duplicate
     → Handler's ? maps to AppError::Conflict
     → Axum returns 409 with error message
  9. If successful: repository gets UserRow back from RETURNING clause
     → TryFrom<UserRow> converts to User (domain entity)
  10. Service returns User to handler
  11. Handler converts User to UserResponse via From trait
  12. Axum serializes to JSON and returns 201 Created

Client receives:  { "id": "...", "name": "Alice", "email": "alice@example.com", "created_at": "..." }

Every boundary is explicit, and I think that’s what makes this approach worth the extra files. The handler works with DTOs and domain types. The service works with domain types and ports. The repository works with SQL and row types. And the From / TryFrom implementations bridge the gaps. No layer reaches into another layer’s internals. If that sounds like a lot of ceremony for one endpoint, you’re not wrong. But once you’ve done it once, every subsequent feature follows the same shape, and the consistency pays for itself quickly.

Scaling up: bigger features, more services

As your application grows, you’ll add more features. Each one follows the same pattern: domain types, a port if needed, a service, a repository, DTOs, and a handler. Once you’ve built two or three features this way, the structure becomes second nature.

At some point you’ll notice that the flat models/, services/, and repositories/ directories start getting crowded. When that happens, you can regroup by feature instead:

src/domain/
├── users/
│   ├── mod.rs
│   ├── model.rs        # User, UserId, Email, UserName, NewUser
│   ├── errors.rs       # CreateUserError
│   ├── port.rs         # UserRepository trait
│   └── service.rs      # UserService
├── posts/
│   ├── mod.rs
│   ├── model.rs
│   ├── errors.rs
│   ├── port.rs
│   └── service.rs

Both layouts follow the same principles we’ve been using throughout this chapter. The choice between them is about file count and how easy it is to find things, not about architecture. My advice: start flat, and regroup by feature when navigating the flat directories starts to annoy you. In my experience that tipping point is somewhere around 5 to 10 domain concepts, but you’ll feel it when you get there.
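Under the feature layout, each feature’s mod.rs is mostly re-exports, so the rest of the crate can keep importing from crate::domain::users as one unit. A sketch of what users/mod.rs might look like under that assumption:

```rust
// src/domain/users/mod.rs
mod errors;
mod model;
mod port;
mod service;

pub use errors::CreateUserError;
pub use model::{Email, NewUser, User, UserId, UserName};
pub use port::UserRepository;
pub use service::UserService;
```

The re-exports mean that moving from the flat layout to the feature layout barely touches the use statements elsewhere in the crate.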

Error Handling

If you’ve worked on a web service for any length of time, you know that error handling is really about two completely different audiences. Your API consumers need clear feedback when something goes wrong: a proper HTTP status code and a message that actually helps them fix the problem. Your team, on the other hand, needs the full picture: stack traces, error chains, and enough context to track down the bug. The trick is serving both without leaking internal details to the outside world.

Axum makes an interesting design choice here. It requires that all handlers be infallible at the framework level, meaning they must always produce a valid HTTP response, even when things go sideways. In practice, you do this by returning Result<T, E> where E implements IntoResponse. When your handler returns Err(e), Axum calls e.into_response() to produce the error response. This gives us full control over how errors look to the client, which is exactly what we want.

The AppError pattern

The idea is straightforward: we create one central error enum that covers all the error cases our API can produce. Every handler returns Result<T, AppError>, and the IntoResponse implementation on AppError takes care of mapping each variant to the right HTTP status code and response body. Let’s look at what this looks like in practice.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use axum::http::StatusCode;
use axum::response::{IntoResponse, Response};
use axum::Json;

#[derive(Debug, thiserror::Error)]
pub enum AppError {
    #[error("resource not found")]
    NotFound,

    #[error("{0}")]
    Validation(String),

    #[error("validation failed")]
    ValidationFields(HashMap<String, Vec<String>>),

    #[error("authentication required")]
    Unauthorized,

    #[error("insufficient permissions")]
    Forbidden,

    #[error("{0}")]
    Conflict(String),

    #[error(transparent)]
    Internal(#[from] anyhow::Error),
}

impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        let (status, error_type, message) = match &self {
            AppError::NotFound => (
                StatusCode::NOT_FOUND,
                "not_found",
                self.to_string(),
            ),
            AppError::Validation(msg) => (
                StatusCode::BAD_REQUEST,
                "validation_error",
                msg.clone(),
            ),
            AppError::ValidationFields(ref fields) => {
                let body = serde_json::json!({
                    "error": {
                        "type": "validation_error",
                        "message": "request validation failed",
                        "fields": fields,
                    }
                });
                return (StatusCode::BAD_REQUEST, Json(body)).into_response();
            }
            AppError::Unauthorized => (
                StatusCode::UNAUTHORIZED,
                "unauthorized",
                self.to_string(),
            ),
            AppError::Forbidden => (
                StatusCode::FORBIDDEN,
                "forbidden",
                self.to_string(),
            ),
            AppError::Conflict(msg) => (
                StatusCode::CONFLICT,
                "conflict",
                msg.clone(),
            ),
            AppError::Internal(err) => {
                // Log the full error chain for debugging. This is the only
                // place where the real error details are visible.
                tracing::error!(error = ?err, "internal server error");
                (
                    StatusCode::INTERNAL_SERVER_ERROR,
                    "internal_error",
                    "an internal error occurred".to_string(),
                )
            }
        };

        let body = serde_json::json!({
            "error": {
                "type": error_type,
                "message": message,
            }
        });

        (status, Json(body)).into_response()
    }
}
}

The most important piece here is the Internal variant. It wraps anyhow::Error, which can hold any error type along with a chain of context messages. When an internal error happens, we log the full details (the error chain and, if available, a backtrace) but return only a generic message to the client. This is how we prevent information leakage. You really don’t want your database schema details or internal file paths showing up in API responses. I’ve seen that happen in production, and it’s not a fun conversation with the security team.

A type alias for convenience

This might seem like a small thing, but it makes a real difference for readability. We define a type alias to keep our handler signatures clean:

#![allow(unused)]
fn main() {
pub type AppResult<T> = Result<T, AppError>;
}

Now our handlers look like this:

#![allow(unused)]
fn main() {
async fn get_user(
    State(state): State<AppState>,
    Path(id): Path<Uuid>,
) -> AppResult<Json<UserResponse>> {
    let user = state.user_service
        .get_by_id(UserId::from_uuid(id))
        .await?
        .ok_or(AppError::NotFound)?;

    Ok(Json(user.into()))
}
}

Notice how the ? operator just works here. That’s because of the From implementations on AppError, which we’ll look at next.

The error mapping chain

In our layered architecture, errors start at the bottom (the infrastructure layer), pass through the domain layer, and eventually reach the API layer where they become HTTP responses. Each layer has its own error types, and we use From implementations to convert between them.

Here’s what the chain looks like for a “create user” operation:

sqlx::Error
  → CreateUserError (domain)
      → AppError (api)
          → HTTP Response

The infrastructure layer catches the database error and converts it into something the domain understands:

#![allow(unused)]
fn main() {
// In the repository implementation
.map_err(|e| match e {
    sqlx::Error::Database(ref db_err) if db_err.is_unique_violation() => {
        CreateUserError::Duplicate { email: req.email.clone() }
    }
    other => CreateUserError::Unknown(other.into()),
})
}

Then the API layer converts that domain error into an AppError:

#![allow(unused)]
fn main() {
impl From<CreateUserError> for AppError {
    fn from(err: CreateUserError) -> Self {
        match err {
            CreateUserError::Duplicate { email } => {
                AppError::Conflict(
                    format!("a user with email {} already exists", email.as_str())
                )
            }
            CreateUserError::Unknown(inner) => AppError::Internal(inner),
        }
    }
}
}

With these From implementations in place, our handler can just use ? and let the compiler figure out the conversions. No manual mapping, no boilerplate:

#![allow(unused)]
fn main() {
async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    let user = state.user_service.register(payload).await?;  // ? handles the full chain
    Ok((StatusCode::CREATED, Json(user.into())))
}
}

thiserror vs. anyhow

You might wonder why we’re using two different error crates. They actually serve complementary purposes, and in my experience, a well-structured application uses both.

thiserror is for errors that your code needs to match on. It generates Display and Error trait implementations from your enum definitions. We use it for domain errors (CreateUserError, AuthError), for the AppError enum, and for any error type where callers need to distinguish between variants and react differently.

anyhow is for errors that your code just needs to propagate. It gives you a single anyhow::Error type that can hold any error, along with context messages that form a chain explaining what went wrong. We use it for the catch-all Internal variant of AppError, and in infrastructure code where we want to add context without defining a new error variant for every possible failure mode.

The rule of thumb I follow: use thiserror at boundaries where callers need to make decisions based on the error variant, and use anyhow for the “everything else” case where the error just needs to be logged and reported.

Adding context with anyhow

The anyhow::Context trait lets you attach human-readable context to errors as they propagate up the call stack. This is one of those things that seems like extra work when you’re writing the code, but it’s incredibly helpful when you’re debugging. It tells you not just what failed, but why the code was trying to do that thing in the first place.

#![allow(unused)]
fn main() {
use anyhow::Context;

pub async fn create_pool(database_url: &str) -> anyhow::Result<PgPool> {
    PgPoolOptions::new()
        .max_connections(10)
        .acquire_timeout(Duration::from_secs(3))
        .connect(database_url)
        .await
        .context("failed to connect to the database")
}
}

When this error reaches the Internal variant of AppError and gets logged, the output will include both the original error (“connection refused”) and the context (“failed to connect to the database”). That’s the difference between staring at a cryptic error message and immediately knowing where to look.

Rules of thumb

Never call unwrap() or expect() in code that handles HTTP requests. A panic in a handler doesn’t just affect that one request. If a panic happens while holding a mutex or other shared resource, it can poison the resource and cascade to other requests. I’ve seen enough bugs from careless unwrap() calls to be pretty firm on this one. Always return errors through the Result type.

Never expose internal error messages to clients. Database errors, file paths, stack traces, and internal service names are all information that an attacker can use. Log them at the error level for your team, and return a generic “internal error” message to the client.

Use thiserror for structured errors and anyhow for catch-all propagation. Don’t use anyhow as your handler’s return type directly, because you lose the ability to map different errors to different HTTP status codes. Use it inside the AppError::Internal variant instead.

Use #[from] for automatic conversions. The thiserror #[from] attribute generates From implementations that let the ? operator convert errors automatically. This keeps your handler code clean and lets you focus on the happy path.

Add context with .context() liberally in infrastructure code. The small cost of writing a context string pays off enormously when you’re debugging something in production at 2 AM and the error log says “failed to persist user registration” instead of just “connection reset by peer”. Trust me on this one.

Database Layer

If you’ve ever worked on a project where SQL queries were scattered across handler functions, you know how quickly that becomes painful. At first it feels productive: you’re shipping features, things work. But a few months in, you’re hunting across dozens of files to find where a column name needs updating, and your tests require a running database for everything. I’ve been there, and it’s not fun.

In this chapter, we’ll build a clean, well-separated database layer using SQLx, which is the most common choice for Axum applications. The principles apply no matter which database crate you end up using, though.

Connection pool setup

You don’t want every incoming request to open a fresh database connection. That means a TCP handshake, TLS negotiation, and possibly authentication, all before you’ve even run your query. Instead, we maintain a pool of pre-established connections that requests can borrow from and return to.

#![allow(unused)]
fn main() {
use anyhow::Context;
use sqlx::postgres::{PgPool, PgPoolOptions};
use std::time::Duration;

pub async fn create_pool(database_url: &str) -> anyhow::Result<PgPool> {
    let pool = PgPoolOptions::new()
        .max_connections(10)
        .min_connections(2)
        .acquire_timeout(Duration::from_secs(3))
        .idle_timeout(Duration::from_secs(600))
        .connect(database_url)
        .await
        .context("failed to connect to the database")?;

    Ok(pool)
}
}

How many connections should you set? It depends on your concurrency and what the database server can handle. PostgreSQL defaults to 100 maximum connections, and each one takes up memory on the server side. A decent starting point is 2 to 4 connections per CPU core on your application server, then adjust from there based on how much time your requests actually spend waiting on the database versus doing other work. On a typical 4-core instance, that’s 8 to 16 connections. If you’re running multiple instances behind a load balancer, don’t forget that the total connections across all of them still has to stay under the database’s limit.
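If you’d rather compute that heuristic than hard-code it, here’s a sketch using the standard library’s available_parallelism. The multiplier of 3 and the fallback of 4 cores are assumptions to tune, not gospel:

```rust
use std::thread;

/// Heuristic starting point: roughly 3 connections per CPU core,
/// clamped so a big machine doesn't blow past the database's limit.
fn default_pool_size(max_allowed: u32) -> u32 {
    let cores = thread::available_parallelism()
        .map(|n| n.get() as u32)
        .unwrap_or(4); // assume 4 cores if the query fails
    (cores * 3).clamp(2, max_allowed)
}
```

Feed the result into PgPoolOptions::new().max_connections(...), and remember to divide max_allowed by the number of application instances you run.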

Migrations

Our migrations should be versioned, reproducible, and ideally run automatically at startup. SQLx gives us a migration system that reads .sql files from a directory and applies them in order.

migrations/
├── 20240101000000_create_users.sql
├── 20240102000000_create_posts.sql
└── 20240103000000_add_user_bio.sql

What does a well-written migration look like? I try to include sensible defaults for new columns, use UUIDs instead of auto-incrementing integers for primary keys (which makes ID guessing harder, though authorization is still what actually protects your resources), and store timestamps with timezone information:

CREATE TABLE users (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email       VARCHAR(255) UNIQUE NOT NULL,
    name        VARCHAR(100) NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    bio         TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_users_email ON users(email);

-- Automatically update the updated_at timestamp on row modification
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();

Why a database trigger for updated_at? Because it keeps things consistent no matter which code path modifies the row. Even if someone applies a manual SQL fix directly to the database (and believe me, that happens), the timestamp still gets updated correctly.

Schema craftsmanship

The migration above is a starting point. There are a few more techniques that are easy to add now in migrations and really annoying to retrofit later. Let me walk through the ones I’ve found most useful.

Case-insensitive uniqueness. If usernames and emails should be unique regardless of casing, enforce that at the database level. PostgreSQL’s citext extension, or a nondeterministic ICU collation (for example one created with locale 'und-u-ks-level2' and deterministic = false), prevents “Alice” and “alice” from both being accepted. I’ve seen bugs from relying on application-level lowercasing, because direct SQL inserts or other services can bypass it entirely.

Reusable trigger helpers. If multiple tables need updated_at behavior, extract the trigger function once and reuse it across tables. That’s exactly what the update_updated_at_column() function above does. Just apply the same trigger to every table that has an updated_at column.

Appropriate index types. For array columns like tags, a GIN index enables fast containment queries (@>, &&). For text search columns, consider tsvector with a GIN index. Standard B-tree indexes are the right choice for equality and range queries on scalar columns.

Deliberate ID type choices. UUIDs are what I recommend by default in this guide, but there are situations where sequential IDs (bigserial) work better, like external specs that expect integers or high-write tables where UUID randomness causes B-tree page splits. UUIDv7 is worth a look for write-heavy tables where you want both uniqueness and index-friendly ordering, since it embeds a timestamp and sorts chronologically. PostgreSQL 18+ supports uuidv7() natively. For PostgreSQL 17 and earlier, you can generate UUIDv7 values in the application layer using the uuid crate with the v7 feature, or use a database extension.

If you want to see these patterns applied to a real application, the launchbadge/realworld-axum-sqlx repository has great migration commentary worth reading through.

Running migrations

I like to run migrations at application startup:

#![allow(unused)]
fn main() {
sqlx::migrate!("./migrations")
    .run(&pool)
    .await
    .context("failed to run database migrations")?;
}

The migrate! macro embeds the migration files into your binary at compile time, so you don’t need to ship the migration directory alongside your application. One less thing to think about during deployment.

Compile-time query checking in CI

Here’s something that might trip you up: SQLx’s query! and query_as! macros validate your SQL against a real database at compile time. That means your CI environment needs database access during builds. There are two ways to handle this.

The approach I’d recommend for CI is offline mode. You run cargo sqlx prepare locally (which connects to a running database and generates a .sqlx/ directory with cached query metadata), then check that directory into version control. In CI, set SQLX_OFFLINE=true and the macros will use the cached metadata instead of connecting to a database. Add cargo sqlx prepare --check as a CI step to verify that the cached metadata stays in sync with your queries and migrations.

The alternative is to spin up a database in CI (via Docker or a CI service) and set DATABASE_URL so the macros can connect directly. It’s simpler to set up but slower and more fragile.

A note on migrations at startup

Running migrations at startup works well for single-instance deployments and during development. But if you’re running multiple instances in production, you should know that all of them will try to run migrations at the same time on startup. SQLx handles this with an advisory lock, so only one instance actually runs them, but in some scenarios (particularly with rolling deployments) a dedicated migration job or Kubernetes init container gives you more control and clearer error handling. We’ll dig into this more in the Deployment chapter.

The repository pattern

This is one of my favorite patterns for keeping database logic organized. A repository is just a struct that owns all database access for a particular domain concept. It holds a reference to the connection pool, and its methods map to the things you actually need to do: creating records, finding them by various criteria, updating, and deleting.

#![allow(unused)]
fn main() {
#[derive(Clone)]
pub struct UserRepository {
    pool: PgPool,
}

impl UserRepository {
    pub fn new(pool: PgPool) -> Self {
        Self { pool }
    }

    pub async fn create(&self, req: &CreateUserRequest) -> Result<User, CreateUserError> {
        let row = sqlx::query_as!(
            UserRow,
            r#"
            INSERT INTO users (id, email, name, password_hash)
            VALUES ($1, $2, $3, $4)
            RETURNING id, email, name, password_hash, bio, created_at, updated_at
            "#,
            Uuid::new_v4(),
            req.email.as_str(),
            req.name.as_str(),
            &req.password_hash,
        )
        .fetch_one(&self.pool)
        .await
        .map_err(|e| match e {
            sqlx::Error::Database(ref db_err) if db_err.is_unique_violation() => {
                CreateUserError::Duplicate { email: req.email.clone() }
            }
            other => CreateUserError::Unknown(other.into()),
        })?;

        row.try_into()
            .map_err(|e: anyhow::Error| CreateUserError::Unknown(e))
    }

    pub async fn find_by_id(&self, id: &UserId) -> anyhow::Result<Option<User>> {
        let row = sqlx::query_as!(
            UserRow,
            r#"
            SELECT id, email, name, password_hash, bio, created_at, updated_at
            FROM users
            WHERE id = $1
            "#,
            id.as_uuid(),
        )
        .fetch_optional(&self.pool)
        .await
        .context("failed to fetch user by id")?;

        row.map(TryInto::try_into).transpose()
    }

    pub async fn find_by_email(&self, email: &Email) -> anyhow::Result<Option<User>> {
        let row = sqlx::query_as!(
            UserRow,
            r#"
            SELECT id, email, name, password_hash, bio, created_at, updated_at
            FROM users
            WHERE email = $1
            "#,
            email.as_str(),
        )
        .fetch_optional(&self.pool)
        .await
        .context("failed to fetch user by email")?;

        row.map(TryInto::try_into).transpose()
    }

    pub async fn update(
        &self,
        id: &UserId,
        name: Option<&UserName>,
        bio: Option<&str>,
    ) -> anyhow::Result<Option<User>> {
        let row = sqlx::query_as!(
            UserRow,
            r#"
            UPDATE users
            SET name = COALESCE($2, name),
                bio = COALESCE($3, bio)
            WHERE id = $1
            RETURNING id, email, name, password_hash, bio, created_at, updated_at
            "#,
            id.as_uuid(),
            name.map(|n| n.as_str()),
            bio,
        )
        .fetch_optional(&self.pool)
        .await
        .context("failed to update user")?;

        row.map(TryInto::try_into).transpose()
    }
}
}

Let me highlight a few patterns worth noticing here.

Compile-time checked queries. The query_as! macro verifies your SQL against the actual database schema at compile time and generates its own type mapping from the query result columns to the struct fields. It doesn’t use the FromRow trait; the mapping is built directly by the macro based on column names and types. If you rename a column in a migration but forget to update the query, the build fails. I find this to be one of SQLx’s most useful features. (If you prefer the non-macro query_as::<_, UserRow>(sql) function, that one does use FromRow at runtime.)
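
For comparison, the runtime-checked variant of the find_by_id lookup might look like this (a sketch, not part of the repository above; this form does go through FromRow and only validates the SQL when the query actually runs):

```rust
// Runtime-checked variant: the SQL is just a string, so typos and
// schema drift surface at runtime, not at compile time. The row
// mapping uses UserRow's FromRow derive.
let row: Option<UserRow> = sqlx::query_as::<_, UserRow>(
    "SELECT id, email, name, password_hash, bio, created_at, updated_at \
     FROM users WHERE id = $1",
)
.bind(id.as_uuid())
.fetch_optional(&self.pool)
.await?;
```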

Separate row types. The UserRow struct maps directly to the database columns. Deriving sqlx::FromRow isn’t required for query_as!, but it’s harmless to include and useful if you also use the non-macro query functions elsewhere. The User domain type has newtypes for its fields and possibly different field names. The TryFrom<UserRow> for User conversion bridges the gap, as we covered in the Domain Modeling chapter. We use TryFrom rather than From because stored data might not satisfy current domain invariants.
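
To make the shape of that bridge concrete, here is a minimal sketch with simplified stand-in types (the real UserRow and User from the Domain Modeling chapter also carry id, password hash, and timestamp fields):

```rust
// Simplified stand-in for the database row type.
struct UserRow {
    email: String,
    name: String,
}

#[derive(Debug, PartialEq)]
struct Email(String);

impl Email {
    // Toy invariant; the real Email::parse lives in the domain layer.
    fn parse(s: &str) -> Result<Self, String> {
        if s.contains('@') {
            Ok(Email(s.to_string()))
        } else {
            Err(format!("invalid email: {s}"))
        }
    }
}

#[derive(Debug, PartialEq)]
struct User {
    email: Email,
    name: String,
}

// TryFrom rather than From: a row written under older rules may no
// longer satisfy the domain's current invariants.
impl TryFrom<UserRow> for User {
    type Error = String;

    fn try_from(row: UserRow) -> Result<Self, Self::Error> {
        Ok(User {
            email: Email::parse(&row.email)?,
            name: row.name,
        })
    }
}
```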

fetch_optional for lookups. When querying by ID or email, reach for fetch_optional instead of fetch_one. A missing record is a normal case, not an error, and returning Option<User> lets the calling code decide whether “not found” is actually a problem in that particular context.

COALESCE for partial updates. The COALESCE SQL function keeps existing values when the update parameter is NULL, which lets you implement partial updates without needing to read the current values first. There’s a limitation worth knowing about, though: this pattern can’t distinguish between “the client sent null to clear this field” and “the client didn’t include this field.” Both arrive as None at the Rust level, and both get treated as “keep the existing value.” If your API needs to support explicitly clearing optional fields, you’ll need either a sentinel value, a separate “fields to clear” parameter, or a more explicit SQL construction.
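
One way to sketch that "explicitly cleared vs. simply absent" distinction is a hypothetical three-state patch type (in practice this is often modeled as Option<Option<T>> with a serde default instead, but the enum makes the three cases explicit):

```rust
// Hypothetical three-state field for PATCH-style updates.
enum Patch<T> {
    Missing,  // field absent from the request: keep the current value
    Null,     // field present as explicit null: clear the value
    Value(T), // field present with a value: overwrite
}

fn apply<T>(current: Option<T>, patch: Patch<T>) -> Option<T> {
    match patch {
        Patch::Missing => current,
        Patch::Null => None,
        Patch::Value(v) => Some(v),
    }
}
```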

Parameterized queries everywhere. Every user-supplied value is passed as a bound parameter ($1, $2, and so on), never spliced into the SQL string. With the query! and query_as! macros, the arguments after the SQL become bound parameters automatically; with the runtime query functions, you bind them explicitly via .bind(). Either way, SQL injection is prevented by design.

The FromRef pattern for ergonomic access

You might find yourself extracting the full AppState in every handler just to reach the database pool or a repository. That gets tedious fast. The FromRef pattern offers a cleaner way, letting you extract sub-components of your state directly.

#![allow(unused)]
fn main() {
use axum::extract::FromRef;

#[derive(Clone)]
pub struct AppState {
    pub db: PgPool,
    pub user_repo: UserRepository,
    // ... other fields
}

impl FromRef<AppState> for UserRepository {
    fn from_ref(state: &AppState) -> Self {
        state.user_repo.clone()
    }
}

impl FromRef<AppState> for PgPool {
    fn from_ref(state: &AppState) -> Self {
        state.db.clone()
    }
}
}

Now our handlers can extract just the repository they need:

#![allow(unused)]
fn main() {
async fn get_user(
    State(repo): State<UserRepository>,
    Path(id): Path<Uuid>,
) -> AppResult<Json<UserResponse>> {
    let user = repo.find_by_id(&UserId::from_uuid(id))
        .await?
        .ok_or(AppError::NotFound)?;
    Ok(Json(user.into()))
}
}

I like this because it keeps handler signatures focused on what the handler actually uses, instead of pulling in the entire application state every time.

Trait-based repositories for testability

For simpler cases, a concrete UserRepository struct with SQLx calls works perfectly fine. But if you want to unit-test your service layer without spinning up a database, you can define the repository as a trait and provide both a real implementation and a test one.

We’ll cover this in detail in the Architectural Patterns chapter. The short version: define a trait in your domain layer, implement it with SQLx in your infrastructure layer, and use generics or trait objects in your service layer. Your tests can then provide a simple in-memory implementation that just stores data in a HashMap.
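
A minimal sketch of that shape (with synchronous methods and simplified id/name types for brevity; the real trait would be async, like the SQLx repository above):

```rust
use std::collections::HashMap;

// Domain-side port. Sketched synchronously for brevity; the real
// version would use async methods like the SQLx implementation.
trait UserRepo {
    fn insert(&mut self, id: u64, name: &str);
    fn find_name(&self, id: u64) -> Option<String>;
}

// Test double: stores everything in a HashMap, no database required.
#[derive(Default)]
struct InMemoryUserRepo {
    users: HashMap<u64, String>,
}

impl UserRepo for InMemoryUserRepo {
    fn insert(&mut self, id: u64, name: &str) {
        self.users.insert(id, name.to_string());
    }

    fn find_name(&self, id: u64) -> Option<String> {
        self.users.get(&id).cloned()
    }
}

// A service generic over the port runs unchanged against either
// the SQLx implementation or the in-memory one.
fn greet<R: UserRepo>(repo: &R, id: u64) -> String {
    match repo.find_name(id) {
        Some(name) => format!("hello, {name}"),
        None => "who?".to_string(),
    }
}
```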

Transactions

When you have multiple database writes that need to succeed or fail together, wrap them in a transaction:

#![allow(unused)]
fn main() {
pub async fn transfer_credits(
    &self,
    from: &UserId,
    to: &UserId,
    amount: i64,
) -> anyhow::Result<()> {
    let mut tx = self.pool.begin().await.context("failed to begin transaction")?;

    sqlx::query!(
        "UPDATE accounts SET credits = credits - $1 WHERE user_id = $2",
        amount,
        from.as_uuid(),
    )
    .execute(&mut *tx)
    .await
    .context("failed to debit source account")?;

    sqlx::query!(
        "UPDATE accounts SET credits = credits + $1 WHERE user_id = $2",
        amount,
        to.as_uuid(),
    )
    .execute(&mut *tx)
    .await
    .context("failed to credit target account")?;

    tx.commit().await.context("failed to commit transfer")?;

    Ok(())
}
}

If any step fails (or the function returns early via ?), the transaction is automatically rolled back when the tx variable is dropped. No cleanup code needed.

If you want the entire handler to run inside a single transaction that commits on success and rolls back on error, take a look at the axum-sqlx-tx crate. It provides an extractor that handles this for you, which is really convenient for request-scoped transactions.

Choosing a database crate

You might wonder which database crate to use. The Rust ecosystem has several solid options, each with different trade-offs.

SQLx is what most people reach for with Axum. It’s async-native, supports compile-time query checking, and works with raw SQL. You write the SQL yourself, which gives you full control and avoids the impedance mismatch that ORMs sometimes introduce. The compile-time checking catches column type mismatches, missing columns, and syntax errors before your code ever runs.

Diesel is the oldest and most mature Rust ORM. It gives you a type-safe query builder that catches errors at compile time through the type system rather than by connecting to the database. Diesel was traditionally synchronous, but the diesel-async crate provides async support. It’s a good choice if you prefer a query builder over raw SQL and want strong compile-time guarantees.

SeaORM takes an ActiveRecord-inspired approach with async support built in. It generates entity types from your database schema and provides a high-level query API. It’s well-suited for rapid development and projects where writing raw SQL feels like overkill, though you get less fine-grained control than with SQLx.

For most new Axum projects, I’d go with SQLx as the default. It’s a natural fit with Axum’s philosophy of being explicit and composable rather than magical, and the compile-time query checking becomes a real productivity boost once you get used to it.

State Management

If you’ve built even a small web service, you know the problem: your handlers need access to shared things like a database pool, configuration, service instances, maybe an in-memory cache. Every request needs some or all of that, and you need a way to get it there without passing it through seventeen function arguments. Axum gives us a type-safe way to handle this through the State extractor, and once you get the hang of it, it becomes second nature.

The AppState struct

The pattern I reach for every time is a single struct that holds everything our handlers need. We pass it to the router via .with_state(), and Axum takes care of the rest. One thing to keep in mind: the struct has to implement Clone, because Axum clones it for each handler invocation.

#![allow(unused)]
fn main() {
use sqlx::PgPool;
use std::sync::Arc;

#[derive(Clone)]
pub struct AppState {
    pub config: Arc<Config>,
    pub db: PgPool,
    pub user_service: UserService<PostgresUserRepo>,
    pub post_service: PostService<PostgresPostRepo>,
}

impl AppState {
    pub fn new(config: Config, pool: PgPool) -> Self {
        let user_repo = PostgresUserRepo::new(pool.clone());
        let post_repo = PostgresPostRepo::new(pool.clone());

        Self {
            config: Arc::new(config),
            db: pool,
            user_service: UserService::new(user_repo),
            post_service: PostService::new(post_repo),
        }
    }
}
}

Let me walk through a few design decisions here.

We wrap Config in Arc because it’s read-only after startup and could be fairly large. Arc gives us cheap clones (it just bumps a reference count) instead of deep-copying the whole config struct every time AppState gets cloned. That difference matters when you’re handling thousands of requests per second.

PgPool is already reference-counted internally, so cloning it is cheap too. You’ll find that most connection pool types in the Rust ecosystem work the same way.

The services and repositories get constructed once at startup, then shared across all requests. I think of this as the composition root of our application, the one place where we wire all the dependencies together. If you need to swap in a test double for a repository, this is where you’d do it.

Extracting state in handlers

So how do our handlers actually get at the state? Through the State extractor:

#![allow(unused)]
fn main() {
async fn list_users(
    State(state): State<AppState>,
) -> AppResult<Json<Vec<UserResponse>>> {
    let users = state.user_service.list_all().await?;
    Ok(Json(users.into_iter().map(Into::into).collect()))
}
}

Pretty straightforward, right? But you might notice that every handler receives the entire AppState, even if it only touches one field. For a small application, that’s totally fine. Once things grow, though, you’ll probably want something more granular.

Sub-state with FromRef

This is where FromRef comes in. It lets you extract individual fields from your state directly, so the handler doesn’t need to know about the full AppState struct at all. I find this really helpful for keeping handler signatures focused and limiting what each handler can actually reach.

Axum gives us a derive macro that generates the FromRef implementations for each field automatically:

#![allow(unused)]
fn main() {
use axum::extract::FromRef;

#[derive(Clone, FromRef)]
pub struct AppState {
    config: Arc<Config>,
    db: PgPool,
    user_service: UserService<PostgresUserRepo>,
    post_service: PostService<PostgresPostRepo>,
}
}

That one derive saves us from writing a manual impl FromRef<AppState> for ... block for every single field. It generates an implementation for each field’s type, which carries one requirement worth knowing: every field must have a distinct type. If two fields share a type (say, two PgPool fields), the generated impls conflict and the code won’t compile; the usual fix is to wrap one of them in a newtype.

Now a handler that only needs the user service can ask for just that:

#![allow(unused)]
fn main() {
async fn get_user(
    State(user_service): State<UserService<PostgresUserRepo>>,
    Path(id): Path<Uuid>,
) -> AppResult<Json<UserResponse>> {
    let user = user_service.get_by_id(UserId::from_uuid(id))
        .await?
        .ok_or(AppError::NotFound)?;
    Ok(Json(user.into()))
}
}

This pattern plays nicely with custom extractors too. For example, our AuthUser extractor (which we’ll build in the Authentication chapter) can pull the JWT secret from the state via FromRef without knowing about any other fields. Clean separation.

State vs. Extension

You might wonder why Axum has both State and Extension for passing data to handlers. I’ve seen people reach for Extension out of habit (maybe from other frameworks), so let me explain when to use which.

State is type-checked at compile time. If you forget to call .with_state(), or if the types don’t match, the compiler catches it. That alone makes it my default choice.

Extension stores values in a type-map on the request itself. It’s only checked at runtime, which means a mismatch shows up as a 500 error when an actual request hits the handler. Not great. But extensions are genuinely useful for values that middleware attaches on a per-request basis, like an authenticated user extracted from a JWT.

So here’s how I think about it: use State for application-wide dependencies that you set up at startup (config, database pools, services). Use Extension for per-request data that middleware produces on the fly.
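
Here is a sketch of the Extension side of that split, assuming a hypothetical CurrentUser type attached by an auth middleware (the actual token verification is elided):

```rust
use axum::{extract::Request, middleware::Next, response::Response, Extension};

// Hypothetical per-request value produced by middleware, not startup state.
#[derive(Clone)]
struct CurrentUser {
    id: String,
}

async fn auth_middleware(mut req: Request, next: Next) -> Response {
    // ... verify the bearer token and load the user here (elided) ...
    let user = CurrentUser { id: "user-123".to_string() };

    // Attach the value to this request's extensions only.
    req.extensions_mut().insert(user);
    next.run(req).await
}

// Downstream handlers read it back out with the Extension extractor.
async fn whoami(Extension(user): Extension<CurrentUser>) -> String {
    format!("you are {}", user.id)
}
```

The middleware would be attached with axum::middleware::from_fn(auth_middleware) as a layer, and any route inside that layer can then extract Extension<CurrentUser>.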

Avoiding common pitfalls

Before we move on, let me share a few things that have bitten me (or people on my team) in production.

Don’t put Mutex<T> in your state unless you’ve really thought it through. I’ve seen enough bugs from this one. A std::sync::Mutex guard held across an .await point can deadlock or cause brutal contention in async code. If you need shared mutable state (like an in-memory cache), reach for tokio::sync::RwLock, which is async-aware, or better yet use a crate like moka or dashmap that’s built for concurrent access.
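
If you do need a small shared cache in state, a sketch along these lines (using tokio’s async-aware lock and keeping the guard’s lifetime short) avoids the worst of it:

```rust
use std::{collections::HashMap, sync::Arc};
use tokio::sync::RwLock;

#[derive(Clone)]
struct AppState {
    // Arc so clones of AppState share one cache; tokio's RwLock so
    // acquiring the lock yields to the executor instead of blocking
    // the worker thread.
    cache: Arc<RwLock<HashMap<String, String>>>,
}

async fn cached_lookup(state: &AppState, key: &str) -> Option<String> {
    // Clone the value out so the read guard is dropped at the end of
    // this expression, never held across a later .await point.
    state.cache.read().await.get(key).cloned()
}
```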

Be intentional about what goes into AppState. It’s tempting to keep tossing fields in there as the application grows, but a state struct with dozens of fields gets hard to reason about fast. If your state is getting unwieldy, that’s usually a signal that your app could benefit from splitting into more focused modules, each with their own state type that you compose together at the router level.

Construct your state once, at startup. Don’t create new service or repository instances inside handlers. The whole point of the AppState pattern is that we wire dependencies together in one place (our main function or a setup function it calls) and then share them immutably across all requests. Once you internalize that, the structure of your application becomes much easier to follow.

Configuration

I’ve seen enough configuration bugs to last a lifetime. A database URL that works on your laptop but not in staging. A JWT secret that ends up in a git commit. An environment variable that’s missing, but you only find out ten minutes after deployment when the first request hits a code path that needs it. These are all avoidable, and the strategy we’ll use here prevents every single one of them.

The idea is straightforward: pull configuration from the environment, validate it eagerly at startup, and keep sensitive values protected in memory. Let’s look at how that works in practice.

A strongly-typed config struct

We start with a Rust struct that represents all the configuration our application needs. This is one of those places where Rust’s type system is a natural fit: you get compile-time guarantees about which fields exist and what types they have. And you can validate values when the struct is constructed, instead of hoping they’re correct when something finally tries to use them.

#![allow(unused)]
fn main() {
use secrecy::{SecretString, ExposeSecret};
use serde::Deserialize;

#[derive(Debug, Clone, Deserialize)]
pub struct Config {
    #[serde(default = "default_port")]
    pub port: u16,

    #[serde(default = "default_environment")]
    pub environment: Environment,

    pub database_url: SecretString,

    pub jwt_secret: SecretString,

    #[serde(default = "default_log_level")]
    pub log_level: String,

    #[serde(default)]
    pub cors_origins: Vec<String>,
}

#[derive(Debug, Deserialize, Clone, PartialEq)]
#[serde(rename_all = "lowercase")]
pub enum Environment {
    Development,
    Staging,
    Production,
}

fn default_port() -> u16 { 3000 }
fn default_environment() -> Environment { Environment::Development }
fn default_log_level() -> String { "info".to_string() }
}

There are a few deliberate choices here that are worth walking through.

Notice that database_url and jwt_secret use SecretString from the secrecy crate instead of plain String. This does two things for us: the values get automatically redacted when you print the struct with Debug (you’ll see [REDACTED] instead of the actual secret), and the memory is zeroed when the value is dropped. I’ve seen production secrets end up in log files more times than I’d like to admit, so this kind of protection is really helpful.

The Environment enum is a proper type rather than a string. You can match on it exhaustively, and the compiler will tell you if you forget to handle a variant. It also catches typos like “prodduction” at parse time instead of letting them slip through silently.
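
For instance (repeating the enum without its serde derives for brevity):

```rust
#[derive(Debug, Clone, PartialEq)]
enum Environment {
    Development,
    Staging,
    Production,
}

// Exhaustive match: adding a new variant later forces the compiler to
// point at every call site like this one until it handles the new case.
fn is_production(env: &Environment) -> bool {
    match env {
        Environment::Production => true,
        Environment::Development | Environment::Staging => false,
    }
}
```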

We provide default values for fields that have sensible defaults. That way you can start the application with minimal configuration during development, while still requiring explicit values for things like database credentials.

Loading from environment variables

The config crate gives us a flexible system for loading configuration from multiple sources and merging them together. For most applications, loading from environment variables is all you need:

#![allow(unused)]
fn main() {
use anyhow::Context;
use secrecy::ExposeSecret;

impl Config {
    pub fn from_env() -> anyhow::Result<Self> {
        // In development, load from a .env file if present.
        // The .ok() is intentional: in production there is no .env file,
        // and that is fine.
        dotenvy::dotenv().ok();

        let config = config::Config::builder()
            .add_source(
                config::Environment::default()
                    .separator("__")
            )
            .build()
            .context("failed to build configuration")?;

        let parsed: Config = config
            .try_deserialize()
            .context("failed to deserialize configuration")?;

        parsed.validate()?;

        Ok(parsed)
    }

    fn validate(&self) -> anyhow::Result<()> {
        if self.jwt_secret.expose_secret().len() < 48 {
            anyhow::bail!(
                "JWT_SECRET must be at least 48 characters for adequate security"
            );
        }
        Ok(())
    }
}
}

The __ separator means you can set nested config values using double-underscore notation in environment variables. So if you had a nested database.max_connections field, you’d set it with DATABASE__MAX_CONNECTIONS.
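
As a sketch of what that nesting might look like (this database section is hypothetical, not part of the Config struct above):

```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct DatabaseConfig {
    pub max_connections: u32,
}

#[derive(Debug, Deserialize)]
pub struct Config {
    pub database: DatabaseConfig,
}

// With the "__" separator configured:
//   DATABASE__MAX_CONNECTIONS=20  →  config.database.max_connections == 20
```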

The validate method runs checks that we can’t express through types alone. Here it makes sure the JWT secret is long enough to resist brute-force attacks. If validation fails, the application exits immediately with a clear error message. That’s way better than starting up fine and then crashing when someone tries to authenticate and the code discovers it has a three-character JWT secret.

The .env file for development

During local development, it’s convenient to store configuration in a .env file so you don’t have to export environment variables manually every time you start the application:

DATABASE_URL=postgres://localhost:5432/myapp_dev
JWT_SECRET=this-is-a-local-dev-secret-that-is-long-enough-for-validation
LOG_LEVEL=debug
ENVIRONMENT=development

This file must be in your .gitignore. Don’t commit it to version control, even if it only contains development secrets. In my experience, the habit of committing .env files always leads to someone accidentally pushing production secrets eventually. Instead, provide a .env.example file that documents which variables are expected:

# Copy this file to .env and fill in the values
DATABASE_URL=postgres://localhost:5432/myapp_dev
JWT_SECRET=<generate a random string of at least 48 characters>
LOG_LEVEL=info
ENVIRONMENT=development

Using secrets safely

The secrecy crate is small, but it pulls a lot of weight. When you wrap a value in SecretString, you get three protections:

  1. Debug redaction. Any code that prints the config struct, whether on purpose or by accident, will see [REDACTED] instead of the actual secret. This matters more than you might think. Structured logging and error reporting can easily serialize the entire config if you’re not careful.

  2. Secure memory zeroing. When the SecretString is dropped, its memory is overwritten with zeros before being deallocated. This shrinks the window during which the secret is sitting around in a memory dump.

  3. Intentional exposure. To actually use the secret value, you have to call .expose_secret(), which returns a reference to the inner string. This makes every secret access explicit and easy to grep for during code review.

#![allow(unused)]
fn main() {
// When you need to use the secret, you explicitly expose it.
// Keep the exposed value's lifetime as short as possible.
let decoding_key = DecodingKey::from_secret(
    config.jwt_secret.expose_secret().as_bytes()
);
}

The rule of thumb: call expose_secret() at the point of use, not earlier. Don’t store the exposed value in a variable that hangs around longer than it needs to, and never log it.

Production deployment

In production, your configuration should come from the deployment platform’s secrets management, not from files. That could be:

  • Kubernetes secrets mounted as environment variables
  • Docker environment blocks in your compose file or orchestrator config
  • Cloud provider secret managers (AWS Secrets Manager, GCP Secret Manager, etc.)

The nice thing about our approach is that the application code doesn’t need to change between environments. It always reads from environment variables. The only difference is how those variables get set: a .env file in development, platform-managed secrets in production.

Fail fast

If there’s one rule I’d drill into every team, it’s this: fail immediately and loudly at startup if any required configuration is missing or invalid. A crash at startup with a clear error message (“JWT_SECRET environment variable is not set”) is so much better than a crash ten minutes later when the first authentication request comes in and you discover there’s no JWT secret.
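
Concretely, that means the very first thing main does is load and validate configuration, bailing out before anything else is constructed (a sketch, assuming the Config::from_env from the previous section):

```rust
use anyhow::Context;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // If anything is missing or invalid, we exit right here with a
    // clear message, before binding a listener or touching the database.
    let config = Config::from_env()
        .context("refusing to start: invalid configuration")?;

    // ... construct AppState, the router, and the TCP listener only
    // after configuration has been fully validated ...
    Ok(())
}
```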

This is exactly why our Config::from_env() function deserializes and validates eagerly. By the time the application starts accepting requests, we know that all configuration is present, correctly typed, and passes validation. There are no latent configuration bugs waiting to surprise you at 2am on a Saturday. With that foundation in place, let’s look at how we handle errors across the rest of our application.

Routing and Handlers

Every HTTP request that hits your application needs a place to land, and in Axum that place is a handler. A handler is just an async function that takes zero or more extractors as arguments and returns something that implements IntoResponse. Axum handles the plumbing for you: it deserializes request data into the extractor types and serializes your return value back into an HTTP response.

The thing I want you to take away from this chapter early on is that handlers should be thin. Their job is to pull data out of the request, hand it off to a service or repository, and turn the result into a response. If you find yourself writing business logic, database queries, or complex branching inside a handler, that code belongs somewhere else. We’ll talk about where exactly in the domain and infrastructure chapters, but for now just remember: thin handlers, always.

Defining routes

Axum’s router uses method chaining, which feels pretty natural once you’ve seen it a couple of times. You define individual routes with .route(), group related routes under a common prefix with .nest(), and combine separate route groups with .merge().

#![allow(unused)]
fn main() {
use axum::{routing::{get, post, put, delete}, Router};

pub fn router(state: AppState) -> Router {
    Router::new()
        .nest("/api/v1", api_routes())
        .route("/health", get(health_check))
        .route("/health/ready", get(readiness_check))
        .with_state(state)
}

fn api_routes() -> Router<AppState> {
    Router::new()
        .nest("/users", user_routes())
        .nest("/posts", post_routes())
}

fn user_routes() -> Router<AppState> {
    Router::new()
        .route("/", get(list_users).post(create_user))
        .route("/{id}", get(get_user).patch(update_user).delete(delete_user))
}

fn post_routes() -> Router<AppState> {
    Router::new()
        .route("/", get(list_posts).post(create_post))
        .route("/{id}", get(get_post).patch(update_post).delete(delete_post))
}
}

I like splitting routes into separate functions (or even separate files) because it keeps the router definition readable as your application grows. Each function returns a Router<AppState>, and the main router composes them together. This is also where you’d apply route-specific middleware, which we’ll cover in the Middleware chapter.

Writing handlers

So what does a well-structured handler actually look like? In my experience, they all follow the same rhythm: extract, delegate, respond.

#![allow(unused)]
fn main() {
use axum::{extract::{State, Path}, http::StatusCode, Json};

async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    let name = UserName::parse(&payload.name)
        .map_err(|e| AppError::Validation(e.to_string()))?;
    let email = Email::parse(&payload.email)
        .map_err(|e| AppError::Validation(e.to_string()))?;

    let user = state.user_service
        .register(name, email, &payload.password)
        .await?;

    Ok((StatusCode::CREATED, Json(user.into())))
}

async fn get_user(
    State(state): State<AppState>,
    Path(id): Path<Uuid>,
) -> AppResult<Json<UserResponse>> {
    let user = state.user_service
        .get_by_id(UserId::from_uuid(id))
        .await?
        .ok_or(AppError::NotFound)?;

    Ok(Json(user.into()))
}

async fn list_users(
    State(state): State<AppState>,
    Query(pagination): Query<PaginationParams>,
) -> AppResult<Json<PaginatedResponse<UserResponse>>> {
    let (users, total) = state.user_service
        .list(pagination.page, pagination.per_page)
        .await?;

    let response = PaginatedResponse {
        data: users.into_iter().map(Into::into).collect(),
        meta: PaginationMeta {
            page: pagination.page,
            per_page: pagination.per_page,
            total,
            total_pages: (total as f64 / pagination.per_page as f64).ceil() as u32,
        },
    };

    Ok(Json(response))
}

async fn delete_user(
    State(state): State<AppState>,
    Path(id): Path<Uuid>,
) -> AppResult<StatusCode> {
    state.user_service
        .delete(UserId::from_uuid(id))
        .await?;

    Ok(StatusCode::NO_CONTENT)
}
}

Take a look at the conventions here. When we successfully create something, we return StatusCode::CREATED (201) with the created resource in the body. A successful deletion returns StatusCode::NO_CONTENT (204) with no body, because there’s nothing left to show. And when a lookup finds nothing, we return AppError::NotFound, which our IntoResponse implementation on AppError turns into a proper 404 response. You might notice how little work each handler actually does. That’s the goal.

Extractors

Extractors are how Axum pulls data out of an incoming request for you. Under the hood, they implement either FromRequest (if they need to consume the request body) or FromRequestParts (if they only need headers, query parameters, path segments, or other metadata without touching the body).

The built-in extractors cover most of what you’ll need day to day:

#![allow(unused)]
fn main() {
use axum::extract::{State, Path, Query, Json};

// Path parameters: /users/{id}
async fn get_user(Path(id): Path<Uuid>) -> ... { }

// Query parameters: /users?page=2&per_page=20
async fn list_users(Query(params): Query<PaginationParams>) -> ... { }

// JSON body
async fn create_user(Json(body): Json<CreateUserDto>) -> ... { }

// Application state
async fn handler(State(state): State<AppState>) -> ... { }
}

You can absolutely use multiple extractors in the same handler. The one constraint to keep in mind is that at most one extractor can consume the request body (like Json), and it has to come last in the argument list. If you mix up the order, the compiler will let you know.

The #[debug_handler] macro

If you’ve ever had a handler fail to compile because of trait bound errors, you know the pain. The error messages from the Rust compiler can be really opaque in this context, and I’ve lost more time than I’d like to admit staring at them. Axum provides a #[debug_handler] macro that makes things much clearer:

#![allow(unused)]
fn main() {
#[axum::debug_handler]
async fn my_handler(
    State(state): State<AppState>,
    Json(body): Json<CreateUserDto>,
) -> AppResult<Json<UserResponse>> {
    // ...
}
}

What it does is add extra type checking that produces human-readable errors like “argument #2 must implement FromRequest” instead of a wall of trait bound failures. It has no runtime cost, so you can leave it in place in production if you want. Some teams prefer to remove it once the handler compiles correctly, but honestly I don’t bother. It’s one of those small things that saves you real time when you come back to change a handler six months later.

Response types

Axum is pretty flexible about what your handlers can return. Anything that implements IntoResponse works, and there are quite a few built-in implementations. Here are the patterns I use most often:

#![allow(unused)]
fn main() {
// Just a status code
async fn health_check() -> StatusCode {
    StatusCode::OK
}

// A tuple of status code and body
async fn create_resource() -> (StatusCode, Json<Resource>) {
    (StatusCode::CREATED, Json(resource))
}

// A Result for fallible operations
async fn get_resource() -> Result<Json<Resource>, AppError> {
    Ok(Json(resource))
}

// Headers and body together
async fn with_headers() -> (StatusCode, [(HeaderName, &'static str); 1], Json<Data>) {
    (
        StatusCode::OK,
        [(header::CACHE_CONTROL, "max-age=3600")],
        Json(data),
    )
}
}

In practice, you’ll use the Result return type most of the time, because most handlers can fail in one way or another. Using Result<T, AppError> gives you the clean error mapping we’ll build in the Error Handling chapter, and it keeps your handler code focused on the happy path.

Keeping handlers thin

I know I already mentioned this, but it’s worth repeating because I’ve seen this go wrong so many times across different languages and frameworks. When a handler starts growing beyond 15 or 20 lines, that’s usually a sign that logic has crept into the wrong place.

What should live in a handler:

  • Extracting data from the request (via Axum extractors)
  • Parsing raw input into domain types (calling UserName::parse(), Email::parse(), etc.)
  • Calling a single service method
  • Converting the result to an HTTP response

What should not live in a handler:

  • Database queries
  • Business rule enforcement
  • Calls to multiple services that need to be coordinated
  • Complex conditional logic
  • Sending emails, publishing events, or other side effects

If your handler needs to do several things in coordination, that coordination logic belongs in a service method. We’ll look at how to structure those services in the next chapter.

Middleware

Every web application ends up with a bunch of logic that doesn’t belong in any single handler but needs to run on many (or all) requests. Logging, authentication, compression, rate limiting, request IDs, timeouts. You know the list. In Axum, we handle all of this through middleware, and because Axum’s middleware system is built on Tower, we get access to a huge ecosystem of pre-built components plus a clean way to compose our own.

The thing that trips people up most often isn’t writing middleware. It’s getting the ordering right. Let’s start there.

The Tower layer model

Tower thinks of middleware as “layers” that wrap a service. When a request comes in, it passes through each layer in order, from the outermost to the innermost (which is your handler). The response then travels back out through the layers in reverse. So the first layer you add is the first to see the request and the last to see the response.

Request → Compression → Tracing → Timeout → Auth → Handler
Response ← Compression ← Tracing ← Timeout ← Auth ← Handler

This ordering matters more than you might expect. If you want tracing to record how long the entire request took, including time spent in authentication, the tracing layer has to be outside the auth layer. Get it backwards and your timing data will be wrong in subtle ways that are annoying to debug.

Let’s look at a middleware stack I’d reach for in a production application, built with ServiceBuilder:

#![allow(unused)]
fn main() {
use tower::ServiceBuilder;
use tower_http::{
    compression::CompressionLayer,
    cors::CorsLayer,
    limit::RequestBodyLimitLayer,
    request_id::{MakeRequestUuid, SetRequestIdLayer, PropagateRequestIdLayer},
    timeout::TimeoutLayer,
    trace::TraceLayer,
};
use std::time::Duration;

let app = Router::new()
    .merge(api_routes())
    .layer(
        ServiceBuilder::new()
            // Layers execute top-to-bottom on the request path.
            .layer(CompressionLayer::new())
            .layer(SetRequestIdLayer::x_request_id(MakeRequestUuid))
            .layer(TraceLayer::new_for_http())
            .layer(PropagateRequestIdLayer::x_request_id())
            .layer(TimeoutLayer::new(Duration::from_secs(30)))
            .layer(RequestBodyLimitLayer::new(1024 * 1024)) // 1 MB
            .layer(cors_layer(&config))
    )
    .with_state(state);
}

Let’s walk through what each layer does and why it sits where it does.

CompressionLayer compresses response bodies using gzip (or brotli, or deflate, depending on what the client supports) and sets the Content-Encoding header. We put it at the outermost position so it compresses the final response body after all inner layers have finished producing it. Worth noting: compression applies to the body only, not to HTTP headers.

SetRequestIdLayer generates a unique UUID and attaches it to the incoming request. We put it before TraceLayer so the request ID is already available when the tracing span gets created.

TraceLayer records structured log entries for each request, including the method, path, status code, and duration. Because it runs after the request ID is set, our tracing span can include the request ID, which is incredibly useful when you’re digging through logs later.

PropagateRequestIdLayer copies the request ID onto the outgoing response. It sits after TraceLayer so the response header is set before tracing finalizes the response side of the span. This ordering (set, then trace, then propagate) follows tower-http’s own documentation and ensures that request IDs show up consistently in both request logs and response headers.

TimeoutLayer cancels requests that take longer than the specified duration. This protects against slow clients, runaway queries, and other situations where a request just hangs forever.

RequestBodyLimitLayer rejects request bodies that exceed a size limit. It’s a basic defense against denial-of-service attacks where a client sends an enormous payload.

CorsLayer handles Cross-Origin Resource Sharing headers. Its exact position in the stack is less critical than that of the other layers, but it needs to be applied to the routes that serve your API.

CORS configuration

CORS deserves its own section because misconfiguring it is one of those things that will have you pulling your hair out. You know the symptom: requests work perfectly from Postman but fail mysteriously from the browser. And on the flip side, allowing all origins in production is a real security hole.

#![allow(unused)]
fn main() {
use tower_http::cors::CorsLayer;
use http::{header, HeaderValue, Method};

fn cors_layer(config: &Config) -> CorsLayer {
    if config.environment == Environment::Development {
        CorsLayer::permissive()
    } else {
        let origins: Vec<HeaderValue> = config.cors_origins
            .iter()
            .map(|o| o.parse().expect("invalid CORS origin in config"))
            .collect();

        CorsLayer::new()
            .allow_origin(origins)
            .allow_methods([
                Method::GET,
                Method::POST,
                Method::PUT,
                Method::DELETE,
            ])
            .allow_headers([
                header::CONTENT_TYPE,
                header::AUTHORIZATION,
            ])
            .allow_credentials(true)
    }
}
}

One thing I want to call out: we’re driving the CORS policy from runtime configuration, not from the build profile. Using cfg!(debug_assertions) would be wrong here. Think about it: a release build deployed to a staging environment should still have relaxed CORS, and a debug build running against production data should still be restrictive. The environment and allowed origins come from the Config struct we set up in the Configuration chapter.
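For reference, here is a minimal sketch of what that Environment type might look like. The real definition lives in the Configuration chapter, so treat the names and variants here as illustrative assumptions:

```rust
// Illustrative sketch only: the book's Configuration chapter defines the
// real Environment type. Parsing from a string lets the value come from
// an environment variable or config file at runtime, rather than being
// baked in at compile time via cfg!(debug_assertions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Environment {
    Development,
    Staging,
    Production,
}

impl std::str::FromStr for Environment {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "development" | "dev" => Ok(Self::Development),
            "staging" => Ok(Self::Staging),
            "production" | "prod" => Ok(Self::Production),
            other => Err(format!("unknown environment: {other}")),
        }
    }
}
```

Parsing this at startup means a typo in the environment variable fails fast instead of silently falling back to a permissive CORS policy.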

Writing custom middleware with from_fn

Most of the time, you don’t need middleware that’s reusable across projects. You just need something that runs on requests in this application. For that, Axum gives us middleware::from_fn, which lets you write middleware as a plain async function. It’s so much simpler than implementing the full Tower Layer and Service traits, and in my experience it covers the vast majority of custom middleware needs.

#![allow(unused)]
fn main() {
use axum::{
    extract::Request,
    middleware::Next,
    response::Response,
};

async fn timing_middleware(req: Request, next: Next) -> Response {
    let start = std::time::Instant::now();
    let method = req.method().clone();
    let uri = req.uri().clone();

    let response = next.run(req).await;

    let duration = start.elapsed();
    tracing::info!(
        method = %method,
        uri = %uri,
        status = %response.status(),
        duration_ms = %duration.as_millis(),
        "request completed"
    );

    response
}
}

Applying it to your router is straightforward:

#![allow(unused)]
fn main() {
use axum::middleware;

let app = Router::new()
    .merge(api_routes())
    .layer(middleware::from_fn(timing_middleware))
    .with_state(state);
}

But what if your middleware needs access to the application state, say, to look up a JWT secret for authentication? That’s where from_fn_with_state comes in:

#![allow(unused)]
fn main() {
async fn auth_middleware(
    State(state): State<AppState>,
    mut req: Request,
    next: Next,
) -> Result<Response, AppError> {
    let token = req.headers()
        .get(header::AUTHORIZATION)
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .ok_or(AppError::Unauthorized)?;

    let claims = decode_jwt(token, state.config.jwt_secret.expose_secret())
        .map_err(|_| AppError::Unauthorized)?;

    // Store the authenticated user in request extensions so handlers can access it
    req.extensions_mut().insert(AuthUser::from(claims));

    Ok(next.run(req).await)
}

// Apply to specific routes
let protected_routes = Router::new()
    .route("/profile", get(get_profile))
    .route("/settings", put(update_settings))
    .route_layer(middleware::from_fn_with_state(state.clone(), auth_middleware));
}

route_layer vs. layer

This distinction trips up a lot of people, and it really does matter for correctness.

.layer() applies the middleware to all routes on the router, including fallback handlers for unmatched paths. If you put an authentication layer here, even 404 responses for nonexistent paths will require authentication. That’s usually not what you want, and it leads to confusing behavior where unauthenticated users get a 401 instead of a 404 for paths that don’t exist.

.route_layer() applies the middleware only to routes that actually matched. Unmatched paths pass through to the fallback handler without hitting the middleware at all. This is what you want for authentication and authorization.

#![allow(unused)]
fn main() {
let protected_routes = Router::new()
    .nest("/users", user_routes())
    .nest("/posts", post_routes())
    // Auth middleware only applies to matched routes
    .route_layer(middleware::from_fn_with_state(state.clone(), auth_middleware));

let app = Router::new()
    // Public routes
    .route("/health", get(health_check))
    .route("/api/v1/auth/login", post(login))
    // Protected routes
    .nest("/api/v1", protected_routes)
    // Global middleware (applied to everything)
    .layer(TraceLayer::new_for_http())
    .with_state(state);
}

Available middleware crates

Before you write custom middleware, it’s worth checking whether someone has already solved your problem. The Tower and tower-http ecosystems are surprisingly rich:

  • tower-http includes CorsLayer, CompressionLayer, TraceLayer, TimeoutLayer, RequestBodyLimitLayer, SetRequestIdLayer, and many more
  • tower-governor provides rate limiting based on the governor algorithm
  • tower-sessions handles server-side sessions
  • tower-cookies provides cookie management
  • axum-csrf-sync-pattern implements the OWASP CSRF Synchronizer Token Pattern

In my experience, the Tower ecosystem covers most of what you’ll need out of the box. It’s one of the best reasons to use Axum in the first place. When we get to the Putting It All Together chapter, you’ll see how all these middleware layers come together in a complete application.

Request Validation

If you’ve ever shipped a form without server-side validation because the frontend “already checks it,” you know how that story ends. Client-side validation is a nice UX touch, but it’s not a security boundary. Someone with curl doesn’t care about your JavaScript checks. We always validate on the server.

In my experience, validation in a well-structured app naturally splits into two levels. At the API layer, we check the shape and format of what comes in: are the required fields present? Does the email field actually look like an email? Is the password long enough? Then at the domain layer, we enforce business invariants through the type system, which I covered in the Domain Modeling chapter.

This chapter focuses on that first level, the API-layer validation. We’ll use the validator crate and build a custom Axum extractor to make it seamless.

Validation with the validator crate

The validator crate gives us derive macros to annotate struct fields with validation rules. You call .validate() on an instance; it checks every rule and hands back a structured error if anything fails. Here's what that looks like in practice.

#![allow(unused)]
fn main() {
use serde::Deserialize;
use validator::Validate;

#[derive(Debug, Deserialize, Validate)]
pub struct CreateUserDto {
    #[validate(length(min = 2, max = 50, message = "name must be between 2 and 50 characters"))]
    pub name: String,

    #[validate(email(message = "must be a valid email address"))]
    pub email: String,

    #[validate(length(min = 8, message = "password must be at least 8 characters"))]
    pub password: String,
}

#[derive(Debug, Deserialize, Validate)]
pub struct UpdateUserDto {
    #[validate(length(min = 2, max = 50, message = "name must be between 2 and 50 characters"))]
    pub name: Option<String>,

    #[validate(length(max = 500, message = "bio cannot exceed 500 characters"))]
    pub bio: Option<String>,

    #[validate(url(message = "must be a valid URL"))]
    pub avatar_url: Option<String>,
}
}

The validator crate comes with a solid set of built-in validations: email, url, length, range, contains, must_match (handy for password confirmation), and more. If you need something that doesn’t fit any of the built-in validators, you can write your own custom validation functions too.

A custom ValidatedJson extractor

Axum’s built-in Json extractor handles deserialization for us, but it doesn’t run any validation. You could call .validate() by hand at the start of every handler, but honestly, that gets old fast and it’s easy to forget in one place. What I’ve found works much better is building a custom extractor that combines deserialization and validation into a single step.

#![allow(unused)]
fn main() {
use std::collections::HashMap;

use axum::{
    extract::{FromRequest, Request},
    Json,
};
use validator::Validate;

pub struct ValidatedJson<T>(pub T);

impl<T, S> FromRequest<S> for ValidatedJson<T>
where
    T: serde::de::DeserializeOwned + Validate,
    S: Send + Sync,
{
    type Rejection = AppError;

    async fn from_request(req: Request, state: &S) -> Result<Self, Self::Rejection> {
        let Json(value) = Json::<T>::from_request(req, state)
            .await
            .map_err(|e| AppError::Validation(format!("invalid JSON: {e}")))?;

        value.validate().map_err(|e| {
            AppError::ValidationFields(format_validation_errors(&e))
        })?;

        Ok(Self(value))
    }
}

fn format_validation_errors(
    errors: &validator::ValidationErrors,
) -> HashMap<String, Vec<String>> {
    errors
        .field_errors()
        .iter()
        .map(|(field, errs)| {
            let messages = errs
                .iter()
                .map(|e| {
                    e.message
                        .as_ref()
                        .map(|m| m.to_string())
                        .unwrap_or_else(|| format!("{} is invalid", field))
                })
                .collect();
            (field.to_string(), messages)
        })
        .collect()
}
}

Now our handlers can accept ValidatedJson<CreateUserDto> instead of Json<CreateUserDto>, and validation just happens before our handler body ever runs. If validation fails, the client gets a 400 response with a clear error message telling them exactly which fields failed and why. No extra work on our part.

#![allow(unused)]
fn main() {
async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    // By this point, we know:
    // - The JSON was well-formed
    // - name is between 2 and 50 characters
    // - email is a valid email format
    // - password is at least 8 characters
    //
    // The handler can focus on business logic.
    let user = state.user_service.register(payload).await?;
    Ok((StatusCode::CREATED, Json(user.into())))
}
}

Validating query parameters

We can use the same approach for query parameters. Let’s build a ValidatedQuery extractor that combines Query extraction with validation:

#![allow(unused)]
fn main() {
use axum::extract::{FromRequestParts, Query};

pub struct ValidatedQuery<T>(pub T);

impl<T, S> FromRequestParts<S> for ValidatedQuery<T>
where
    T: serde::de::DeserializeOwned + Validate,
    S: Send + Sync,
{
    type Rejection = AppError;

    async fn from_request_parts(
        parts: &mut http::request::Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        let Query(value) = Query::<T>::from_request_parts(parts, state)
            .await
            .map_err(|e| AppError::Validation(format!("invalid query parameters: {e}")))?;

        value.validate().map_err(|e| {
            AppError::ValidationFields(format_validation_errors(&e))
        })?;

        Ok(Self(value))
    }
}
}

Structured error responses

If other developers are consuming your API (and they probably are), you’ll want to return validation errors in a structured format they can actually parse, not just a single concatenated string. I like returning errors as a JSON object with per-field error lists:

{
    "error": {
        "type": "validation_error",
        "message": "request validation failed",
        "fields": {
            "email": ["must be a valid email address"],
            "password": ["password must be at least 8 characters"]
        }
    }
}

To make this work, we need to extend our AppError enum to carry structured field errors and adjust the IntoResponse implementation to format them. The exact shape depends on whatever conventions your API follows, but the idea is the same: give the client enough information to fix everything in one shot. Nobody wants to fix one field, resubmit, and then discover the next failure.

Pre-built validation extractors

If writing your own extractor feels like more ceremony than you want, the ecosystem has you covered:

  • axum-valid integrates with validator, garde, and validify, providing Valid<Json<T>>, Valid<Query<T>>, and Valid<Form<T>> extractors
  • axum-validated-extractors provides ValidatedJson, ValidatedQuery, and ValidatedForm out of the box

These crates save you the boilerplate we wrote above. That said, I tend to prefer rolling my own because it gives me full control over the error format and behavior, which matters once you have opinions about your API’s error responses (and you will).

Validation at two levels

Before we move on, I want to make the distinction between API-layer and domain-layer validation really clear, because mixing them up leads to either duplicated logic or gaps in your protection. I’ve seen both, and neither is fun to untangle.

API-layer validation (what we covered in this chapter) checks the shape of the input: are required fields present? Are strings the right length? Does the email field actually look like an email? This is about the format of data as it arrives over the wire.

Domain-layer validation (which we cover in the Domain Modeling chapter) enforces business invariants: a username doesn’t contain forbidden characters, an email gets normalized to lowercase, an order amount is positive. This validation happens when you construct domain types like UserName::parse() or Email::parse().

These two levels work together nicely. Our API layer catches obviously malformed input early and returns a helpful error message. Our domain layer makes sure business rules are enforced consistently, whether the data comes from an HTTP request, a message queue, a CSV import, or a test harness. With both in place, we have a solid foundation for the error handling patterns we’ll look at next.
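To make the hand-off concrete, here is a deliberately minimal sketch of the domain side. The real Email type from the Domain Modeling chapter will be richer than this; the point is just that normalization and invariants live behind the constructor, so every caller gets them:

```rust
// Illustrative sketch of a domain type; the Domain Modeling chapter's
// real Email will differ. The constructor is the only way to build one,
// so the invariants hold no matter where the input came from.
#[derive(Debug, PartialEq)]
pub struct Email(String);

impl Email {
    pub fn parse(input: &str) -> Result<Self, String> {
        // Normalize: trim whitespace, lowercase.
        let normalized = input.trim().to_lowercase();
        // Minimal structural check, standing in for real invariants.
        if normalized.contains('@') && !normalized.starts_with('@') {
            Ok(Self(normalized))
        } else {
            Err(format!("invalid email: {input}"))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

The API layer's #[validate(email)] check catches malformed input early with a friendly message; this parse still runs for data arriving from a queue or a CSV import, where no DTO validation ever happened.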

Authentication and Authorization

Every web application eventually needs to answer two questions: “who is making this request?” and “are they allowed to do what they’re asking?” The first is authentication, the second is authorization. If your app manages user data or exposes anything sensitive, you’ll need both.

What I really like about Axum here is the extractor system. We can build custom extractors that validate credentials and check permissions, and then adding auth to a handler is just a matter of putting the right extractor in its argument list. Let’s see how that works.

Choosing an authentication strategy

The right approach depends on who your client is.

First-party browser applications should use server-side sessions with cookies. You store session IDs in HttpOnly, Secure, SameSite=Strict cookies, which means JavaScript can’t access them. This materially reduces the risk of cookie theft through XSS. I should note that HttpOnly doesn’t prevent XSS itself, and an injected script can still act on the user’s behalf or exfiltrate other in-page data. But it does prevent the most common attack vector of stealing the session token directly, which is why OWASP and MDN both recommend it. The server stores session state (user ID, permissions, expiry) in a database or Redis, so sessions can be revoked immediately. The tower-sessions crate provides session middleware for Axum, and axum-login layers identification, authentication, and authorization on top. If you go with cookie-based sessions, you’ll also need CSRF protection as described in the Security chapter.

Service-to-service APIs and third-party integrations are where stateless JWTs shine. The client includes a token in the Authorization: Bearer header, and the server validates it without hitting a database. This is simpler for machine clients that don’t have a browser cookie jar, and it scales horizontally because there’s no server-side session state to share. The tradeoff is that JWTs can’t be revoked before expiry without additional infrastructure (a deny-list or short lifetimes with refresh tokens).

Public APIs with third-party consumers often use OAuth2/OIDC, where an identity provider issues tokens and your API validates them. The jwt-authorizer crate supports OIDC discovery and automatic key rotation for this use case.

For the rest of this chapter, we’ll focus on the JWT approach. It’s the most common pattern for API-style services, and it shows off Axum’s extractor pattern nicely. If you’re building a browser-facing application, start with tower-sessions and axum-login instead.

JWT authentication with a custom extractor

JWTs work well for stateless API authentication. The flow is pretty straightforward: the client authenticates (typically with email and password), gets back a JWT, and then includes it in the Authorization header on every subsequent request. The server validates the token each time without needing to look anything up in a database.

The key to making this work cleanly in Axum is building a custom extractor that implements FromRequestParts. When you put this extractor in a handler’s signature, Axum runs the validation logic automatically before the handler body ever executes. Let’s look at what that looks like.

#![allow(unused)]
fn main() {
use axum::{
    extract::{FromRef, FromRequestParts},
    http::request::Parts,
    RequestPartsExt,
};
use axum_extra::{
    headers::{Authorization, authorization::Bearer},
    TypedHeader,
};
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};

/// Represents an authenticated user. Including this in a handler's
/// signature automatically requires and validates a JWT.
#[derive(Debug, Clone)]
pub struct AuthUser {
    pub user_id: Uuid,
    pub email: String,
    pub role: Role,
}

#[derive(Debug, Clone, Deserialize, Serialize)]
pub struct Claims {
    pub sub: Uuid,       // subject (user ID)
    pub email: String,
    pub role: Role,
    pub iss: String,     // issuer
    pub aud: String,     // audience
    pub exp: usize,      // expiration time
    pub iat: usize,      // issued at
}

impl<S> FromRequestParts<S> for AuthUser
where
    AppState: FromRef<S>,
    S: Send + Sync,
{
    type Rejection = AppError;

    async fn from_request_parts(
        parts: &mut Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        let app_state = AppState::from_ref(state);

        // Extract the Authorization: Bearer <token> header
        let TypedHeader(Authorization(bearer)) = parts
            .extract::<TypedHeader<Authorization<Bearer>>>()
            .await
            .map_err(|_| AppError::Unauthorized)?;

        // Decode and validate the JWT with explicit validation rules.
        // Validation::default() uses HS256 and checks exp, but you should
        // be explicit about what you accept in production.
        let mut validation = Validation::new(Algorithm::HS256);
        validation.set_issuer(&["myapp"]);
        validation.set_audience(&["myapp-api"]);
        validation.leeway = 30; // 30 seconds of clock skew tolerance

        let token_data = decode::<Claims>(
            bearer.token(),
            &DecodingKey::from_secret(
                app_state.config.jwt_secret.expose_secret().as_bytes()
            ),
            &validation,
        )
        .map_err(|_| AppError::Unauthorized)?;

        Ok(AuthUser {
            user_id: token_data.claims.sub,
            email: token_data.claims.email,
            role: token_data.claims.role,
        })
    }
}
}

Using it in a handler is as simple as adding the parameter to the function signature:

#![allow(unused)]
fn main() {
async fn get_my_profile(
    user: AuthUser,
    State(state): State<AppState>,
) -> AppResult<Json<ProfileResponse>> {
    let profile = state.user_service
        .get_by_id(UserId::from_uuid(user.user_id))
        .await?
        .ok_or(AppError::NotFound)?;

    Ok(Json(profile.into()))
}
}

If the JWT is missing, expired, or invalid, our extractor returns AppError::Unauthorized and the handler body never runs. There’s no explicit auth-checking code in the handler itself, which is exactly the kind of separation I want to see.

Optional authentication with MaybeAuthUser

Some endpoints behave differently depending on whether the caller is logged in, without actually requiring authentication. Think of a feed endpoint that shows personalized results for logged-in users and generic results for anonymous visitors.

You might be tempted to just use Option<AuthUser> here, but that loses an important distinction. None could mean “no Authorization header was sent” (the user is anonymous) or “an Authorization header was sent but the token was garbage” (which should be a 401, not a silent fallback to anonymous behavior). Those are very different situations.

A dedicated MaybeAuthUser extractor lets us tell them apart:

#![allow(unused)]
fn main() {
pub struct MaybeAuthUser(pub Option<AuthUser>);

impl<S> FromRequestParts<S> for MaybeAuthUser
where
    AppState: FromRef<S>,
    S: Send + Sync,
{
    type Rejection = AppError;

    async fn from_request_parts(
        parts: &mut Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        // If there is no Authorization header at all, that is fine.
        let Some(auth_header) = parts.headers.get(header::AUTHORIZATION) else {
            return Ok(Self(None));
        };

        // But if a header IS present, it must be valid.
        // An invalid token is an error, not anonymous access.
        let token = auth_header
            .to_str()
            .ok()
            .and_then(|v| v.strip_prefix("Bearer "))
            .ok_or(AppError::Unauthorized)?;

        let app_state = AppState::from_ref(state);
        let mut validation = Validation::new(Algorithm::HS256);
        validation.set_issuer(&["myapp"]);
        validation.set_audience(&["myapp-api"]);

        let token_data = decode::<Claims>(
            token,
            &DecodingKey::from_secret(
                app_state.config.jwt_secret.expose_secret().as_bytes()
            ),
            &validation,
        )
        .map_err(|_| AppError::Unauthorized)?;

        Ok(Self(Some(AuthUser {
            user_id: token_data.claims.sub,
            email: token_data.claims.email,
            role: token_data.claims.role,
        })))
    }
}
}

I picked up this pattern from the launchbadge/realworld-axum-sqlx reference implementation, which documents it as a deliberate design choice. It’s a small detail, but it prevents a whole category of subtle auth bugs where a client sends a malformed token and silently gets anonymous behavior instead of a clear error.

Issuing tokens

Now let’s look at the other side of the coin: how we actually create these tokens. The login endpoint validates credentials and issues a JWT:

#![allow(unused)]
fn main() {
use jsonwebtoken::{encode, EncodingKey, Header};

async fn login(
    State(state): State<AppState>,
    Json(credentials): Json<LoginDto>,
) -> AppResult<Json<TokenResponse>> {
    let user = state.user_service
        .authenticate(&credentials.email, &credentials.password)
        .await?
        .ok_or(AppError::Unauthorized)?;

    let now = chrono::Utc::now();
    let claims = Claims {
        sub: *user.id().as_uuid(),
        email: user.email().as_str().to_string(),
        role: user.role().clone(),
        iss: "myapp".to_string(),
        aud: "myapp-api".to_string(),
        iat: now.timestamp() as usize,
        // Short-lived access tokens limit damage if compromised.
        // For longer sessions, implement a refresh token mechanism.
        exp: (now + chrono::TimeDelta::try_minutes(15).expect("valid duration"))
            .timestamp() as usize,
    };

    let token = encode(
        &Header::default(),
        &claims,
        &EncodingKey::from_secret(
            state.config.jwt_secret.expose_secret().as_bytes()
        ),
    )
    .map_err(|e| AppError::Internal(e.into()))?;

    Ok(Json(TokenResponse { token }))
}
}

There are a few production considerations here that are easy to overlook.

Token lifetime. Our example uses 15-minute access tokens. Short lifetimes limit the window of exposure if a token gets stolen. For user-facing applications that need longer sessions, you’ll want a separate refresh token flow where the refresh token lives in a secure, HttpOnly cookie and can be revoked server-side.

Algorithm pinning. Always specify the expected algorithm explicitly (here, Algorithm::HS256) rather than trusting the alg header in the token itself. Accepting the token’s self-declared algorithm is a well-known attack vector, and I’ve seen it bite teams who thought they were being flexible.

Issuer and audience. Setting iss and aud claims, and validating them on decode, prevents tokens issued for one service from being accepted by another. This matters as soon as you have more than one service sharing a secret or using the same identity provider.

Role claims and revocation. Here’s something that catches people off guard: embedding roles in the JWT means that permission changes (revoking admin access, disabling an account) don’t take effect until the token expires and a new one is issued. If immediate revocation matters for your application, you’ll need either very short token lifetimes, a token version check against the database, or a deny-list.
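If you go the token-version route, the check itself is tiny. A hypothetical sketch (the per-user version and the claim carrying it are assumptions; the Claims struct above doesn't include one), using an in-memory map to stand in for a database column:

```rust
use std::collections::HashMap;

// Hypothetical sketch of token-version revocation. In a real app the
// current version would live on the users table and login would embed
// it as a claim; here a map stands in for that column.
struct TokenVersions {
    current: HashMap<u64, u32>, // user id -> current token version
}

impl TokenVersions {
    fn new() -> Self {
        Self { current: HashMap::new() }
    }

    // Bumping the stored version invalidates every previously issued
    // token for this user in one step.
    fn revoke_all(&mut self, user_id: u64) {
        *self.current.entry(user_id).or_insert(0) += 1;
    }

    // The auth extractor would compare the token's version claim
    // against the stored value and reject on mismatch.
    fn is_current(&self, user_id: u64, claim_version: u32) -> bool {
        self.current.get(&user_id).copied().unwrap_or(0) == claim_version
    }
}
```

The cost is one lookup per request, which gives back some of the statelessness that made JWTs attractive; that's the trade you're making for immediate revocation.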

Password hashing

You already know not to store passwords in plain text, but the choice of hashing algorithm matters too. For new projects, use Argon2. It won the Password Hashing Competition in 2015 and it’s still the best option we have. Don’t use SHA-256, MD5, or even bcrypt for new code.

#![allow(unused)]
fn main() {
use argon2::{
    Argon2,
    PasswordHash,
    PasswordHasher,
    PasswordVerifier,
    password_hash::SaltString,
};
use argon2::password_hash::rand_core::OsRng;

pub fn hash_password(password: &str) -> anyhow::Result<String> {
    let salt = SaltString::generate(&mut OsRng);
    // Use Argon2id, which is the variant recommended by OWASP.
    // The default() configuration uses Argon2id with reasonable parameters,
    // but for production you should tune memory cost and iterations based
    // on your hardware and OWASP's current minimum recommendations.
    let hasher = Argon2::default(); // Argon2id with default params
    let hash = hasher
        .hash_password(password.as_bytes(), &salt)
        .map_err(|e| anyhow::anyhow!("failed to hash password: {}", e))?
        .to_string();
    Ok(hash)
}

pub fn verify_password(password: &str, hash: &str) -> anyhow::Result<bool> {
    let parsed_hash = PasswordHash::new(hash)
        .map_err(|e| anyhow::anyhow!("failed to parse password hash: {}", e))?;
    Ok(Argon2::default()
        .verify_password(password.as_bytes(), &parsed_hash)
        .is_ok())
}
}

One thing to keep in mind: password hashing is deliberately slow. That's the whole point: it makes brute-force attacks impractical. But in an async context, this means you should run it on a blocking thread so you don't tie up the async runtime:

#![allow(unused)]
fn main() {
let hash = tokio::task::spawn_blocking(move || hash_password(&password))
    .await
    .context("password hashing task failed")??;
}

Role-based authorization

Once we have authentication working, authorization is the natural next step. The simplest approach is just checking the user’s role inside the handler:

#![allow(unused)]
fn main() {
async fn admin_only_endpoint(
    user: AuthUser,
    State(state): State<AppState>,
) -> AppResult<Json<AdminData>> {
    if user.role != Role::Admin {
        return Err(AppError::Forbidden);
    }
    // ... admin logic
}
}

That works, but if you find yourself repeating the same role check across multiple handlers, we can do better. Let’s build an extractor that bakes the role requirement right in:

#![allow(unused)]
fn main() {
pub struct RequireAdmin(pub AuthUser);

impl<S> FromRequestParts<S> for RequireAdmin
where
    AppState: FromRef<S>,
    S: Send + Sync,
{
    type Rejection = AppError;

    async fn from_request_parts(
        parts: &mut Parts,
        state: &S,
    ) -> Result<Self, Self::Rejection> {
        let user = AuthUser::from_request_parts(parts, state).await?;

        if user.role != Role::Admin {
            return Err(AppError::Forbidden);
        }

        Ok(Self(user))
    }
}

// Usage: just include it in the handler signature
async fn admin_endpoint(RequireAdmin(user): RequireAdmin) -> AppResult<Json<AdminData>> {
    // If we get here, the user is authenticated AND has the admin role
    // ...
}
}

Middleware-based authentication

There’s another way to handle this: instead of putting the extractor on each handler, you can apply authentication as middleware to a whole group of routes. This is handy when you have a block of routes that all require authentication and you don’t want to repeat AuthUser in every handler signature.

#![allow(unused)]
fn main() {
let public_routes = Router::new()
    .route("/health", get(health_check))
    .route("/api/v1/auth/login", post(login))
    .route("/api/v1/auth/register", post(register));

let protected_routes = Router::new()
    .nest("/api/v1/users", user_routes())
    .nest("/api/v1/posts", post_routes())
    .route_layer(middleware::from_fn_with_state(
        state.clone(),
        auth_middleware,
    ));

let app = Router::new()
    .merge(public_routes)
    .merge(protected_routes)
    .with_state(state);
}

Which approach should you pick? In my experience, extractors are more flexible because individual handlers can opt in or out, and the handler gets direct access to the AuthUser value. Middleware is more convenient when entire route groups all need the same authentication requirement. You can also combine both, using middleware for the baseline and extractors for finer-grained checks within that group.

Ecosystem crates

Before we wrap up, it’s worth knowing what the ecosystem offers. You don’t always have to roll your own:

  • axum-login provides session-based authentication with pluggable backends
  • axum-gate combines JWT validation with role-based authorization
  • axum-session handles database-persisted sessions
  • jwt-authorizer provides JWT validation with OIDC discovery support
  • axum-csrf-sync-pattern implements CSRF protection for session-based authentication

For most API-style applications, I think the custom AuthUser extractor approach we built above gives you the right balance of simplicity and control. But when you need more complex features like session management, OAuth2 flows, or OIDC integration, these crates can save you a lot of work.

API Design

If you’ve ever integrated with an API where every endpoint returned data in a slightly different shape, you know how frustrating that gets. You spend more time reading docs (or guessing) than actually building. I want us to avoid inflicting that on anyone, including our future selves. In this chapter, we’ll walk through the REST conventions, response patterns, pagination strategies, and documentation approaches that make our Axum API predictable and pleasant to work with.

REST conventions

REST isn’t a formal specification with strict rules. It’s more a set of conventions that most API consumers have come to expect. Let’s go through the ones that matter most.

Use plural nouns for resources. /api/v1/users, not /api/v1/user. The resource name represents a collection, and individual items within that collection get accessed by their ID.

Use HTTP methods to express the operation. GET retrieves data, POST creates new resources, PUT replaces a resource entirely, PATCH applies a partial update, and DELETE removes a resource. This might seem obvious, but I’ve seen plenty of APIs that use POST for everything.

Use appropriate status codes. Honestly, this is one of the highest-leverage things you can do for API usability. When a client creates a resource, return 201 Created, not 200 OK. When a delete succeeds, return 204 No Content. When the client sends invalid input, return 400 Bad Request with details about what went wrong. When the requested resource doesn’t exist, return 404 Not Found.

Here are the status codes you’ll reach for most often:

Code  Meaning                When to use
200   OK                     Successful read or update
201   Created                A new resource was created
204   No Content             Successful operation with no response body (delete)
400   Bad Request            Invalid input or validation failure
401   Unauthorized           Missing or invalid authentication
403   Forbidden              Authenticated but not authorized
404   Not Found              Resource does not exist
409   Conflict               Business rule violation (duplicate email, etc.)
422   Unprocessable Entity   Semantically invalid request
500   Internal Server Error  Unexpected server failure

API versioning

If you’re building a public API or platform service, version from the start. Adding versioning later means either breaking changes or awkward workarounds, and the cost of stamping /v1 on your routes from day one is basically nothing.

For internal services that ship in lockstep with their consumers, path versioning is less critical. In my experience, disciplined schema evolution (additive changes, deprecation windows, contract tests) often matters more than a version prefix. Don’t version just because a guide told you to. Version because your consumers need stability guarantees that you can’t provide through coordination alone.

The simplest and most widely used approach is path-based versioning:

#![allow(unused)]
fn main() {
fn api_routes() -> Router<AppState> {
    Router::new()
        .nest("/api/v1", v1_routes())
}

fn v1_routes() -> Router<AppState> {
    Router::new()
        .nest("/users", user_routes())
        .nest("/posts", post_routes())
}
}

When you eventually need a v2 of a particular endpoint, you just add it alongside v1 without disturbing existing consumers:

#![allow(unused)]
fn main() {
fn api_routes() -> Router<AppState> {
    Router::new()
        .nest("/api/v1", v1_routes())
        .nest("/api/v2", v2_routes())
}
}

One thing worth keeping in mind: try to keep your handler implementations decoupled from the version prefix so that v1 and v2 routes can share the same underlying service logic where the behavior hasn’t changed.

Consistent response shapes

You might wonder why we’d bother wrapping every response in a standard type. The reason is simple: when every endpoint returns data in a predictable structure, clients can write generic parsing logic instead of special-casing each endpoint. Let’s define a few wrapper types.

#![allow(unused)]
fn main() {
#[derive(Serialize)]
pub struct ApiResponse<T: Serialize> {
    pub data: T,
}

#[derive(Serialize)]
pub struct PaginatedResponse<T: Serialize> {
    pub data: Vec<T>,
    pub meta: PaginationMeta,
}

#[derive(Serialize)]
pub struct PaginationMeta {
    pub page: u32,
    pub per_page: u32,
    pub total: u64,
    pub total_pages: u32,
}
}

For error responses, we’ll use the format described in the Error Handling chapter. That way, clients can always check for an error field to determine whether the request succeeded.

Pagination

Any endpoint that returns a list of resources should support pagination. Without it, you’re one large dataset away from timeouts, out-of-memory errors, and unhappy consumers. Trust me, it’s much easier to add pagination now than to bolt it on later when your users table has grown to a few hundred thousand rows.

Offset-based pagination is the simplest approach and works well when the dataset isn’t enormous and records aren’t being inserted or deleted frequently during pagination:

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
pub struct PaginationParams {
    #[serde(default = "default_page")]
    pub page: u32,
    #[serde(default = "default_per_page")]
    pub per_page: u32,
}

fn default_page() -> u32 { 1 }
fn default_per_page() -> u32 { 20 }

async fn list_users(
    State(state): State<AppState>,
    Query(params): Query<PaginationParams>,
) -> AppResult<Json<PaginatedResponse<UserResponse>>> {
    let per_page = params.per_page.clamp(1, 100); // at least 1, at most 100
    let offset = (params.page.saturating_sub(1)) * per_page;

    let (users, total) = state.user_service
        .list(offset, per_page)
        .await?;

    Ok(Json(PaginatedResponse {
        data: users.into_iter().map(Into::into).collect(),
        meta: PaginationMeta {
            page: params.page,
            per_page,
            total,
            total_pages: total.div_ceil(per_page as u64) as u32,
        },
    }))
}
}

Cursor-based pagination is better for large or frequently changing datasets. Instead of an offset, the client passes a cursor (typically the ID or timestamp of the last item they received), and the server returns the next page starting after that cursor. This avoids the “skipping rows” problem, where offset-based pagination can miss or duplicate records when data changes between pages.
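
To make the keyset idea concrete, here’s a minimal, std-only sketch of the logic over an in-memory, ID-sorted collection. In a real repository this predicate becomes a WHERE clause along the lines of `WHERE id > $cursor ORDER BY id LIMIT $n`; the `Post` type and function names here are hypothetical.

```rust
#[derive(Debug, Clone)]
struct Post {
    id: u64,
    title: String,
}

/// Return up to `limit` posts with IDs strictly greater than `cursor`,
/// plus the cursor to pass on the next request (None when exhausted).
/// Assumes `posts` is sorted ascending by `id`.
fn next_page(posts: &[Post], cursor: Option<u64>, limit: usize) -> (Vec<Post>, Option<u64>) {
    let after = cursor.unwrap_or(0);
    let page: Vec<Post> = posts
        .iter()
        .filter(|p| p.id > after) // keyset predicate: start *after* the cursor
        .take(limit)
        .cloned()
        .collect();
    // The next cursor is the ID of the last item on a full page.
    let next_cursor = if page.len() == limit {
        page.last().map(|p| p.id)
    } else {
        None
    };
    (page, next_cursor)
}

fn main() {
    let posts: Vec<Post> = (1..=5)
        .map(|id| Post { id, title: format!("post {}", id) })
        .collect();

    let (page1, cursor) = next_page(&posts, None, 2);
    println!("{:?} next={:?}", page1.iter().map(|p| p.id).collect::<Vec<_>>(), cursor);
    // Even if rows were inserted below id 2 between requests, the next page
    // still starts after id 2 — no skipped or duplicated records.
    let (page2, cursor) = next_page(&posts, cursor, 2);
    println!("{:?} next={:?}", page2.iter().map(|p| p.id).collect::<Vec<_>>(), cursor);
}
```

Because the cursor anchors on the last item actually seen rather than a row count, inserts and deletes elsewhere in the table can’t shift the page boundaries.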

If you don’t want to implement cursor logic yourself, the paginator-axum crate provides cursor-based pagination with metadata including next_cursor and prev_cursor fields. It’s a solid starting point.

Filtering and sorting

For list endpoints that need filtering, we accept filter parameters as query strings. This is pretty straightforward:

#![allow(unused)]
fn main() {
#[derive(Debug, Deserialize)]
pub struct UserListParams {
    #[serde(default = "default_page")]
    pub page: u32,
    #[serde(default = "default_per_page")]
    pub per_page: u32,
    pub role: Option<Role>,
    pub search: Option<String>,
    #[serde(default = "default_sort")]
    pub sort_by: String,
    #[serde(default = "default_sort_direction")]
    pub sort_direction: SortDirection,
}
}

One thing to watch out for: be mindful about which fields you allow sorting on, and make sure those fields are indexed in your database. Letting users sort on an unindexed column is a recipe for slow queries that’ll bite you in production.
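
One way to enforce that is an allow-list: map the untrusted `sort_by` value onto a fixed set of known (and indexed) column names, so arbitrary strings never reach the ORDER BY clause at all. A std-only sketch, with hypothetical field names:

```rust
/// Resolve a user-supplied `sort_by` value to a real column name,
/// falling back to a safe default for anything unrecognized.
fn sort_column(sort_by: &str) -> &'static str {
    match sort_by {
        "name" => "name",
        "email" => "email",
        "created_at" => "created_at",
        // Unknown (or deliberately excluded, unindexed) fields
        // fall back to the default sort column.
        _ => "created_at",
    }
}

fn main() {
    assert_eq!(sort_column("name"), "name");
    // An injection attempt never reaches the query string.
    assert_eq!(sort_column("1; DROP TABLE users--"), "created_at");
    println!("ok");
}
```

Returning `&'static str` from a closed match is what makes this safe: the only strings that can ever be interpolated into the query are the ones you wrote yourself.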

OpenAPI documentation

We’ve all dealt with API docs that were accurate when someone wrote them six months ago and have slowly drifted out of date since. What I’ve found works much better is generating documentation directly from the code. The utoipa crate does exactly this, producing OpenAPI specifications from your Rust types and handler annotations. Since the docs come from the actual code, they can’t go stale.

#![allow(unused)]
fn main() {
use utoipa::{OpenApi, ToSchema};

#[derive(Serialize, ToSchema)]
pub struct UserResponse {
    pub id: Uuid,
    pub name: String,
    pub email: String,
    pub created_at: DateTime<Utc>,
}

#[utoipa::path(
    post,
    path = "/api/v1/users",
    request_body = CreateUserDto,
    responses(
        (status = 201, description = "User created successfully", body = UserResponse),
        (status = 400, description = "Validation error", body = ErrorResponse),
        (status = 409, description = "User with this email already exists", body = ErrorResponse),
    ),
    tag = "users"
)]
async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    // ...
}
}

To serve the Swagger UI alongside our API, we wire it up like this:

#![allow(unused)]
fn main() {
use utoipa::OpenApi;
use utoipa_swagger_ui::SwaggerUi;

#[derive(OpenApi)]
#[openapi(
    paths(create_user, get_user, list_users, update_user, delete_user),
    components(schemas(UserResponse, CreateUserDto, UpdateUserDto, ErrorResponse)),
    tags((name = "users", description = "User management endpoints"))
)]
struct ApiDoc;

let app = Router::new()
    .merge(api_routes())
    .merge(SwaggerUi::new("/swagger-ui").url("/api-docs/openapi.json", ApiDoc::openapi()))
    .with_state(state);
}

Now developers can browse our API documentation at /swagger-ui and try out requests directly from the browser. The specification is generated at compile time from the actual types and handlers, so it literally can’t fall out of sync with the implementation. That’s a nice property to have.

Health check endpoints

Every production API needs health check endpoints. Your load balancer and container orchestrator need a way to know whether your service is alive and ready to accept traffic. Let’s look at how we set these up.

#![allow(unused)]
fn main() {
/// Liveness probe: is the process running and able to handle requests?
async fn health_live() -> StatusCode {
    StatusCode::OK
}

/// Readiness probe: is the application ready to serve traffic?
/// Checks that all dependencies (database, cache, etc.) are reachable.
async fn health_ready(State(state): State<AppState>) -> StatusCode {
    match sqlx::query("SELECT 1").execute(&state.db).await {
        Ok(_) => StatusCode::OK,
        Err(_) => StatusCode::SERVICE_UNAVAILABLE,
    }
}
}

We mount these outside our versioned API routes so they remain stable even when the API evolves:

#![allow(unused)]
fn main() {
let app = Router::new()
    .route("/health", get(health_live))
    .route("/health/ready", get(health_ready))
    .merge(api_routes())  // api_routes() already nests under /api/v1
    .with_state(state);
}

The liveness probe should be fast and unconditional. All it tells the orchestrator is “yes, the process is alive.” The readiness probe is the one that checks whether our dependencies (database, cache, whatever else) are healthy, so the orchestrator knows it’s safe to route traffic to this instance. Getting these two confused is a common mistake, and it can lead to your orchestrator restarting healthy containers just because the database had a brief hiccup.

With our API designed this way, we have a solid foundation: consistent URLs, predictable response shapes, pagination that won’t fall over at scale, docs that stay accurate, and health checks that keep our infrastructure informed. Next, let’s look at how we handle the errors that inevitably come up when all of this is running in production.

Observability

You’ve probably had that moment where something breaks in production and you’re staring at a wall of unhelpful log lines, trying to piece together what happened. I certainly have. Observability is how we avoid that situation. It’s how we know what our application is actually doing once it’s running out in the world, without having to reproduce the problem on our laptop.

The Rust ecosystem has converged on the tracing crate as the standard for structured, contextual logging. When you combine it with OpenTelemetry for distributed tracing and metrics, you get a solid observability stack that covers most of what we’ll need.

Structured logging with tracing

If you’ve used log or env_logger before, tracing will feel familiar at first, but it’s doing something quite different under the hood. Instead of emitting flat text strings, it emits structured events with typed fields. That means your log analysis tool can actually filter, aggregate, and query on specific fields rather than trying to parse them out of a message string. It also supports spans, which represent a period of time (like the duration of an HTTP request or a database query) and carry context that gets automatically attached to all events within them.

Setting up the subscriber

The tracing subscriber controls how events are formatted and where they go. In development, we want human-readable output with colors so we can actually read what’s happening in our terminal. In production, we want JSON output that our log aggregation system can parse.

#![allow(unused)]
fn main() {
use tracing_subscriber::{
    fmt, layer::SubscriberExt, util::SubscriberInitExt, EnvFilter, Layer,
};

pub fn init_tracing(log_level: &str) {
    let env_filter = EnvFilter::try_from_default_env()
        .unwrap_or_else(|_| EnvFilter::new(log_level));

    // Use runtime configuration rather than build profile to control
    // log format. A release build in staging should still be readable,
    // and a debug build against production data should still emit JSON.
    let json_output = std::env::var("LOG_JSON")
        .map(|v| v == "true" || v == "1")
        .unwrap_or(false);

    let fmt_layer = if json_output {
        fmt::layer().json().boxed()
    } else {
        fmt::layer().pretty().boxed()
    };

    tracing_subscriber::registry()
        .with(env_filter)
        .with(fmt_layer)
        .init();
}
}

The EnvFilter lets us control log levels at runtime through the RUST_LOG environment variable. You can set different levels for different modules, which is really helpful when you need to turn up verbosity on one specific component without drowning in logs from everything else:

RUST_LOG=info,my_app::infra=debug,sqlx=warn

Request tracing with TraceLayer

The TraceLayer from tower-http automatically creates a span for each HTTP request, recording the method, URI, status code, and duration. When we combine this with a request ID, we can trace a single request through all the log entries it generates. Let’s see how that looks.

#![allow(unused)]
fn main() {
use tower_http::trace::TraceLayer;

let trace_layer = TraceLayer::new_for_http()
    .make_span_with(|request: &http::Request<_>| {
        let request_id = request
            .headers()
            .get("x-request-id")
            .and_then(|v| v.to_str().ok())
            .unwrap_or("unknown");

        // Prefer the matched route pattern (e.g., "/api/v1/users/{id}")
        // over the raw URI (e.g., "/api/v1/users/550e8400-...").
        // Raw URIs create high-cardinality span fields that can overwhelm
        // metrics backends and make aggregation impossible.
        let matched_path = request.extensions()
            .get::<axum::extract::MatchedPath>()
            .map(|p| p.as_str().to_string())
            .unwrap_or_else(|| request.uri().path().to_string());

        tracing::info_span!(
            "http_request",
            method = %request.method(),
            path = %matched_path,
            request_id = %request_id,
        )
    });
}

Every log event emitted while processing that request will automatically include the method, path, and request_id fields, because the span acts as ambient context for all work done within it. This is one of those things that sounds small but makes a huge difference. When our application is handling hundreds of concurrent requests, we can still find every log entry related to one specific request just by filtering on its ID.

Instrumenting handlers and services

We can use the #[tracing::instrument] attribute to automatically create spans for individual functions. I find this particularly valuable on service methods and repository calls, because it lets you see exactly where time is being spent without manually wiring up span creation everywhere.

#![allow(unused)]
fn main() {
#[tracing::instrument(skip(state, payload))]
async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    tracing::info!("creating new user");

    let user = state.user_service.register(payload).await?;

    tracing::info!(user_id = %user.id().as_uuid(), "user created successfully");
    Ok((StatusCode::CREATED, Json(user.into())))
}
}

The skip(state, payload) directive tells the instrument macro not to include the state or the request payload in the span’s fields. After the service call succeeds, we log the user_id as a structured field. Notice that we use the opaque user_id rather than user.email in the span. This is intentional. In my experience, it’s much better to stick with low-PII identifiers (user IDs, tenant IDs, request IDs) over personal data (email addresses, names) in your tracing output. If you need to correlate a trace with a real email for debugging, look it up from the user ID. You don’t want every log line in your aggregation system carrying personal data around.

Logging best practices

Use structured fields instead of string interpolation. Rather than tracing::info!("Created user {}", user_id), write tracing::info!(user_id = %user_id, "user created"). The structured form lets your log aggregation system index and filter on the user_id field, which is way more useful than trying to parse it out of a text string.

Log at appropriate levels. This might seem obvious, but it’s worth spelling out. Use error for things that indicate a bug or a system failure that needs attention. Use warn for situations that are unusual but handled (like a rate-limited request or a cache miss). Use info for significant business events (user created, order placed). Use debug for detailed operational information that helps during development or troubleshooting. Getting this right matters because when something goes wrong at 2 AM, you want your alerts to mean something.

Never log secrets. If you’re using the secrecy crate for your configuration (as we covered in the Configuration chapter), the Debug implementation will automatically redact sensitive values. But be careful with request bodies, headers, and other data that might contain tokens, passwords, or personal information.

OpenTelemetry

Once our application grows into a distributed system where a single user request might touch multiple services, we need a way to follow that request across boundaries. That’s where OpenTelemetry comes in. It gives us a standard way to propagate trace context and collect telemetry data.

The key crates are:

  • opentelemetry and opentelemetry_sdk for the core API and SDK
  • opentelemetry-otlp for exporting traces to an OTLP-compatible backend (Jaeger, Tempo, Datadog, etc.)
  • tracing-opentelemetry for bridging the tracing crate’s spans to OpenTelemetry spans
  • axum-tracing-opentelemetry for automatic trace context propagation in Axum

With this stack, each request gets a trace ID that follows it across service boundaries. When a user reports a problem, we can look up the trace by ID and see the whole picture: which services were involved, how long each step took, and where things went wrong.

Metrics

Metrics give us a different lens than logs and traces. Where a log entry tells us about one specific request, metrics give us aggregate, time-series data about how our application is behaving overall. They answer questions like: is our 99th percentile latency creeping up? Has the error rate doubled in the last five minutes? Is the database connection pool running hot?

  • axum-otel-metrics provides OpenTelemetry metrics with Prometheus export
  • axum-prometheus provides HTTP metrics compatible with the metrics.rs ecosystem

A typical setup exposes a /metrics endpoint that Prometheus scrapes at regular intervals. In my experience, the most useful metrics for a web application are request count (by method, path, and status code), request duration histograms, and error rates. You might wonder if you need all three from day one. Honestly, even just request duration and error rates will catch most problems before your users notice them.

Bringing it together

A production-ready observability setup combines all three of these. Structured logs give us detailed information about individual events. Traces let us follow a request through multiple services and find bottlenecks. And metrics give us the big picture of system health and performance trends.

Setting all this up might seem like a lot of work upfront, and it is some work. But it pays for itself the first time you need to debug a production issue. Instead of guessing what went wrong and deploying speculative fixes, you look at the data and see what actually happened. That’s a much better place to be.

With our observability foundation in place, let’s look at how we handle errors and communicate them clearly to our API consumers.

Security

If there’s one thing I want you to take away from this book, it’s that security isn’t something you bolt on at the end. It’s a set of practices that live in every layer of your application, from how you handle user input to how you configure your deployment. We’ve touched on security throughout the earlier chapters, so this chapter pulls all of those threads together into a single reference. I’ve also included a few topics here that didn’t fit neatly anywhere else.

Input handling

Our first line of defense is simple: treat all external input as untrusted. You might think that’s obvious, but it’s easy to let your guard down when you control both the client and the server. Don’t.

Parameterized queries only. Never construct SQL by concatenating strings. Every database query should use parameter placeholders ($1, $2, etc.) with bound values. SQLx’s query! and query_as! macros enforce this at compile time, which is honestly one of the strongest reasons to use them.

Validate at the API boundary. We covered this in detail in the Request Validation chapter. The idea is to reject malformed input before it ever reaches your business logic. Check string lengths, format constraints, and required fields right at the edge.

Parse into domain types. If you’ve read the Domain Modeling chapter, you already know this one. Convert raw input into typed domain objects as early as possible. A UserName that’s been through its parse constructor simply can’t contain forbidden characters, by construction. The type system does the heavy lifting for us.

Limit request body sizes. Use RequestBodyLimitLayer to reject oversized payloads before they eat up memory. A 1 MB limit is reasonable for most JSON APIs.

Authentication and secrets

Hash passwords with Argon2. We covered this in the Authentication chapter, but it’s worth repeating: Argon2 is the current best practice for password hashing. Always use a random salt for each password.

Use long, random JWT secrets. Your JWT secret should be at least 48 characters of random data. Shorter secrets are vulnerable to brute-force attacks, and you really don’t want to find that out the hard way. Store the secret in SecretString from the secrecy crate and validate its length at startup.
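
The startup-time length check can be as simple as the following std-only sketch. The 48-character minimum matches the guideline above; the function name and error type are illustrative, and if the secret lives in a `SecretString` you’d expose it briefly for the check.

```rust
/// Validate the configured JWT secret at startup, failing fast
/// rather than serving requests with a weak secret.
fn validate_jwt_secret(secret: &str) -> Result<(), String> {
    const MIN_LEN: usize = 48;
    if secret.len() < MIN_LEN {
        return Err(format!(
            "JWT secret must be at least {} characters, got {}",
            MIN_LEN,
            secret.len()
        ));
    }
    Ok(())
}

fn main() {
    assert!(validate_jwt_secret(&"x".repeat(48)).is_ok());
    assert!(validate_jwt_secret("too-short").is_err());
    println!("ok");
}
```

Calling this from your configuration loader means a misconfigured deployment dies immediately with a clear message instead of quietly issuing forgeable tokens.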

Set reasonable token expiration times. Short-lived access tokens (15 minutes to a few hours) limit the damage if a token gets compromised. If you need longer sessions, implement a refresh token mechanism where the refresh token is stored securely and can be revoked.

Never store tokens in localStorage. For browser-based clients, use HttpOnly, Secure, and SameSite=Strict cookies. These aren’t accessible to JavaScript, which protects against XSS-based token theft.

CORS

You’ve probably run into CORS errors during development. Cross-Origin Resource Sharing is a browser security mechanism that controls which origins can make requests to your API. A misconfigured CORS policy is one of the most common security issues I see in web APIs, partly because the quick fix during development (“just make it permissive”) has a habit of sneaking into production.

#![allow(unused)]
fn main() {
use axum::http::{header, Method};
use tower_http::cors::CorsLayer;

// Production: explicit allow list
CorsLayer::new()
    .allow_origin([
        "https://myapp.com".parse().unwrap(),
    ])
    .allow_methods([Method::GET, Method::POST, Method::PUT, Method::DELETE])
    .allow_headers([header::CONTENT_TYPE, header::AUTHORIZATION])
    .allow_credentials(true)
}

In development, CorsLayer::permissive() is convenient, but make sure it never reaches production. A permissive CORS policy allows any origin to make requests to your API. Whether those requests carry credentials (cookies, authorization headers) depends on the browser’s credentials mode and your allow_credentials setting, but even without credentials, a permissive policy exposes your API surface to cross-origin probing. CorsLayer::very_permissive() is even more dangerous because it also enables credentials. Always restrict origins in production.

CSRF protection

If your API uses cookies for authentication (as opposed to Bearer tokens), you need to worry about Cross-Site Request Forgery attacks. The axum-csrf-sync-pattern crate implements the OWASP-recommended Synchronizer Token Pattern, and it works like this:

  1. The server generates a random CSRF token and sends it to the client in a custom response header
  2. The client includes that token in a custom request header on every state-changing request
  3. The server validates that the token in the request header matches the one it issued

Why does this work? A malicious website can cause the browser to send cookies automatically, but it can’t read or set custom headers on cross-origin requests (assuming your CORS policy doesn’t expose them).

Rate limiting

Rate limiting is one of those things that feels optional until someone starts hammering your login endpoint. It protects against brute-force attacks, credential stuffing, and denial-of-service attempts. The tower-governor crate gives us a Tower middleware that limits requests based on client IP address:

#![allow(unused)]
fn main() {
use tower_governor::{GovernorConfigBuilder, GovernorLayer};

let governor_config = GovernorConfigBuilder::default()
    .per_second(2)
    .burst_size(10)
    .finish()
    .unwrap();

let app = Router::new()
    .merge(api_routes())
    .layer(GovernorLayer { config: governor_config })
    .with_state(state);
}

For authentication endpoints (login, password reset), I’d recommend applying stricter limits. Something like 5 attempts per minute per IP on the login endpoint makes credential stuffing attacks impractical.

One thing to keep in mind: if you’re running multiple application instances in production, IP-based rate limiting at the application level isn’t sufficient on its own, since each instance maintains its own counters. You’ll want to look at a Redis-backed rate limiter or handle rate limiting at the load balancer or API gateway level instead.
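
To see why per-instance counters fall short, here’s a std-only fixed-window counter of the kind an in-process limiter maintains. Each process owns its own map, so two instances behind a load balancer would each grant the full quota independently. The types and limits are illustrative, not tower-governor’s internals.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct FixedWindowLimiter {
    window: Duration,
    max_requests: u32,
    // client -> (window start, count); lives only in this process
    counters: HashMap<String, (Instant, u32)>,
}

impl FixedWindowLimiter {
    fn new(window: Duration, max_requests: u32) -> Self {
        Self { window, max_requests, counters: HashMap::new() }
    }

    /// Returns true if the request is allowed under the current window.
    fn check(&mut self, client: &str, now: Instant) -> bool {
        let entry = self.counters.entry(client.to_string()).or_insert((now, 0));
        if now.duration_since(entry.0) >= self.window {
            // Window expired: start a fresh one.
            *entry = (now, 0);
        }
        if entry.1 < self.max_requests {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = FixedWindowLimiter::new(Duration::from_secs(60), 5);
    let now = Instant::now();
    let allowed = (0..6).filter(|_| limiter.check("10.0.0.1", now)).count();
    // 5 of 6 requests pass here, but a second instance with its own
    // limiter would pass another 5 for the same client IP.
    println!("allowed {} of 6", allowed);
}
```

Moving the counters into shared storage (Redis, or the gateway) is what turns this from a per-instance quota into a real global one.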

Security headers

HTTP security headers tell browsers to enable various protective mechanisms. Which headers you need depends on whether your application serves HTML or is a pure JSON API. For an HTML-serving application, headers like Content-Security-Policy and X-Frame-Options are essential. For a JSON-only API, some of these matter less because there’s no browser rendering context to protect, but I’d still recommend setting them defensively. It costs nothing and you won’t regret it.

Let’s apply them as a middleware layer:

#![allow(unused)]
fn main() {
use axum::extract::Request;
use axum::http::{HeaderName, HeaderValue};
use axum::middleware::Next;
use axum::response::Response;

async fn security_headers(req: Request, next: Next) -> Response {
    let mut response = next.run(req).await;
    let headers = response.headers_mut();

    // Prevent MIME-type sniffing
    headers.insert(
        HeaderName::from_static("x-content-type-options"),
        HeaderValue::from_static("nosniff"),
    );

    // Control framing (prefer CSP frame-ancestors for modern browsers,
    // but X-Frame-Options is still useful for older ones)
    headers.insert(
        HeaderName::from_static("x-frame-options"),
        HeaderValue::from_static("DENY"),
    );

    // Content Security Policy: the primary defense against XSS.
    // Adjust the directives to match your application's needs.
    headers.insert(
        HeaderName::from_static("content-security-policy"),
        HeaderValue::from_static("default-src 'none'; frame-ancestors 'none'"),
    );

    // Enforce HTTPS connections
    headers.insert(
        HeaderName::from_static("strict-transport-security"),
        HeaderValue::from_static("max-age=63072000; includeSubDomains"),
    );

    headers.insert(
        HeaderName::from_static("referrer-policy"),
        HeaderValue::from_static("strict-origin-when-cross-origin"),
    );

    response
}
}

Let me walk through what’s included here and what I’ve deliberately left out.

Content-Security-Policy (CSP) is the modern, first-class defense against cross-site scripting. A strict CSP is far more effective than the legacy X-XSS-Protection header, which is deprecated by OWASP and MDN. Don’t set X-XSS-Protection. In some edge cases it can actually introduce vulnerabilities. Use CSP instead.

Strict-Transport-Security (HSTS) tells browsers to only connect over HTTPS. This prevents protocol downgrade attacks and is essential for any production deployment behind TLS.

X-Frame-Options provides clickjacking protection for older browsers. For modern browsers, the frame-ancestors directive in CSP is the preferred mechanism, but both can coexist without issues.

These headers aren’t a substitute for writing secure code, but they give us defense-in-depth by enabling browser-side protections against common attack vectors. Think of them as a safety net.

Dependency auditing

Third-party dependencies can introduce vulnerabilities, and in a Rust project you’ll have more of them than you might expect. Run cargo audit regularly (and ideally in your CI pipeline) to check for known vulnerabilities in your dependency tree:

cargo install cargo-audit
cargo audit

This checks your Cargo.lock against the RustSec Advisory Database and reports any dependencies with known security issues. In my experience, keeping your dependencies up to date is one of the simplest and most effective security practices you can adopt.
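If you use GitHub Actions, wiring this into CI takes only a few lines. A minimal sketch (the job name and checkout step are illustrative; adapt to your CI system):

```yaml
audit:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: cargo install cargo-audit
    - run: cargo audit
```

Failing the build on a new advisory is exactly the behavior you want: it forces the conversation about upgrading before the vulnerable code ships.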

Security checklist

Here’s a condensed reference of the security practices we’ve covered throughout this book. I’d recommend going through this list before you deploy anything to production:

  • All SQL uses parameterized queries (no string interpolation)
  • Request body size is limited via RequestBodyLimitLayer
  • Request timeouts are configured via TimeoutLayer
  • Input is validated at the API boundary
  • Domain types enforce invariants through the type system
  • Passwords are hashed with Argon2
  • JWT secrets are at least 48 random characters
  • JWT secrets are stored in SecretString and never logged
  • Token expiration times are set and enforced
  • CORS is restricted to specific origins in production
  • CSRF protection is enabled if using cookie-based authentication
  • Rate limiting is applied to all public endpoints
  • Stricter rate limits are applied to authentication endpoints
  • CSP, HSTS, X-Content-Type-Options, and Referrer-Policy headers are set
  • X-XSS-Protection is not set (use CSP instead)
  • CORS is driven by runtime configuration, not build profile
  • Internal error details are never exposed in API responses
  • JWT validation specifies algorithm, issuer, and audience explicitly
  • UUIDs are used for resource IDs (makes guessing harder, but authorization does the real work)
  • cargo audit is run regularly or in CI
  • cargo sqlx prepare --check is run in CI
  • .env files are in .gitignore and never committed

Testing

If you’ve ever shipped a web service and then spent the next week anxiously watching logs, you know that feeling. Testing is how we sleep at night. But testing a web application isn’t just one thing. We need unit tests that verify our business logic in isolation, integration tests that exercise the full request-to-response pipeline, and a solid strategy for managing test databases and external dependencies. In this chapter, I’ll walk through the patterns and tools that make testing an Axum application practical and, honestly, kind of enjoyable.

The testing pyramid

You’ve probably heard of the testing pyramid before. It’s a simple idea, but it holds up well in practice. At the base, we write lots of fast unit tests that verify individual functions and domain logic. In the middle, we have integration tests that exercise the full HTTP handler pipeline, including middleware, extractors, serialization, and database access. At the top, we keep a smaller number of end-to-end tests that verify critical user flows against a running server.

In my experience, most of your testing effort should go into the bottom two tiers. Unit tests catch logic bugs quickly and cheaply. Integration tests verify that all the layers are actually wired together correctly. End-to-end tests are useful for critical paths, but they’re slower and more brittle, so I’d use them sparingly.

Unit testing domain logic

This is where our earlier investment in clean architecture really pays off. Because the domain layer has no dependencies on Axum, SQLx, or any other framework, testing it is delightfully simple. We just construct domain types, call service methods, and assert on the results. No HTTP server, no database, no fuss.

If your services use trait-based repositories (as we set up in the Architectural Patterns chapter), you can provide in-memory implementations for testing:

use std::sync::Arc;

use chrono::Utc;
use tokio::sync::Mutex;

// Domain types (User, UserId, Email, CreateUserRequest, UserService, ...)
// are the ones we built up in earlier chapters.

#[derive(Clone)]
struct InMemoryUserRepo {
    users: Arc<Mutex<Vec<User>>>,
}

impl InMemoryUserRepo {
    fn new() -> Self {
        Self {
            users: Arc::new(Mutex::new(Vec::new())),
        }
    }
}

impl UserRepository for InMemoryUserRepo {
    async fn create(&self, req: &CreateUserRequest) -> Result<User, CreateUserError> {
        let mut users = self.users.lock().await;

        // Check for duplicates
        if users.iter().any(|u| u.email() == &req.email) {
            return Err(CreateUserError::Duplicate { email: req.email.clone() });
        }

        let user = User::hydrate(
            UserId::new(),
            req.name.clone(),
            req.email.clone(),
            Utc::now(),
            Utc::now(),
        );

        users.push(user.clone());
        Ok(user)
    }

    async fn find_by_id(&self, id: UserId) -> anyhow::Result<Option<User>> {
        let users = self.users.lock().await;
        Ok(users.iter().find(|u| u.id() == &id).cloned())
    }

    // ... other methods
}

#[tokio::test]
async fn registering_a_user_returns_the_user() {
    let repo = InMemoryUserRepo::new();
    let service = UserService::new(repo);

    let name = UserName::parse("Alice").unwrap();
    let email = Email::parse("alice@example.com").unwrap();

    let user = service.register(name, email, "password123").await.unwrap();

    assert_eq!(user.name().as_str(), "Alice");
    assert_eq!(user.email().as_str(), "alice@example.com");
}

#[tokio::test]
async fn registering_duplicate_email_fails() {
    let repo = InMemoryUserRepo::new();
    let service = UserService::new(repo);

    let name = UserName::parse("Alice").unwrap();
    let email = Email::parse("alice@example.com").unwrap();

    service.register(name.clone(), email.clone(), "password123").await.unwrap();

    let result = service.register(name, email, "password456").await;
    assert!(matches!(result, Err(CreateUserError::Duplicate { .. })));
}

These tests run in milliseconds because they don’t touch the network or the filesystem. They verify that your business logic is correct, independent of how it’s invoked or where the data lives. That speed matters. When tests are fast, you actually run them.

If writing those in-memory implementations feels like a lot of boilerplate (and it can be), the mockall crate can generate mock implementations of your repository traits automatically. I tend to prefer hand-written fakes for core domain tests because they’re easier to reason about, but mockall is a perfectly valid choice, especially as your trait surface grows.

Integration testing with oneshot

Here’s where things get really nice. Axum’s integration with Tower means we can test our entire HTTP pipeline without ever starting a real server. The tower::ServiceExt::oneshot method sends a single request through the router and returns the response. It exercises all your middleware, extractors, serialization, and error handling, just like a real request would.

use axum::{body::Body, Router};
use http::{Request, StatusCode};
use http_body_util::BodyExt;
use tower::ServiceExt;

/// Build the application the same way production does, but with a test database.
async fn test_app() -> Router {
    let pool = create_test_pool().await;
    create_app_with_pool(pool).await
}

#[tokio::test]
async fn health_check_returns_200() {
    let app = test_app().await;

    let response = app
        .oneshot(
            Request::builder()
                .uri("/health")
                .body(Body::empty())
                .unwrap(),
        )
        .await
        .unwrap();

    assert_eq!(response.status(), StatusCode::OK);
}

#[tokio::test]
async fn creating_a_user_returns_201() {
    let app = test_app().await;

    let response = app
        .oneshot(
            Request::builder()
                .method("POST")
                .uri("/api/v1/users")
                .header("Content-Type", "application/json")
                .body(Body::from(serde_json::to_string(&serde_json::json!({
                    "name": "Alice",
                    "email": "alice@example.com",
                    "password": "securepassword"
                })).unwrap()))
                .unwrap(),
        )
        .await
        .unwrap();

    assert_eq!(response.status(), StatusCode::CREATED);

    let body = response.into_body().collect().await.unwrap().to_bytes();
    let user: serde_json::Value = serde_json::from_slice(&body).unwrap();
    assert_eq!(user["data"]["name"], "Alice");
    assert_eq!(user["data"]["email"], "alice@example.com");
}

#[tokio::test]
async fn creating_a_user_with_invalid_email_returns_400() {
    let app = test_app().await;

    let response = app
        .oneshot(
            Request::builder()
                .method("POST")
                .uri("/api/v1/users")
                .header("Content-Type", "application/json")
                .body(Body::from(r#"{"name":"Alice","email":"not-an-email","password":"securepassword"}"#))
                .unwrap(),
        )
        .await
        .unwrap();

    assert_eq!(response.status(), StatusCode::BAD_REQUEST);
}

The important thing here is that test_app() builds the router using the same create_app function (or something very close to it) that production uses. That means our integration tests exercise the same middleware stack, the same extractors, and the same error handling as production. If something breaks in how those pieces fit together, we’ll catch it here.

Now, let’s be honest about what these in-process tests don’t cover. Because oneshot sends a request directly through the router without going through real sockets, we’re not testing TLS termination, reverse proxy behavior, load balancer health checks, or anything else about your deployment topology. These tests are excellent for verifying application logic, but they’re not a substitute for a smoke test against an actual running server in staging.

Test database management

Integration tests that hit a real database need a strategy for isolation. You really don’t want one test’s leftover data interfering with another test’s expectations. I’ve been bitten by this more times than I’d like to admit.

Per-test databases with SQLx. This is my go-to approach. SQLx provides a #[sqlx::test] attribute that creates a fresh, uniquely-named database for each test and runs your migrations against it. When the test finishes, the database is dropped. Perfect isolation, minimal effort:

#[sqlx::test(migrations = "./migrations")]
async fn test_user_creation(pool: PgPool) {
    let repo = PostgresUserRepo::new(pool);
    let req = CreateUserRequest { /* ... */ };

    let user = repo.create(&req).await.unwrap();
    assert_eq!(user.email().as_str(), "alice@example.com");
}

Shared test database with cleanup. An alternative is to use a single test database and clean up between tests, either by truncating tables or by wrapping each test in a transaction that you roll back. This is faster than creating and dropping databases, but it requires more careful management. You might wonder if the speed tradeoff is worth the complexity. In my experience, it usually isn’t for most teams, but if your test suite takes minutes to run, it’s worth considering.
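A sketch of the transaction-per-test variant, assuming the create_test_pool helper from the integration tests above and SQLx’s Transaction API (the table and columns are illustrative):

```rust
#[tokio::test]
async fn inserting_a_user_is_rolled_back() {
    let pool = create_test_pool().await; // shared test database
    let mut tx = pool.begin().await.unwrap();

    sqlx::query("INSERT INTO users (name, email) VALUES ($1, $2)")
        .bind("Alice")
        .bind("alice@example.com")
        .execute(&mut *tx)
        .await
        .unwrap();

    // ... assertions that run their queries through `tx` ...

    // Explicit rollback (dropping `tx` uncommitted has the same effect),
    // so the shared database is untouched for the next test.
    tx.rollback().await.unwrap();
}
```

The catch is that code under test must accept the transaction as its executor; anything that grabs its own connection from the pool won’t see the uncommitted data.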

In-memory SQLite for fast tests. If your SQL is compatible with SQLite (and it often is for simple CRUD operations), you can use SQLite in-memory databases for extremely fast test execution. This is particularly useful for repository-level tests where you want to verify query logic without the overhead of PostgreSQL.
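A sketch of the setup, assuming SQLx with the sqlite feature enabled. One detail matters: each SQLite connection gets its own in-memory database, so the pool must be pinned to a single connection.

```rust
use sqlx::sqlite::SqlitePoolOptions;

async fn sqlite_test_pool() -> sqlx::SqlitePool {
    let pool = SqlitePoolOptions::new()
        // Each connection would see a *different* in-memory database,
        // so restrict the pool to exactly one.
        .max_connections(1)
        .connect("sqlite::memory:")
        .await
        .expect("failed to open in-memory SQLite");

    // Reuse the production migrations, assuming the SQL is SQLite-compatible.
    sqlx::migrate!("./migrations")
        .run(&pool)
        .await
        .expect("migrations failed");

    pool
}
```

If your migrations lean on Postgres-specific features (JSONB, array types, gen_random_uuid()), this approach stops being viable, and per-test Postgres databases are the better trade.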

Test helpers

As your test suite grows, you’ll start noticing the same setup code showing up everywhere. Building requests, creating test users, parsing response bodies. It might seem tedious at first, but taking the time to extract shared helpers makes a real difference. Here’s a pattern I’ve found works well:

// tests/common/mod.rs

use axum::{response::Response, Router};
use http_body_util::BodyExt;
use sqlx::PgPool;
use uuid::Uuid;

pub async fn test_app() -> Router {
    let pool = PgPool::connect(&test_database_url()).await.unwrap();
    sqlx::migrate!("./migrations").run(&pool).await.unwrap();
    create_app_with_pool(pool).await
}

pub fn test_database_url() -> String {
    std::env::var("TEST_DATABASE_URL")
        .unwrap_or_else(|_| "postgres://localhost/myapp_test".to_string())
}

pub async fn create_test_user(app: &Router) -> (Uuid, String) {
    // Helper that creates a user and returns (user_id, auth_token)
    // Reduces boilerplate in tests that need an authenticated user
    // ...
}

pub async fn response_json(response: Response) -> serde_json::Value {
    let body = response.into_body().collect().await.unwrap().to_bytes();
    serde_json::from_slice(&body).unwrap()
}

What to test

So what should you actually spend your time testing? Let’s focus on what matters most.

Test the happy path for every endpoint. Verify that valid input produces the expected response with the correct status code and body shape. This is the baseline.

Test validation failures. Send invalid input and verify that the API returns appropriate error responses. This catches regressions where validation rules are accidentally removed or weakened, which happens more often than you’d think.

Test authorization. Verify that unauthenticated requests are rejected, that users can’t access other users’ data, and that role restrictions are enforced. Security bugs are the ones that keep you up at night.

Test error mapping. Verify that domain errors (like duplicate email) produce the right HTTP status codes and error messages, not generic 500 responses. We put a lot of work into our error types in the earlier chapters, so let’s make sure they actually reach the client correctly.

Test edge cases in domain logic. Empty strings, very long strings, boundary values, unusual input combinations. Bugs love to hide here. These are best tested at the unit level, where the tests are fast and focused.

One last thing: don’t obsess over code coverage as a metric. A test suite that thoughtfully covers the important behaviors and edge cases is far more valuable than one that chases 90% coverage by testing trivial getters and setters. What I’ve found is that coverage numbers make you feel good, but well-chosen tests are what actually catch bugs before they ship. With our testing strategy in place, let’s move on to getting our application deployed and running in production.

Performance

One of the nice things about choosing Rust and Axum is that you get a really solid performance baseline without doing anything special. Zero-cost abstractions, no garbage collector, and Tokio’s async runtime working together means our straightforward Axum application can handle thousands of concurrent connections on modest hardware. That said, it’s still possible to leave performance on the table if we’re not thoughtful about how we use the tools.

In this chapter, we’ll walk through the practices that keep our application fast, and the pitfalls I’ve seen trip people up.

Do not block the async runtime

If there’s one performance rule you take away from this entire book, let it be this one. Tokio’s runtime uses a thread pool to execute async tasks cooperatively. If one task blocks a thread (doing CPU-intensive work, synchronous I/O, or calling a blocking library function), that thread isn’t available for other tasks until the blocking work finishes. Since Tokio defaults to one worker thread per CPU core, blocking even a single thread can visibly hurt throughput.

The tricky part is that the symptoms are subtle. Our application handles moderate load just fine, but under higher concurrency we start seeing latency spikes and throughput drops. What’s actually happening is that async tasks are queued up, waiting for a thread that’s stuck doing something synchronous.

The fix is to move blocking work onto a dedicated thread pool:

// CPU-intensive work (password hashing, image processing, etc.)
let hash = tokio::task::spawn_blocking(move || {
    hash_password(&password)
})
.await
.context("password hashing task panicked")??;

spawn_blocking runs the closure on Tokio’s blocking thread pool, which is separate from the async worker threads. While it waits for the result, our async task yields, freeing the worker thread to handle other requests.

One thing to keep in mind: spawn_blocking is an escape hatch, not a general compute scheduler. Tokio’s blocking pool defaults to a maximum of 512 threads. If we spawn a blocking task per incoming request during a traffic spike, we can saturate the pool and every subsequent spawn_blocking call just queues up. For CPU-intensive work at high concurrency, I’d recommend bounding the number of concurrent blocking tasks with a tokio::sync::Semaphore, or moving the work to a dedicated thread pool like rayon. Also worth noting that once a blocking task starts, it can’t be aborted. If our application is shutting down, it’ll wait for all blocking tasks to complete before exiting.

Common sources of accidental blocking include:

  • Password hashing with Argon2 or bcrypt (CPU-intensive by design)
  • Synchronous file I/O (use tokio::fs instead of std::fs)
  • DNS resolution in some HTTP clients (use async-native clients like reqwest)
  • Calling .lock() on a std::sync::Mutex (use tokio::sync::Mutex if the lock is held across await points)

Connection pooling

Creating a new database connection for each request is expensive. We’re talking TCP handshake, TLS negotiation, and authentication, which can easily add up to tens of milliseconds. Connection pooling amortizes that cost by maintaining a set of pre-established connections that our requests borrow and return.

SQLx’s PgPool handles this transparently for us. The important thing is to tune the pool size appropriately:

PgPoolOptions::new()
    .max_connections(10)
    .min_connections(2)
    .acquire_timeout(Duration::from_secs(3))

If the pool is too small, requests will queue up waiting for a connection. Too large, and we waste memory on the database server while risking PostgreSQL’s connection limit. A good starting point is 2 to 4 connections per CPU core on the application server, and then we adjust based on how much time our requests actually spend in the database versus doing other work.

The acquire_timeout is our safety net. If all connections are in use and a new request can’t acquire one within the timeout, it fails immediately rather than hanging indefinitely. That’s much better than having the request sit in a queue for 30 seconds before timing out at the HTTP level.

Response compression

Compressing response bodies with gzip or brotli can significantly reduce the amount of data we send over the network. This is especially true for JSON APIs, where the response body is highly compressible text.

use tower_http::compression::CompressionLayer;

let app = Router::new()
    .merge(routes)
    .layer(CompressionLayer::new())
    .with_state(state);

CompressionLayer automatically negotiates the compression algorithm with the client based on the Accept-Encoding header. For most JSON APIs, gzip compression reduces response sizes by 70 to 90 percent, which translates to noticeably faster responses for clients, especially on slower networks.

Efficient data handling

Avoid unnecessary allocations. In hot paths, prefer borrowing (&str) over owned types (String) where we can. Use iterators instead of collecting into intermediate vectors. These are small optimizations individually, but they compound when we’re handling a lot of traffic.

Use Arc for shared immutable data. Configuration, static assets, and other data that doesn’t change after startup should be wrapped in Arc. That way, cloning our application state is cheap (just incrementing a reference count) rather than expensive (deep-copying the data).
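A minimal sketch of the pattern, with a hypothetical Config type standing in for whatever your application shares:

```rust
use std::sync::Arc;

// Hypothetical shared configuration, loaded once at startup.
struct Config {
    base_url: String,
    max_upload_bytes: usize,
}

// Cloning AppState bumps a reference count; Config is never deep-copied.
#[derive(Clone)]
struct AppState {
    config: Arc<Config>,
}

fn main() {
    let state = AppState {
        config: Arc::new(Config {
            base_url: "https://api.example.com".into(),
            max_upload_bytes: 10 * 1024 * 1024,
        }),
    };

    let for_handler = state.clone(); // cheap: one atomic increment

    // Both clones point at the same allocation.
    assert!(Arc::ptr_eq(&state.config, &for_handler.config));
    assert_eq!(state.config.base_url, for_handler.config.base_url);
}
```

This is the same reason Axum asks your state type to implement Clone: the router clones it per request, and Arc makes that clone essentially free.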

Stream large responses. For endpoints that return large amounts of data, let’s stream the response instead of buffering the entire thing in memory. The axum-streams crate supports streaming JSON arrays, CSV files, and other formats.

Database query optimization

In my experience, the database is almost always the bottleneck in a web application. A few practices make a real difference here:

Add indexes for columns you filter on. If we’re querying users by email, we need to make sure the email column has an index. Without one, the database has to scan every row in the table for each query.
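For the users-by-email lookup above, that index is one line of migration SQL (table and index names here are illustrative):

```sql
-- Unique, since our domain also forbids duplicate emails; a plain
-- index works too if uniqueness is enforced elsewhere.
CREATE UNIQUE INDEX idx_users_email ON users (email);
```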

Use EXPLAIN ANALYZE to understand query plans. When a query is slow, run it with EXPLAIN ANALYZE in your database client to see how the database is actually executing it. Look for sequential scans on large tables, which usually point to a missing index.

Fetch only what you need. If a handler only needs the user’s name and email, don’t SELECT * and pull back every column. Selecting specific columns reduces the amount of data transferred and processed.

Use pagination for list endpoints. As we covered in the API Design chapter, always paginate list queries. An unbounded SELECT * FROM users on a table with a million rows will eventually bring our application to its knees.

Benchmarking and profiling

When we need to optimize, let’s measure first. Guessing at performance bottlenecks is unreliable, because the actual bottleneck is often not where you’d expect.

Criterion is the standard benchmarking framework for Rust. It gives us statistically rigorous benchmarks with confidence intervals, so we can tell whether a change actually improved performance or just happened to produce faster results on one run.

Flamegraph visualizes where our application spends its CPU time. I’ve found it really useful for identifying hot spots in production-like workloads.

DHAT profiles memory allocations, which helps us find code that allocates more than it needs to.

tokio-console is a diagnostic tool built specifically for async Rust applications. It shows us which tasks are running, which are blocked, and how the runtime is scheduling work. This is particularly useful for diagnosing the “blocked runtime” issues we talked about at the beginning of this chapter.

What not to optimize

It’s worth remembering that premature optimization is a real cost. Micro-optimizing a handler that takes 2 milliseconds when the database query inside it takes 50 milliseconds? That’s a waste of effort. Let’s focus our optimization work on the actual bottlenecks, which in a web application are almost always database queries, network I/O, and accidental blocking of the async runtime.

Write clear, idiomatic code first. Profile under realistic load. Then optimize the specific things that the profiler tells us are slow. In my experience, this approach reliably produces applications that are both fast and maintainable. And that’s what we’ll need when we look at how to keep our application observable in the next chapter.

Deployment

So we’ve built our web service, tested it, and it runs great on our laptop. Now what? Getting it into production involves a few things we need to get right: building a small container image, shutting down gracefully so we don’t drop requests that are still in flight, and giving our orchestrator health check endpoints so it knows what’s going on with our app. Let’s walk through each of these.

Docker and multi-stage builds

If you compile Rust for the default GNU Linux target (x86_64-unknown-linux-gnu), your binary is dynamically linked against glibc but doesn’t have any other runtime dependencies. You can also compile with the musl target (x86_64-unknown-linux-musl) for a fully static binary. Either way, our production container doesn’t need the Rust toolchain, source code, or any build dependencies beyond the C library. That’s great news, because it means we can use a multi-stage Docker build: compile in one stage, then copy just the binary into a tiny runtime image.

# Build stage
FROM rust:1-slim AS builder

WORKDIR /app

# Cache dependencies by copying just the manifest files first.
# For more robust dependency caching, consider the cargo-chef crate,
# which is purpose-built for this problem in Docker builds.
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release && rm -rf src

# Now copy the actual source and rebuild (only changed files recompile)
COPY . .
RUN touch src/main.rs && cargo build --release

# Runtime stage
FROM debian:bookworm-slim

RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Run as a non-root user
RUN useradd --create-home appuser
USER appuser

COPY --from=builder /app/target/release/my-app /usr/local/bin/my-app

EXPOSE 3000

CMD ["my-app"]

The dependency caching trick in the build stage is worth understanding. We copy only Cargo.toml and Cargo.lock first and run a build, which lets Docker cache the compiled dependencies as a layer. When we change our application code but not our dependencies (which is what happens most of the time), Docker reuses that cached layer and only recompiles our code. Without this, you’d be recompiling every dependency on every build, and Rust compile times being what they are, that gets painful fast.

You might wonder why we install ca-certificates in the runtime image. It’s because our application probably makes outbound HTTPS requests (to external APIs, payment providers, and so on) and needs the CA certificate bundle to validate TLS connections. If you skip this, you’ll get cryptic TLS errors at runtime, which is never fun to debug in production.

If you want to go even smaller, you can compile with x86_64-unknown-linux-musl to produce a fully statically linked binary, then use FROM scratch as your runtime base. This gives you images under 20 MB, which is impressive. The trade-off is that you have no shell or debugging tools in the container at all, which can make troubleshooting a lot harder when something goes wrong at 2 AM. In my experience, the Debian slim image is a good middle ground for most teams.

Graceful shutdown

When a container orchestrator like Kubernetes, ECS, or Docker Swarm decides to stop our application, it sends a SIGTERM signal and gives the process a grace period (typically 30 seconds) to finish what it’s doing before sending SIGKILL. If our application doesn’t handle SIGTERM, bad things happen: in-flight requests get dropped, database transactions are left in an indeterminate state, and clients see connection resets. You really don’t want that.

The good news is that Axum gives us a built-in mechanism for graceful shutdown:

use tokio::signal;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // ... setup code ...

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    tracing::info!("listening on {}", listener.local_addr()?);

    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal())
        .await?;

    tracing::info!("server shut down cleanly");
    Ok(())
}

async fn shutdown_signal() {
    let ctrl_c = async {
        signal::ctrl_c()
            .await
            .expect("failed to install Ctrl+C handler");
    };

    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("failed to install SIGTERM handler")
            .recv()
            .await;
    };

    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }

    tracing::info!("shutdown signal received, draining connections");
}

When the shutdown signal fires, Axum stops accepting new connections but keeps processing requests that are already in flight. Once all active requests finish (or the grace period expires), the server exits cleanly.

This matters a lot for database operations. If a request is in the middle of a transaction when SIGKILL arrives, the database will eventually roll back the transaction after its connection timeout, but the client sees an error. With graceful shutdown, the transaction completes normally and the client gets a proper response. It’s one of those things that might seem like a minor detail, but it makes a real difference in production.

Health checks

Health check endpoints let our container orchestrator or load balancer know whether the application is alive and ready to serve traffic. Most orchestrators distinguish between two types of probes, and it’s worth understanding the difference because getting this wrong can cause some really confusing behavior.

Liveness probes answer the question “is the process alive and not stuck?” If the liveness probe fails, the orchestrator restarts the container. This probe should be simple and fast. Here’s the important part: don’t check external dependencies in your liveness probe. If your database is temporarily unavailable, you don’t want the orchestrator to kill and restart your app, because that just adds more load to an already stressed database.

async fn health_live() -> StatusCode {
    StatusCode::OK
}

Readiness probes answer a different question: “is this instance ready to receive traffic?” If the readiness probe fails, the orchestrator stops routing traffic to this instance but doesn’t restart it. This is where we check that the database connection is healthy, that any required caches are reachable, and that the application has finished its startup initialization.

async fn health_ready(State(state): State<AppState>) -> StatusCode {
    let db_ok = sqlx::query("SELECT 1")
        .execute(&state.db)
        .await
        .is_ok();

    if db_ok {
        StatusCode::OK
    } else {
        StatusCode::SERVICE_UNAVAILABLE
    }
}

We mount these endpoints outside our versioned API routes:

let app = Router::new()
    .route("/health", get(health_live))
    .route("/health/ready", get(health_ready))
    .nest("/api/v1", api_routes())
    .with_state(state);

In a Kubernetes deployment, you’d configure the probes in your pod spec like this:

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10

Environment-based configuration

As we covered in the Configuration chapter, our application reads all its configuration from environment variables. This means the same Docker image can be deployed to development, staging, and production. The only thing that changes between environments is the set of environment variables the deployment platform provides.

Don’t bake environment-specific configuration into the Docker image. I’ve seen teams do this and it always leads to trouble. The image should be an immutable artifact that you build once and deploy everywhere. When you promote a tested image from staging to production, you want to know with confidence that the binary is identical.

Running migrations

Our database migrations should run as part of the application startup, before the server begins accepting traffic. We handle this with the sqlx::migrate! macro as described in the Database Layer chapter.

In a Kubernetes environment, you could alternatively run migrations as an init container or a pre-deployment job. That approach is more explicit and lets you separate migration failures from application startup failures. But for most applications, what I’ve found is that running migrations at startup is simpler and works well, especially when you combine it with the readiness probe. The readiness probe ensures our application doesn’t receive traffic until migrations are complete and the database is accessible, which gives us a nice safety net.

What about Shuttle?

Shuttle is a deployment platform built specifically for Rust. It handles infrastructure provisioning, database setup, and deployment with minimal configuration. If you don’t need fine-grained control over your deployment infrastructure, it can take a lot of the operational burden off your shoulders.

That said, I think understanding how to deploy with Docker and Kubernetes is valuable regardless of which platform you end up using. The underlying concepts we covered here (graceful shutdown, health checks, environment-based configuration, immutable images) apply everywhere. Even if you use Shuttle or a similar platform, knowing what’s happening under the hood will help you debug issues when they come up.

Async Pitfalls and Cancellation Safety

If you’ve written async Rust for any length of time, you’ve probably hit a bug that made no sense at first. Data disappeared. A stream got corrupted. Everything looked correct, but something silently dropped work on the floor. In my experience, the culprit is almost always cancellation: when a future gets dropped before it finishes, the work it was doing just stops, and your program can end up in a state you never planned for.

We’ll walk through the patterns that trip up even experienced Rust developers, focusing on tokio::select!, cancellation safety, and the structured concurrency primitives that help you manage concurrent work without losing your mind.

How tokio::select! works

tokio::select! is our main tool for waiting on multiple async operations at once. It polls all of its branches, and when the first one completes, it drops the rest and runs the matching handler.

#![allow(unused)]
fn main() {
tokio::select! {
    msg = rx.recv() => {
        // A message arrived from the channel (None means it closed).
        if let Some(msg) = msg {
            handle_message(msg).await;
        }
    }
    _ = shutdown.cancelled() => {
        // A shutdown signal was received.
        tracing::info!("shutting down");
        return;
    }
}
}

The important thing here is what happens to the branch that loses. When shutdown.cancelled() completes first, the rx.recv() future gets dropped. For recv(), that’s totally fine: the channel still holds whatever messages were in its buffer, and the next call to recv() picks up where we left off. That property is what we call cancellation safety.

What “cancellation safe” means

A future is cancellation safe if dropping it partway through doesn’t lose data or leave shared state in a broken condition. The Tokio docs explicitly mark each method with whether it’s cancellation safe, and I’d strongly recommend checking before you put any operation inside a select! branch.

Operations that are cancellation safe include channel.recv(), TcpListener::accept(), sleep(), and CancellationToken::cancelled(). Dropping these just means “I stopped waiting,” and nothing is lost.

Operations that are not cancellation safe include read_exact(), write_all(), and read_to_string(). Think about what happens if write_all has written 50 bytes of a 100-byte message when select! drops it. Those 50 bytes are already on the wire, and you have no way to know how many were sent. The next attempt to write starts the full message again, corrupting the stream.

Let’s look at a concrete example of this problem:

#![allow(unused)]
fn main() {
// BROKEN: read_exact is not cancel-safe inside select!
let mut buf = vec![0u8; 1024];
loop {
    tokio::select! {
        // If this branch loses, we lose track of how many bytes were
        // already read into buf. The next iteration restarts the read
        // from the beginning, and those bytes are gone from the stream.
        result = socket.read_exact(&mut buf) => {
            process(&buf).await;
        }
        _ = shutdown.cancelled() => {
            return;
        }
    }
}
}

The fix is to move the non-cancel-safe operation out of select! entirely, so it always runs to completion. We keep only cancel-safe operations (like channel receives) inside our select! branches:

#![allow(unused)]
fn main() {
loop {
    tokio::select! {
        result = rx.recv() => {
            // Only cancel-safe operations in select! branches.
            if let Some(data) = result {
                process(&data).await;
            }
        }
        _ = shutdown.cancelled() => {
            return;
        }
    }
}
}

The actor model as a cancellation-resilient pattern

What I’ve found to be the most reliable way to avoid cancellation bugs is to structure your concurrent components as actors. Each actor runs a loop that receives a message, processes it to completion, and then goes back to waiting for the next one. Because select! only runs between loop iterations (when the actor is waiting for input, not in the middle of doing work), there’s nothing to cancel partway through.

#![allow(unused)]
fn main() {
async fn run_worker(
    mut commands: mpsc::Receiver<Command>,
    shutdown: CancellationToken,
) {
    loop {
        let command = tokio::select! {
            cmd = commands.recv() => {
                match cmd {
                    Some(c) => c,
                    None => return, // Channel closed, all senders dropped.
                }
            }
            _ = shutdown.cancelled() => {
                tracing::info!("worker shutting down");
                return;
            }
        };

        // This runs to completion before the next select! iteration.
        // No cancellation risk here.
        process_command(command).await;
    }
}
}

This is the pattern we used extensively in topos-protocol/topos, where each subsystem (broadcast, synchronizer, API) runs an event loop that multiplexes commands, shutdown signals, and timers through select!, but processes each event to completion before going back to the top of the loop. It might seem like extra structure at first, but it pays for itself immediately in bugs you never have to chase.

Structured concurrency with CancellationToken

Once your application has multiple concurrent subsystems (an HTTP server, a background worker, a metrics reporter), you need a way to tell all of them to shut down and then wait for them to actually finish. This is where tokio_util::sync::CancellationToken comes in.

#![allow(unused)]
fn main() {
use tokio_util::sync::CancellationToken;

let root_token = CancellationToken::new();

// Each subsystem gets a child token. Cancelling the root
// cancels all children, but cancelling a child does not
// affect siblings or the parent.
let http_token = root_token.child_token();
let worker_token = root_token.child_token();
let metrics_token = root_token.child_token();

// Spawn subsystems with their tokens.
let http_handle = tokio::spawn(run_http_server(http_token));
let worker_handle = tokio::spawn(run_background_worker(worker_token));
let metrics_handle = tokio::spawn(run_metrics_reporter(metrics_token));

// When a shutdown signal arrives, cancel the root.
wait_for_signal().await;
root_token.cancel();

// Wait for all subsystems to finish their cleanup.
let _ = tokio::join!(http_handle, worker_handle, metrics_handle);
tracing::info!("all subsystems shut down cleanly");
}

The parent-child relationship ensures that shutdown propagates downward. Each subsystem checks token.cancelled() in its event loop (via select!) and starts its cleanup when it fires. The parent waits for all handles to resolve, so we know no work gets abandoned.

We’ll dig into this pattern much more in the Graceful Shutdown chapter, where we’ll build a full shutdown sequence for a real application.

Common mistakes

Using select! when you mean join!. If you want to run two operations concurrently and wait for both, use tokio::join! or tokio::try_join!. select! returns as soon as the first branch completes and drops the rest, which is why the “second operation” mysteriously never finishes. I’ve seen this mix-up cause lost work more times than I can count.

Forgetting that select! in a loop needs all branches to be cancel-safe. Each time the loop iterates, the futures from the previous iteration are dropped. If one of those futures wasn’t cancel-safe, data can silently disappear between iterations. The actor pattern we looked at above avoids this by keeping all select! branches limited to cancel-safe operations (channel receives, timer ticks, cancellation signals).

Holding a MutexGuard across an await point. With std::sync::Mutex, holding the guard across an .await can block the entire runtime thread. With tokio::sync::Mutex, it’s safe but can lead to long hold times if the future inside the critical section takes a while. In both cases, I’d recommend the actor/channel pattern instead: send a message to a task that owns the state, rather than locking shared state from multiple tasks.

Not handling the else branch in select!. A branch with a pattern like Some(msg) = rx.recv() is disabled when the pattern fails to match, so if every channel is closed and every recv() yields None, all branches end up disabled. With no else branch, select! panics at that point. In production code, either handle the None case inside each branch or make sure at least one branch (like a cancellation token) can never be disabled.

tokio::pin! and when you need it

Some futures need to be pinned before you can use them in select!. Passing a future by value is fine, because the macro takes ownership and pins it internally. The problem comes up when you store a future in a variable and poll it by mutable reference across loop iterations: &mut F is only a Future when F: Unpin, so a future that isn’t Unpin (like sleep) has to be pinned first:

#![allow(unused)]
fn main() {
let sleep_future = tokio::time::sleep(Duration::from_secs(30));
tokio::pin!(sleep_future);

loop {
    tokio::select! {
        _ = &mut sleep_future => {
            tracing::info!("timeout reached");
            break;
        }
        msg = rx.recv() => {
            // Process message. The sleep future continues
            // from where it left off on the next iteration.
        }
    }
}
}

Without tokio::pin!, the compiler will reject this with an error about Unpin bounds. If you haven’t seen that error yet, don’t worry; you will soon. Pinning guarantees the future’s memory location stays stable, which is what makes it sound to poll it repeatedly through a mutable reference. It’s one of those things that feels confusing the first time but becomes second nature once you’ve seen the pattern a few times.

Message Passing and Channel Patterns

At some point, your async tasks need to talk to each other. When that moment arrives, you basically have two options: shared state protected by a lock, or message passing through channels. Both have their place. But in my experience, channels are the better default for most web application work because they naturally enforce sequential processing, make ownership transfer explicit, and play nicely with tokio::select!.

In this chapter, we’ll walk through the channel primitives Tokio gives us, the patterns they enable, and how to combine them into an actor/service-worker model that I’ve found underpins most well-structured concurrent Rust applications.

The three channel types

Tokio gives us three channel types, and each one is designed for a different communication shape. Let’s look at what they do and when you’d reach for each one.

mpsc: many producers, single consumer

mpsc is the workhorse. Multiple tasks can send messages into it, and one task receives and processes them in order. If you’re building command queues, work dispatching, or really any coordination pattern, this is probably where you’ll start.

#![allow(unused)]
fn main() {
use tokio::sync::mpsc;

let (tx, mut rx) = mpsc::channel::<Command>(32); // bounded, capacity 32

// Senders can be cloned and shared across tasks.
let tx2 = tx.clone();

// The receiver processes commands one at a time.
while let Some(cmd) = rx.recv().await {
    process(cmd).await;
}
}

One thing that really matters in production is whether you pick a bounded or unbounded channel. A bounded channel (created with mpsc::channel(capacity)) gives you natural backpressure: when the channel is full, send().await blocks the sender until the receiver catches up. This prevents a fast producer from overwhelming a slow consumer with an ever-growing queue. An unbounded channel (created with mpsc::unbounded_channel()) never blocks the sender, but it can eat unbounded memory if the receiver falls behind. I’d recommend bounded channels by default, and only reaching for unbounded when you have a specific reason.

oneshot: single-shot request-response

You’ll reach for oneshot when you need exactly one response to a request. The pattern looks like this: you create a oneshot channel, send the Sender half alongside your command, and then await the Receiver to get the result back.

#![allow(unused)]
fn main() {
use tokio::sync::oneshot;

enum Command {
    Get {
        key: String,
        reply: oneshot::Sender<Option<String>>,
    },
    Set {
        key: String,
        value: String,
    },
}

// Caller side: send a command and wait for the response.
async fn get_value(tx: &mpsc::Sender<Command>, key: String) -> Option<String> {
    let (reply_tx, reply_rx) = oneshot::channel();
    tx.send(Command::Get { key, reply: reply_tx }).await.ok()?;
    reply_rx.await.ok()?
}
}

This is how you build request-response communication over an mpsc command channel. The caller creates a oneshot pair, bundles the Sender into the command, and awaits the Receiver. On the other end, the processing task receives the command, does its work, and sends the result back through the oneshot. It might seem like a lot of ceremony at first, but you’ll find it becomes second nature quickly.

broadcast: one-to-many fan-out

broadcast channels let one sender deliver every message to all active receivers. Each receiver gets its own copy of every message sent after it subscribed.

#![allow(unused)]
fn main() {
use tokio::sync::broadcast;

let (tx, _rx) = broadcast::channel::<Event>(100);

// Each subscriber gets a receiver by calling subscribe().
let mut rx1 = tx.subscribe();
let mut rx2 = tx.subscribe();

// Both rx1 and rx2 will receive this event.
tx.send(Event::CertificateReady { id: cert_id }).unwrap();
}

Broadcast channels are great for distributing events, invalidation signals, or configuration changes to multiple consumers. One thing to watch out for: if a receiver falls behind, it’ll see a RecvError::Lagged(n) error telling you how many messages it missed. You’ll want to make sure your consumers handle that case gracefully.

The actor/service-worker pattern

Now we get to the pattern I’m most excited about. The actor (or service-worker) model is the most important thing channels enable, and once you see it, you’ll start using it everywhere. The idea is simple: an actor is a task that owns some state, receives commands through a channel, and processes them one at a time. External code interacts with the actor through a client struct that wraps the channel sender and gives you a nice typed async API.

Let’s look at a complete example of a simple cache actor:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use tokio::sync::{mpsc, oneshot};
use tokio_util::sync::CancellationToken;

// The commands the actor understands.
enum CacheCommand {
    Get {
        key: String,
        reply: oneshot::Sender<Option<String>>,
    },
    Set {
        key: String,
        value: String,
    },
    Invalidate {
        key: String,
    },
}

// The client that external code uses. It hides the channel details.
#[derive(Clone)]
pub struct CacheClient {
    sender: mpsc::Sender<CacheCommand>,
}

impl CacheClient {
    pub async fn get(&self, key: &str) -> Option<String> {
        let (tx, rx) = oneshot::channel();
        self.sender
            .send(CacheCommand::Get {
                key: key.to_string(),
                reply: tx,
            })
            .await
            .ok()?;
        rx.await.ok()?
    }

    pub async fn set(&self, key: String, value: String) {
        let _ = self
            .sender
            .send(CacheCommand::Set { key, value })
            .await;
    }

    pub async fn invalidate(&self, key: &str) {
        let _ = self
            .sender
            .send(CacheCommand::Invalidate {
                key: key.to_string(),
            })
            .await;
    }
}

// Spawn the actor and return the client.
pub fn spawn_cache(shutdown: CancellationToken) -> CacheClient {
    let (tx, rx) = mpsc::channel(64);
    tokio::spawn(run_cache_actor(rx, shutdown));
    CacheClient { sender: tx }
}

async fn run_cache_actor(
    mut commands: mpsc::Receiver<CacheCommand>,
    shutdown: CancellationToken,
) {
    let mut store: HashMap<String, String> = HashMap::new();

    loop {
        let cmd = tokio::select! {
            cmd = commands.recv() => match cmd {
                Some(c) => c,
                None => return, // All clients dropped.
            },
            _ = shutdown.cancelled() => return,
        };

        match cmd {
            CacheCommand::Get { key, reply } => {
                let _ = reply.send(store.get(&key).cloned());
            }
            CacheCommand::Set { key, value } => {
                store.insert(key, value);
            }
            CacheCommand::Invalidate { key } => {
                store.remove(&key);
            }
        }
    }
}
}

So why is this pattern such a natural fit for production code? Let me walk through what makes it work so well.

No locking. The HashMap is owned exclusively by the actor task. No Mutex, no RwLock, no contention. The mpsc channel serializes access for us naturally.

Cancellation resilient. Each command is processed to completion before the next select! iteration. There’s no risk of dropping a future in the middle of a state mutation, which is a subtle bug that can haunt you with other approaches.

Testable. You can test the actor by creating a channel, sending commands, and asserting on the responses. No need to reason about concurrent timing, which is a huge win.

Composable. The CacheClient can live in your AppState and get extracted in Axum handlers just like any other shared dependency. It’s Clone and Send, which is all Axum needs.

When to use channels vs. shared state

You might wonder when to use channels versus just slapping a Mutex on your state. Here’s how I think about it.

Channels and the actor pattern are the right tool when:

  • The state needs sequential processing (commands must not interleave)
  • You want to encapsulate state behind an async API
  • The access pattern involves both reads and writes that need to be coordinated
  • You’re already using tokio::select! to multiplex multiple event sources

Shared state (Arc<RwLock<T>>, Arc<DashMap<K, V>>) is the right tool when:

  • The access pattern is overwhelmingly reads with rare writes
  • You need many tasks to read concurrently without serialization
  • The state is simple enough that a lock doesn’t introduce significant complexity
  • You don’t need to coordinate the state with other event sources in a select! loop

One more thing before we move on: for application configuration, a watch channel is also worth knowing about. It holds a single value that any number of receivers can observe, and receivers always see the most recent value. This is really useful for runtime-reloadable configuration or health status that many tasks need to check. We’ll see more of this kind of coordination in the chapters ahead.

Graceful Shutdown and Lifecycle Management

In the Deployment chapter, we looked at how to shut down a single Axum server cleanly when a SIGTERM arrives. That covers the simplest case, but production applications are rarely just an HTTP server. You’ve got background workers processing jobs, open WebSocket connections draining messages, metrics reporters flushing data, and database pools that need to close cleanly. If any of these subsystems is still doing work at the moment the process exits, you end up with lost data, broken connections, or inconsistent state.

In this chapter, we’ll coordinate shutdown across multiple concurrent subsystems so that each one finishes its in-flight work in the right order before the process exits.

CancellationToken as the coordination primitive

If you’ve tried to coordinate shutdown before, you might have reached for oneshot channels or Arc<AtomicBool> flags. Those work, but they get messy fast once you have more than two subsystems. The tokio_util::sync::CancellationToken is a much better fit here. It gives you a cancelled() future that resolves when the token is cancelled, and it supports parent-child relationships where cancelling a parent automatically cancels all children.

#![allow(unused)]
fn main() {
use tokio_util::sync::CancellationToken;

// Create a root token for the entire application.
let root_token = CancellationToken::new();

// Each subsystem gets a child token.
let http_token = root_token.child_token();
let worker_token = root_token.child_token();
let telemetry_token = root_token.child_token();
}

When a shutdown signal arrives, we cancel the root token, and every subsystem sees it through its child token. The token is cancellation-aware and integrates naturally with tokio::select!, which makes the code much easier to follow than the alternatives.

A multi-subsystem application

Let’s look at a realistic main function that starts three subsystems and coordinates their shutdown:

use tokio_util::sync::CancellationToken;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let config = Config::from_env()?;
    init_tracing(&config.log_level);

    let pool = create_pool(config.database_url.expose_secret()).await?;
    sqlx::migrate!("./migrations").run(&pool).await?;

    let root_token = CancellationToken::new();

    // Spawn the HTTP server.
    let http_token = root_token.child_token();
    let http_handle = tokio::spawn({
        let pool = pool.clone();
        let config = config.clone();
        async move {
            let app = build_router(config, pool);
            let listener = tokio::net::TcpListener::bind("0.0.0.0:3000")
                .await
                .expect("failed to bind");
            axum::serve(listener, app)
                .with_graceful_shutdown(http_token.cancelled_owned())
                .await
                .expect("server error");
        }
    });

    // Spawn a background job worker.
    let worker_token = root_token.child_token();
    let worker_handle = tokio::spawn({
        let pool = pool.clone();
        async move {
            run_job_worker(pool, worker_token).await;
        }
    });

    // Spawn a metrics reporter.
    let metrics_token = root_token.child_token();
    let metrics_handle = tokio::spawn(async move {
        run_metrics_reporter(metrics_token).await;
    });

    // Wait for the shutdown signal, then cancel everything.
    wait_for_signal().await;
    tracing::info!("shutdown signal received");
    root_token.cancel();

    // Wait for all subsystems to finish.
    let _ = tokio::join!(http_handle, worker_handle, metrics_handle);

    // Close the database pool after all subsystems have stopped.
    pool.close().await;

    tracing::info!("shutdown complete");
    Ok(())
}

The key thing to notice here is that the database pool is closed after all subsystems have finished. Why? Because those subsystems may still be executing queries during their drain phase. If you close the pool first, those queries will fail. This is what I mean by ordered teardown, and it’s worth looking at more closely.

Ordered teardown

The general rule is: release resources in the reverse order of their dependency chain. If our HTTP handlers depend on the database pool, and the metrics reporter depends on the HTTP server being active, our teardown order should be:

  1. Stop accepting new HTTP connections (Axum handles this when the shutdown future resolves)
  2. Drain in-flight HTTP requests to completion
  3. Stop the background worker (it finishes its current job, then exits)
  4. Flush the metrics reporter (it sends any buffered data, then exits)
  5. Close the database pool

In practice, steps 2 through 4 happen concurrently (via tokio::join!) because each subsystem manages its own drain independently. All we need to make sure of is that shared resources like the pool aren’t closed until every consumer has stopped.

Managing dynamic task pools with FuturesUnordered

Some subsystems don’t have a fixed number of tasks. You might have one task per active WebSocket connection, one per in-flight background job, one per connected gRPC stream. The count changes constantly. For these situations, FuturesUnordered is a great tool because it gives you a stream of task completions that you can drain during shutdown.

#![allow(unused)]
fn main() {
use futures::stream::FuturesUnordered;
use futures::StreamExt;

async fn run_connection_manager(
    mut new_connections: mpsc::Receiver<TcpStream>,
    shutdown: CancellationToken,
) {
    let mut active_tasks = FuturesUnordered::new();

    loop {
        tokio::select! {
            // Accept new connections while running.
            conn = new_connections.recv() => {
                if let Some(stream) = conn {
                    let token = shutdown.child_token();
                    active_tasks.push(tokio::spawn(
                        handle_connection(stream, token)
                    ));
                }
            }

            // Reap completed tasks to free resources.
            Some(result) = active_tasks.next() => {
                if let Err(e) = result {
                    tracing::error!(error = ?e, "connection task panicked");
                }
            }

            // On shutdown, stop accepting new connections
            // and drain the remaining tasks.
            _ = shutdown.cancelled() => {
                tracing::info!(
                    active = active_tasks.len(),
                    "shutting down, draining active connections"
                );
                break;
            }
        }
    }

    // Drain: wait for all active connections to finish.
    while let Some(result) = active_tasks.next().await {
        if let Err(e) = result {
            tracing::error!(error = ?e, "connection task panicked during drain");
        }
    }

    tracing::info!("all connections drained");
}
}

I want to highlight the two-phase structure here, because it’s a pattern you’ll use a lot. During normal operation, the select! loop accepts new connections, reaps completed ones, and watches for the shutdown signal. Once shutdown fires, the loop breaks and we enter the drain phase, which just waits for all remaining tasks to complete. Each connection task receives a child cancellation token so it can clean up its own resources (flushing buffers, sending close frames) before returning.

Connection cleanup

For long-lived connections like WebSockets, the cleanup sequence inside each connection task matters more than you might expect. The mozilla-services/autopush-rs project is a good example of doing this thoroughly: when a WebSocket connection shuts down, it disconnects from the client registry, drains any remaining server notifications through on_server_notif_shutdown(), calls client.shutdown() with the error details, and then closes the session with the appropriate WebSocket close reason. This reverse-order cleanup makes sure no notifications are lost between the shutdown decision and the actual connection close.

The principle I’d encourage you to follow: when a connection or task shuts down, finish all pending outbound work (flush write buffers, send remaining notifications, acknowledge pending messages) before closing the underlying transport. It might seem tedious to think through all these steps, but in my experience, the bugs you get from skipping cleanup are much harder to debug than the cleanup code itself.

Testing shutdown behavior

You might wonder how to test any of this. Shutdown logic is tricky because it involves timing, concurrency, and edge cases that only surface under specific orderings. What I’ve found works well is using tokio::time::timeout to make sure your shutdown sequence completes within a reasonable duration:

#![allow(unused)]
fn main() {
#[tokio::test]
async fn shutdown_completes_within_timeout() {
    let token = CancellationToken::new();
    let handle = tokio::spawn(run_my_subsystem(token.clone()));

    // Give it a moment to start up.
    tokio::time::sleep(Duration::from_millis(100)).await;

    // Trigger shutdown.
    token.cancel();

    // It should finish within 5 seconds.
    let result = tokio::time::timeout(
        Duration::from_secs(5),
        handle,
    ).await;

    assert!(result.is_ok(), "shutdown did not complete within timeout");
}
}

If this test hangs, you know your shutdown path has a bug. Something is waiting on a channel that will never send, or a task isn’t checking its cancellation token. It’s a simple test, but it catches real problems. And once you have it in place, you’ll be much more confident making changes to your shutdown logic later on.

Background Jobs and Task Management

If you’ve built any web application beyond a toy project, you’ve run into this: some work just doesn’t belong in the request-response cycle. Sending confirmation emails, processing uploaded files, generating reports, retrying failed webhook deliveries, cleaning up expired sessions. All of these would slow down our HTTP responses if we tried to do them inline. In this chapter, we’ll walk through the patterns for managing background work in a Tokio-based application, starting with simple fire-and-forget tasks and working our way up to persistent, retry-aware job queues.

In-process task spawning

The simplest way to run background work is tokio::spawn. It creates a new async task that runs independently of whatever spawned it. The spawning code doesn’t wait for the spawned task to finish unless you explicitly .await the returned JoinHandle.

#![allow(unused)]
fn main() {
async fn create_user(
    State(state): State<AppState>,
    ValidatedJson(payload): ValidatedJson<CreateUserDto>,
) -> AppResult<(StatusCode, Json<UserResponse>)> {
    let user = state.user_service.register(payload).await?;

    // Send the welcome email in the background.
    // The HTTP response does not wait for this to finish.
    let email_client = state.email_client.clone();
    let email = user.email().clone();
    tokio::spawn(async move {
        if let Err(e) = email_client.send_welcome(&email).await {
            tracing::error!(error = ?e, "failed to send welcome email");
        }
    });

    Ok((StatusCode::CREATED, Json(user.into())))
}
}

This works well for non-critical side effects where the occasional failure is fine (our user still gets created even if the email fails). But there are some important caveats you should know about.

If the spawned task panics, that panic gets captured in the JoinHandle. If you don’t await the handle, the panic is silently swallowed, which can be really confusing when you’re debugging in production. You should either await the handle or use a JoinSet that reports errors back to you.

Here’s the bigger problem, though: if your process restarts (deployment, crash, OOM kill), any in-progress spawned tasks are just gone. There’s no retry, no persistence, no guarantee the work was completed. For operations that must eventually succeed, you’ll need a persistent job queue, which we’ll get to later in this chapter.

JoinSet for managing groups of tasks

When you have a known set of tasks that all need to complete, tokio::task::JoinSet gives us a structured way to spawn them and collect their results:

#![allow(unused)]
fn main() {
use tokio::task::JoinSet;

async fn process_batch(items: Vec<Item>, pool: PgPool) -> anyhow::Result<()> {
    let mut set = JoinSet::new();

    for item in items {
        let pool = pool.clone();
        set.spawn(async move {
            process_item(&pool, &item).await
        });
    }

    // Wait for all tasks to complete.
    while let Some(result) = set.join_next().await {
        match result {
            Ok(Ok(())) => {}
            Ok(Err(e)) => tracing::error!(error = ?e, "item processing failed"),
            Err(e) => tracing::error!(error = ?e, "task panicked"),
        }
    }

    Ok(())
}
}

JoinSet is also handy for limiting concurrency. You can check set.len() before spawning new tasks and wait for one to complete if you’re at capacity. And when the JoinSet is dropped, all tasks in it are aborted, which makes it a natural fit for shutdown scenarios where you want to cancel whatever work is left.

Periodic tasks

A lot of applications need recurring background work: cache cleanup every 5 minutes, metrics reporting every 30 seconds, health pings every 10 seconds. You get the idea. Tokio’s interval is what we reach for here:

#![allow(unused)]
fn main() {
use tokio::time::{interval, Duration};

async fn run_cache_cleanup(
    pool: PgPool,
    shutdown: CancellationToken,
) {
    let mut ticker = interval(Duration::from_secs(300));

    loop {
        tokio::select! {
            _ = ticker.tick() => {
                if let Err(e) = cleanup_expired_sessions(&pool).await {
                    tracing::error!(error = ?e, "session cleanup failed");
                }
            }
            _ = shutdown.cancelled() => {
                tracing::info!("cache cleanup shutting down");
                return;
            }
        }
    }
}
}

One thing that tripped me up early on: interval has a MissedTickBehavior setting that controls what happens when your task takes longer than the interval period. The default is Burst, which fires missed ticks immediately to catch up. For most background maintenance tasks, MissedTickBehavior::Skip is what you actually want, because you’d rather skip the ticks that were missed while the previous run was still going than pile them all up at once.

#![allow(unused)]
fn main() {
let mut ticker = interval(Duration::from_secs(300));
ticker.set_missed_tick_behavior(tokio::time::MissedTickBehavior::Skip);
}

Integrating background workers with Axum

In practice, you’ll start your background workers alongside the HTTP server, sharing the database pool and configuration through the same AppState or by cloning the pool directly. The Graceful Shutdown chapter shows the full pattern for spawning multiple subsystems and coordinating their lifecycle.

Let me walk through the key integration points:

Shared state. Our background worker needs the same database pool, configuration, and possibly the same service instances as the HTTP handlers. Clone these from the same source at startup.

Coordinated shutdown. The worker needs a CancellationToken so it can stop gracefully when the application shuts down. You want the worker to finish its current job before exiting, rather than abandoning work mid-flight.

Error isolation. A panicking background task shouldn’t bring down the HTTP server. By running the worker in a separate tokio::spawn, its failures stay isolated. Log the error and, depending on your requirements, either restart the worker or let it stay down.

Persistent job queues

For work that must survive process restarts, you need a persistent queue backed by a database or a message broker. The good news is that several mature Rust crates already solve this problem for us:

apalis is a type-safe, extensible background processing library that supports multiple storage backends (PostgreSQL, Redis, SQLite) and includes built-in monitoring, metrics, and graceful shutdown. It works with both Tokio and other async runtimes.

fang runs each worker in a separate Tokio task with automatic restart on panic. It supports scheduled jobs via cron expressions, custom retry backoff modes, and unique job deduplication.

backie provides async persistent task processing backed by PostgreSQL, designed for horizontal scaling where workers don’t need to be in the same process as the queue producer.

The general pattern across all of these is the same, and once you see it, it’s pretty intuitive: your HTTP handler (or service) enqueues a job by inserting a row into a jobs table or publishing to a message broker. A separate worker process (or a background task in the same process) polls for pending jobs, processes them, and marks them as completed or failed. Failed jobs get retried with configurable backoff, and after a maximum number of retries they move to a dead-letter state where someone can inspect them manually.
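That lifecycle condenses into a small state machine. Here's an illustrative std-only sketch, not any particular crate's API; the names (`JobState`, `after_attempt`, `max_retries`) are invented for the example:

```rust
/// The lifecycle that every persistent job queue implements in some form.
#[derive(Debug, PartialEq, Clone)]
enum JobState {
    Pending,
    Running,
    Completed,
    /// Failed, but still eligible for retry.
    Failed { attempts: u32 },
    /// Exhausted retries; waiting for manual inspection.
    DeadLetter,
}

/// What the worker records after an attempt finishes.
/// `prev_attempts` is how many attempts had already failed before this one.
fn after_attempt(prev_attempts: u32, succeeded: bool, max_retries: u32) -> JobState {
    if succeeded {
        JobState::Completed
    } else if prev_attempts + 1 > max_retries {
        // This failure used up the last allowed retry.
        JobState::DeadLetter
    } else {
        JobState::Failed { attempts: prev_attempts + 1 }
    }
}
```

A real queue stores this state in the jobs table (or broker) so it survives restarts; the transition logic is the same either way.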

Error handling in background tasks

This is where things get interesting, because unlike HTTP handlers, background tasks have no client to return errors to. When a background task fails, you need a strategy for making that failure visible and actionable. Here’s what I’ve found works well:

Log the error with context. Use tracing::error! with structured fields so the failure shows up in your log aggregation system and can be correlated with the original request that enqueued the work. Without this, you’ll be flying blind when something goes wrong at 2 AM.

Retry with backoff. For transient failures (network timeouts, temporary database unavailability), retry the operation after an increasing delay. Most persistent job queue crates handle this automatically, so you don’t need to build it yourself.

Dead-letter after exhausting retries. If a job fails repeatedly, move it to a dead-letter table or queue where it can be inspected and either fixed or discarded manually. Don’t retry indefinitely, because a permanently failing job will consume worker capacity and delay everything else in the queue.

Emit metrics. Track job success rates, processing duration, queue depth, and retry counts. These metrics are your early warning system for problems that are building up before they become visible to users. In my experience, queue depth trending upward is almost always the first sign that something is going sideways.
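As a minimal illustration of the counters involved, here's a hypothetical `JobMetrics` type built on atomics. In production you'd wire these into your metrics exporter rather than hand-roll them; this just shows the shape:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Counters a worker bumps as jobs finish; a scraper turns these into rates.
#[derive(Default)]
struct JobMetrics {
    succeeded: AtomicU64,
    failed: AtomicU64,
    retried: AtomicU64,
}

impl JobMetrics {
    fn record_success(&self) { self.succeeded.fetch_add(1, Ordering::Relaxed); }
    fn record_failure(&self) { self.failed.fetch_add(1, Ordering::Relaxed); }
    fn record_retry(&self)   { self.retried.fetch_add(1, Ordering::Relaxed); }

    /// Fraction of finished jobs that failed (0.0 when nothing has run yet).
    fn failure_rate(&self) -> f64 {
        let ok = self.succeeded.load(Ordering::Relaxed) as f64;
        let err = self.failed.load(Ordering::Relaxed) as f64;
        if ok + err == 0.0 { 0.0 } else { err / (ok + err) }
    }
}
```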

gRPC with Tonic

If you’ve been building an Axum service following the patterns in earlier chapters, at some point you’ll run into a familiar question: how do we talk to other internal services? An HTTP/JSON API is great for browser clients and third-party consumers, but for service-to-service communication, you often want something with stronger schema guarantees, built-in streaming, and automatic code generation. That’s where gRPC comes in. Tonic is the go-to gRPC framework in the Rust ecosystem, and since it builds on Tokio just like Axum does, the two play nicely together.

In this chapter, we’ll walk through adding a gRPC surface to our existing Rust service, running it alongside our Axum HTTP server, and applying the same architectural principles we’ve been using all along (thin handlers, domain separation, centralized error handling) to the gRPC side of things.

When to use gRPC

gRPC is a natural fit when your consumers are other services that you control. Both sides benefit from the shared schema defined in .proto files, the protobuf wire format is more compact than JSON, and you get type safety across language boundaries for free. Streaming RPCs (server-streaming, client-streaming, bidirectional) are first-class concepts rather than something you bolt on after the fact.

Where gRPC is less of a fit is browser-based clients (though gRPC-Web bridges that gap), public APIs where curl-ability matters, or simple CRUD services where the protobuf machinery adds more overhead than it’s worth.

In my experience, most production systems end up using both: gRPC for internal communication between services, and HTTP/JSON for the public-facing API. That’s exactly the scenario we’ll tackle here.

Proto file organization

Protocol buffer definitions live in .proto files, typically in a proto/ directory at the root of your project or workspace:

my-service/
├── proto/
│   └── myservice/
│       └── v1/
│           ├── users.proto
│           └── health.proto
├── build.rs
├── Cargo.toml
└── src/
    └── ...

You might wonder why we version the proto package (e.g., myservice.v1) from the start. It’s actually even more important here than it is with REST, because proto schemas get compiled into client code. Once consumers depend on those generated types, changing the schema is a breaking change for every one of them. Versioning from day one saves you from painful migrations later.

A typical proto file looks like this:

syntax = "proto3";

package myservice.v1;

service UserService {
    rpc GetUser (GetUserRequest) returns (GetUserResponse);
    rpc CreateUser (CreateUserRequest) returns (CreateUserResponse);
    rpc ListUsers (ListUsersRequest) returns (stream UserResponse);
}

message GetUserRequest {
    string id = 1;
}

message GetUserResponse {
    string id = 1;
    string name = 2;
    string email = 3;
}

// ... other messages

Build-time code generation

Tonic uses a build script to compile .proto files into Rust code at build time. Let’s add the dependencies to our Cargo.toml:

[dependencies]
tonic = "0.14"
tonic-prost = "0.14"
prost = "0.14"

[build-dependencies]
tonic-prost-build = "0.14"

One thing that might trip you up: as of tonic 0.14, protobuf code generation uses tonic-prost-build, not tonic-build directly. The tonic-build crate still exists as the underlying codegen infrastructure, but tonic-prost-build is what you actually depend on for protobuf compilation. You’ll also need tonic-prost as a runtime dependency alongside tonic itself.

Now let’s create our build.rs:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    tonic_prost_build::configure()
        .build_server(true)
        .build_client(true)
        .compile_protos(
            &["proto/myservice/v1/users.proto"],
            &["proto/"],
        )?;
    Ok(())
}

The generated code ends up in your target/ directory, and you pull it in with tonic::include_proto!("myservice.v1"). If you need the generated types to play nicely with your domain types, you can configure tonic_prost_build to add custom derives (like Eq, Hash, or serde::Serialize).

Implementing a gRPC service

The generated code gives you a trait to implement for your service. Each RPC method becomes a trait method that receives a Request<T> and returns a Result<Response<U>, Status>. If you’ve been writing Axum handlers, this will feel very familiar.

#![allow(unused)]
fn main() {
use tonic::{Request, Response, Status};

// Include the generated code.
pub mod pb {
    tonic::include_proto!("myservice.v1");
}

use pb::user_service_server::{UserService, UserServiceServer};

pub struct MyUserService {
    // Same domain services and repositories you use in Axum handlers.
    inner: domain::services::UserService<PostgresUserRepo>,
}

impl UserService for MyUserService {
    async fn get_user(
        &self,
        request: Request<pb::GetUserRequest>,
    ) -> Result<Response<pb::GetUserResponse>, Status> {
        let req = request.into_inner();
        let id = Uuid::parse_str(&req.id)
            .map_err(|_| Status::invalid_argument("invalid user ID"))?;

        let user = self.inner
            .get_by_id(&UserId::from_uuid(id))
            .await
            .map_err(|e| {
                tracing::error!(error = ?e, "failed to fetch user");
                Status::internal("internal error")
            })?
            .ok_or_else(|| Status::not_found("user not found"))?;

        Ok(Response::new(pb::GetUserResponse {
            id: user.id().as_uuid().to_string(),
            name: user.name().as_str().to_string(),
            email: user.email().as_str().to_string(),
        }))
    }

    // ... other methods
}
}

Notice how this follows the exact same pattern as our Axum handlers: extract input, call the domain service, map the result to the transport format. The differences are small. Errors map to tonic::Status codes rather than HTTP status codes, and request/response types come from protobuf instead of JSON deserialization. But the shape of the code is the same, which is the whole point of our layered architecture.

Error handling in gRPC

tonic::Status is the gRPC equivalent of HTTP status codes, but it also carries an error message. The standard codes include NotFound, InvalidArgument, PermissionDenied, Internal, Unavailable, and others that map fairly directly to HTTP semantics.

If you already have an AppError enum for your Axum handlers (as described in the Error Handling chapter), you can write a conversion from AppError to Status and reuse all that work:

#![allow(unused)]
fn main() {
impl From<AppError> for Status {
    fn from(err: AppError) -> Self {
        match err {
            AppError::NotFound => Status::not_found("resource not found"),
            AppError::Validation(msg) => Status::invalid_argument(msg),
            AppError::Unauthorized => Status::unauthenticated("authentication required"),
            AppError::Forbidden => Status::permission_denied("insufficient permissions"),
            AppError::Conflict(msg) => Status::already_exists(msg),
            AppError::Internal(e) => {
                tracing::error!(error = ?e, "internal error in gRPC handler");
                Status::internal("internal error")
            }
            _ => Status::internal("internal error"),
        }
    }
}
}

With this in place, your gRPC handlers can use the same ? propagation pattern as your Axum handlers. Domain errors automatically map to the right gRPC status codes, and you don’t have to repeat mapping logic everywhere.
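To see that mechanism in isolation, here's a self-contained sketch with simplified stand-ins for AppError and tonic::Status (the real Status carries a tonic::Code rather than a string, and find_user/get_user_handler are invented for the example). The point is that one From impl lets ? do the mapping:

```rust
// Simplified stand-in for tonic::Status, just enough to show the conversion.
#[derive(Debug, PartialEq)]
struct Status {
    code: &'static str,
    message: String,
}

impl Status {
    fn not_found(msg: &str) -> Self { Self { code: "NOT_FOUND", message: msg.into() } }
    fn invalid_argument(msg: impl Into<String>) -> Self {
        Self { code: "INVALID_ARGUMENT", message: msg.into() }
    }
}

// Cut-down version of the application error enum.
#[derive(Debug)]
enum AppError {
    NotFound,
    Validation(String),
}

impl From<AppError> for Status {
    fn from(err: AppError) -> Self {
        match err {
            AppError::NotFound => Status::not_found("resource not found"),
            AppError::Validation(msg) => Status::invalid_argument(msg),
        }
    }
}

// A stand-in domain call that returns the application error type.
fn find_user(id: u32) -> Result<String, AppError> {
    if id == 0 { Err(AppError::NotFound) } else { Ok(format!("user-{id}")) }
}

// The "handler": because From<AppError> for Status exists, `?` converts
// the domain error into the transport error automatically.
fn get_user_handler(id: u32) -> Result<String, Status> {
    let user = find_user(id)?;
    Ok(user)
}
```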

Running gRPC alongside Axum

There are two common approaches for running both HTTP and gRPC in the same process. Let’s look at each.

Separate ports is the simpler option, and it’s what I’d recommend starting with. Our Axum server binds to port 3000, our Tonic server binds to port 50051. Each runs in its own tokio::spawn, and they share the same AppState, database pool, and domain services.

#![allow(unused)]
fn main() {
// Spawn the HTTP server.
let http_handle = tokio::spawn(async move {
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, http_router)
        .with_graceful_shutdown(http_token.cancelled_owned())
        .await
        .unwrap();
});

// Spawn the gRPC server.
let grpc_handle = tokio::spawn(async move {
    tonic::transport::Server::builder()
        .add_service(UserServiceServer::new(grpc_user_service))
        .serve_with_shutdown(
            "0.0.0.0:50051".parse().unwrap(),
            grpc_token.cancelled_owned(),
        )
        .await
        .unwrap();
});
}

Same port with protocol detection is more complex but reduces your operational surface. Tonic’s Server can be composed with an Axum router so that HTTP/2 gRPC requests and HTTP/1.1 REST requests are handled by the same listener. This requires careful configuration though, and in my experience it’s usually only worth it when you have a strong reason to avoid multiple ports, like a restrictive network policy.

For most applications, go with separate ports. It’s simpler, easier to monitor independently, and you won’t run into protocol-detection edge cases.

Middleware and interceptors

Tonic supports interceptors, which are the gRPC equivalent of middleware. You can add authentication, logging, and other cross-cutting concerns just like you would in Axum:

#![allow(unused)]
fn main() {
use tonic::{Request, Status};

fn auth_interceptor(req: Request<()>) -> Result<Request<()>, Status> {
    let token = req.metadata()
        .get("authorization")
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .ok_or_else(|| Status::unauthenticated("missing token"))?;

    // Validate token...

    Ok(req)
}

// Apply the interceptor to a service.
let user_service = UserServiceServer::with_interceptor(
    MyUserService::new(state),
    auth_interceptor,
);
}

For more complex middleware (tracing, metrics, timeouts), Tonic integrates with Tower layers, just like Axum does. You can apply the same TraceLayer, TimeoutLayer, and other Tower middleware to your gRPC server, which means the skills you’ve built up in earlier chapters transfer directly.

gRPC reflection

Adding server reflection lets clients like grpcurl and grpc-ui discover your service’s API without needing the proto files locally. Think of it as the gRPC equivalent of Swagger UI. It’s incredibly useful during development, so I’d recommend setting it up early.

#![allow(unused)]
fn main() {
use tonic_reflection::server::Builder;

let reflection_service = Builder::configure()
    .register_encoded_file_descriptor_set(pb::FILE_DESCRIPTOR_SET)
    .build_v1()?;

tonic::transport::Server::builder()
    .add_service(reflection_service)
    .add_service(UserServiceServer::new(user_service))
    .serve(addr)
    .await?;
}

To generate the file descriptor set, add this to your build.rs. You then define the constant yourself in the module that includes the generated code, with pub const FILE_DESCRIPTOR_SET: &[u8] = tonic::include_file_descriptor_set!("myservice_descriptor");.

#![allow(unused)]
fn main() {
tonic_prost_build::configure()
    .file_descriptor_set_path(
        std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap())
            .join("myservice_descriptor.bin")
    )
    .compile_protos(&["proto/myservice/v1/users.proto"], &["proto/"])?;
}

Testing gRPC services

You can test gRPC services by spinning up a real server in the test and connecting to it with a generated client:

#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_get_user() {
    let addr = start_test_grpc_server().await;
    let mut client = UserServiceClient::connect(
        format!("http://{}", addr)
    ).await.unwrap();

    let response = client
        .get_user(pb::GetUserRequest {
            id: "some-uuid".to_string(),
        })
        .await
        .unwrap();

    assert_eq!(response.into_inner().name, "Alice");
}
}

If the network overhead bothers you for fast unit tests, you can also use tonic::transport::Channel with an in-process transport. It’s similar to Axum’s oneshot testing pattern, and it keeps your tests quick without sacrificing coverage.

Outbound HTTP and Service Resilience

So far we’ve been looking at our service as a server, receiving requests and sending responses. But in practice, most services are also clients. Our web server calls payment providers, sends webhooks, queries internal APIs, fetches data from third parties, and pushes messages to queues. I’ve seen outages that had nothing to do with how we handled inbound traffic. The problem was how we called out. If a downstream dependency gets slow and you don’t have the right guardrails, it’ll take your whole service down with it.

In this chapter, we’ll walk through the patterns that keep outbound HTTP calls from ruining your day: timeouts, retries, backoff, idempotency, circuit breaking, and trace context propagation.

reqwest as the HTTP client

The reqwest crate is the go-to HTTP client in the Rust ecosystem. It’s async-native, handles TLS, supports connection pooling, and plays nicely with the Tokio runtime. One thing that trips people up early on: you want to create a single Client instance at startup and share it across your application. The Client manages an internal connection pool, so if you’re creating a new one for every request, you’re throwing away all those pooling benefits.

#![allow(unused)]
fn main() {
use reqwest::Client;
use std::time::Duration;

pub fn create_http_client() -> Client {
    Client::builder()
        .timeout(Duration::from_secs(10))
        .connect_timeout(Duration::from_secs(3))
        .pool_max_idle_per_host(10)
        .build()
        .expect("failed to create HTTP client")
}
}

We’ll store the client in our AppState and extract it in handlers or services that need to make outbound calls. The same client can serve multiple concurrent requests safely, which is exactly what we want.

Timeouts at every level

If I had to pick one resilience mechanism for outbound calls, it would be timeouts. Without them, a slow or unresponsive downstream service can hold your connections open indefinitely. Eventually you’ll exhaust your connection pool, your file descriptors, or your request-handling capacity. I’ve watched this happen in production, and it’s not fun.

There are three levels of timeout you should think about:

Connect timeout controls how long we wait to establish a TCP connection. If the remote server is unreachable or the network is congested, you want to fail fast rather than waiting for the OS-level default (which can be 120 seconds on Linux). Three to five seconds is a reasonable starting point.

Per-request timeout caps the total time for a single HTTP request, including connection, TLS handshake, sending the request, and receiving the full response. Ten to thirty seconds is typical, but it really depends on how fast you expect the downstream to respond.

Overall operation timeout is the deadline for the entire operation, including retries. If your business logic says “fetch the payment status within 30 seconds,” that 30-second budget covers all retry attempts, not just one. Use tokio::time::timeout around the entire retry loop, not around individual attempts.

#![allow(unused)]
fn main() {
use tokio::time::timeout;

async fn fetch_with_deadline(
    client: &Client,
    url: &str,
    deadline: Duration,
) -> anyhow::Result<String> {
    timeout(deadline, async {
        // Individual attempts have their own timeout (via client config).
        // The outer timeout caps the total time including retries.
        let response = client.get(url).send().await?.error_for_status()?;
        Ok(response.text().await?)
    })
    .await
    .map_err(|_| anyhow::anyhow!("operation timed out after {:?}", deadline))?
}
}

Retries and backoff

Transient failures are just a fact of life in distributed systems. Network blips, brief service restarts, load-induced timeouts, all of these produce errors that would succeed if you tried again a moment later. So a well-behaved outbound client should retry these failures, with increasing delays between attempts to avoid piling on.

#![allow(unused)]
fn main() {
use std::time::Duration;
use tokio::time::sleep;

async fn fetch_with_retries(
    client: &Client,
    url: &str,
    max_retries: u32,
) -> anyhow::Result<reqwest::Response> {
    let mut last_error = None;

    for attempt in 0..=max_retries {
        if attempt > 0 {
            // Exponential backoff with jitter to prevent thundering herd.
            let base = Duration::from_millis(100 * 2u64.pow(attempt - 1));
            let jitter = Duration::from_millis(rand::random::<u64>() % 50);
            sleep(base + jitter).await;
        }

        match client.get(url).send().await {
            Ok(resp) if resp.status().is_server_error() => {
                // 5xx errors are retryable.
                last_error = Some(anyhow::anyhow!(
                    "server error: {}", resp.status()
                ));
            }
            Ok(resp) => return Ok(resp),
            Err(e) if e.is_timeout() || e.is_connect() => {
                // Timeouts and connection errors are retryable.
                last_error = Some(e.into());
            }
            Err(e) => return Err(e.into()), // Non-retryable error.
        }
    }

    Err(last_error.unwrap_or_else(|| anyhow::anyhow!("exhausted retries")))
}
}

When you’re building a retry strategy, there are a few key decisions to make:

Which errors to retry. Retry connection errors, timeouts, and 5xx responses. Don’t retry 4xx responses, because the request was wrong, and sending it again won’t help. Same goes for errors that indicate a permanent problem, like DNS resolution failure for a domain that doesn’t exist.

Backoff strategy. Exponential backoff (doubling the delay each attempt) with jitter (adding a small random component) is the standard approach. The jitter is important because it prevents the thundering herd problem, where many clients retry at exactly the same time after an outage and overwhelm the recovering service.

Maximum attempts. Three to five retries is typical. More than that usually means the downstream has a sustained outage, and continuing to retry just adds load without improving anything.
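Putting the backoff math in one place makes it easy to test. Here's a std-only sketch; the function name and the cap parameter are illustrative, and jitter is passed in by the caller so the computation stays deterministic:

```rust
use std::time::Duration;

/// Exponential backoff: base * 2^(attempt - 1), capped, plus caller-supplied
/// jitter. `attempt` is 1-based: the first retry is attempt 1.
fn backoff_delay(attempt: u32, base_ms: u64, cap_ms: u64, jitter_ms: u64) -> Duration {
    // 2^(attempt - 1), saturating so huge attempt counts don't overflow.
    let factor = 1u64.checked_shl(attempt.saturating_sub(1)).unwrap_or(u64::MAX);
    let delay = base_ms.saturating_mul(factor).min(cap_ms);
    Duration::from_millis(delay.saturating_add(jitter_ms))
}
```

The cap matters: without it, ten attempts at a 100 ms base would ask for a 51-second sleep, which is almost never what you want.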

Idempotency of retried operations

Retrying a GET request is safe because GET is idempotent by definition. But retrying a POST that creates a resource? That’s a different story. Unless the downstream API supports idempotency keys, a retried POST can create duplicate records, which is probably the last thing you want.

If you’re calling an API that supports idempotency keys (Stripe, for example, accepts an Idempotency-Key header), the approach is straightforward: generate a UUID for the operation and include it on every attempt:

#![allow(unused)]
fn main() {
let idempotency_key = Uuid::new_v4().to_string();

for attempt in 0..=max_retries {
    let result = client
        .post(url)
        .header("Idempotency-Key", &idempotency_key)
        .json(&payload)
        .send()
        .await;

    // Same retry logic as above.
}
}

What if the downstream API doesn’t support idempotency keys? You have two options: design the operation to be naturally idempotent (use PUT with a known ID rather than POST to a collection), or don’t retry write operations at all and surface the failure for manual resolution. Neither is perfect, but both are better than silently creating duplicate charges.
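The difference between the two write styles is easy to demonstrate with plain collections standing in for the downstream store (a toy model, not real HTTP; the function names are invented):

```rust
use std::collections::HashMap;

// POST-style create: every call appends a new record,
// so a retried request produces a duplicate.
fn post_create(store: &mut Vec<String>, body: &str) {
    store.push(body.to_string());
}

// PUT-style upsert: the caller supplies the ID, so a retried
// request overwrites the same record instead of adding one.
fn put_upsert(store: &mut HashMap<String, String>, id: &str, body: &str) {
    store.insert(id.to_string(), body.to_string());
}
```

This is the whole intuition behind "use PUT with a known ID": the identifier travels with the request, so replays are harmless.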

Circuit breaking

You might wonder: if a downstream service is consistently failing, why keep sending requests to it? You’re wasting your own resources and adding load to a system that’s already struggling. This is where circuit breakers come in. A circuit breaker tracks failure rates and, when they cross a threshold, short-circuits requests for a cool-down period before trying again.

The tower ecosystem provides tower-resilience with circuit breaker, bulkhead, and retry middleware that you can layer onto a reqwest client or any Tower service. For simpler needs, you can implement a basic circuit breaker with an AtomicU32 failure counter and a tokio::time::Instant for the last-check timestamp.

The three states are:

  • Closed (normal): requests flow through. Failures increment a counter.
  • Open (tripped): requests are rejected immediately without calling the downstream. After a timeout, the breaker moves to half-open.
  • Half-open (probing): a single request is allowed through. If it succeeds, the breaker closes. If it fails, it reopens.
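Here's a minimal sketch of that state machine. For clarity it uses a Mutex rather than the AtomicU32 approach mentioned above; the type name, threshold, and cooldown are illustrative:

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

enum State {
    /// Normal operation; failures are counted.
    Closed { failures: u32 },
    /// Tripped; reject requests until the cooldown elapses.
    Open { until: Instant },
    /// One probe request is allowed through.
    HalfOpen,
}

pub struct CircuitBreaker {
    state: Mutex<State>,
    failure_threshold: u32,
    cooldown: Duration,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            state: Mutex::new(State::Closed { failures: 0 }),
            failure_threshold,
            cooldown,
        }
    }

    /// Returns true if a request may proceed.
    pub fn allow(&self) -> bool {
        let mut state = self.state.lock().unwrap();
        match *state {
            State::Closed { .. } => true,
            State::Open { until } => {
                if Instant::now() >= until {
                    *state = State::HalfOpen; // cooldown over: let one probe through
                    true
                } else {
                    false
                }
            }
            State::HalfOpen => false, // a probe is already in flight
        }
    }

    pub fn record_success(&self) {
        *self.state.lock().unwrap() = State::Closed { failures: 0 };
    }

    pub fn record_failure(&self) {
        let mut state = self.state.lock().unwrap();
        let failures = match *state {
            State::Closed { failures } => failures + 1,
            _ => self.failure_threshold, // a failed probe reopens immediately
        };
        if failures >= self.failure_threshold {
            *state = State::Open { until: Instant::now() + self.cooldown };
        } else {
            *state = State::Closed { failures };
        }
    }
}
```

The caller checks allow() before each outbound request and reports the outcome back with record_success() or record_failure(); everything else is internal to the breaker.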

Propagating trace context

When our service calls another service, we want the trace context (trace ID, span ID, and sampling decision) to travel with the request. This is what lets you correlate logs and spans across service boundaries, and it’s what makes distributed tracing actually useful rather than just decoration.

If both services use OpenTelemetry, the trace context is propagated via the traceparent HTTP header (W3C Trace Context format). The reqwest-tracing or reqwest-middleware crates can inject this header automatically, which is nice. If you’re using a simpler setup, at minimum propagate your request ID as a custom header so you can correlate logs manually. You’ll thank yourself at 2am when you’re debugging a production issue.

#![allow(unused)]
fn main() {
let request_id = "req-abc-123"; // from the inbound request's span

let response = client
    .get(url)
    .header("x-request-id", request_id)
    .send()
    .await?;
}

When not to retry

Not every failure should be retried. This is one of those things that seems obvious when you read it, but in my experience, it’s easy to get wrong. Retrying aggressively against a service that’s already overloaded just makes the overload worse. Here are the situations where retrying is the wrong call:

  • 4xx client errors (except 429 Too Many Requests, which is explicitly asking you to back off and try later)
  • Errors that indicate a permanent problem (invalid credentials, resource deleted, schema mismatch)
  • When the operation deadline has expired (there’s no point starting a new attempt if the caller has already given up)
  • When the circuit breaker is open (the downstream is known to be unhealthy)
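Those rules condense into a small classifier. A sketch with an invented signature, where None stands for a transport-level failure such as a timeout or connection error:

```rust
/// Decide whether a failed outbound call should be retried.
/// `status` is Some(http_status) for a response, None for a transport error.
fn should_retry(status: Option<u16>, deadline_expired: bool, breaker_open: bool) -> bool {
    // Structural stop conditions win over everything else.
    if deadline_expired || breaker_open {
        return false;
    }
    match status {
        None => true,                // timeout / connect error: transient
        Some(429) => true,           // explicitly asked to back off and retry
        Some(s) if s >= 500 => true, // server error: likely transient
        Some(_) => false,            // other 4xx etc.: the request itself is wrong
    }
}
```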

The general rule I follow: retry transient failures with backoff, respect rate limits, and fail fast when the problem is structural. Let’s look at how we wrap all of this up into something clean.

Wrapping outbound clients

For downstream services that your application calls frequently, I’d recommend wrapping the raw reqwest::Client in a dedicated client struct. This struct encapsulates the base URL, authentication, timeout policy, retry logic, and error mapping all in one place. Think of it as the outbound equivalent of the repository pattern we use for database access.

#![allow(unused)]
fn main() {
use reqwest::{Client, StatusCode};

pub struct PaymentClient {
    http: Client,
    base_url: String,
    api_key: String,
}

impl PaymentClient {
    pub async fn charge(
        &self,
        amount_cents: i64,
        currency: &str,
        idempotency_key: &str,
    ) -> Result<PaymentResult, PaymentError> {
        let url = format!("{}/v1/charges", self.base_url);

        let response = self.http
            .post(&url)
            .bearer_auth(&self.api_key)
            .header("Idempotency-Key", idempotency_key)
            .json(&serde_json::json!({
                "amount": amount_cents,
                "currency": currency,
            }))
            .send()
            .await
            .map_err(PaymentError::Network)?;

        match response.status() {
            s if s.is_success() => {
                let result = response.json().await.map_err(PaymentError::Parse)?;
                Ok(result)
            }
            StatusCode::UNPROCESSABLE_ENTITY => {
                let body = response.text().await.unwrap_or_default();
                Err(PaymentError::Validation(body))
            }
            s => Err(PaymentError::Upstream(s)),
        }
    }
}
}

This gives our domain services a clean, typed interface for the downstream dependency. It hides the HTTP details, and it gives us a natural place to layer in retries, circuit breaking, and metrics later without touching the calling code. What I’ve found is that this pattern pays for itself quickly, especially as the number of downstream dependencies grows.

Typestate and Advanced Type System Patterns

In the Domain Modeling chapter, we used newtypes to enforce invariants on individual values: an Email can only be constructed from a valid email string, and once you have one, its validity is guaranteed. Now we’re going to take that idea further. Instead of encoding what a value is, we’ll encode what state an entity is in, and use the type system to control which operations are available in each state.

The payoff? Invalid state transitions become compile errors rather than runtime bugs.

The typestate pattern

The core idea is straightforward: we represent an entity’s state as a type parameter, and then define methods that are only available when the entity is in the right state. When you transition to a new state, the old value gets consumed and you get back a new one with a different type parameter. Let’s see what this looks like in practice.

#![allow(unused)]
fn main() {
use std::marker::PhantomData;

// States are zero-sized types. They exist only at compile time.
pub struct Created;
pub struct Paid;
pub struct Shipped;

pub struct Order<State> {
    id: Uuid,
    customer_id: Uuid,
    total_cents: i64,
    _state: PhantomData<State>,
}

impl Order<Created> {
    pub fn new(customer_id: Uuid, total_cents: i64) -> Self {
        Self {
            id: Uuid::new_v4(),
            customer_id,
            total_cents,
            _state: PhantomData,
        }
    }

    /// Pay for the order. Consumes the Created order
    /// and returns a Paid order.
    pub fn pay(self, payment_id: String) -> Order<Paid> {
        tracing::info!(order_id = %self.id, "order paid");
        Order {
            id: self.id,
            customer_id: self.customer_id,
            total_cents: self.total_cents,
            _state: PhantomData,
        }
    }
}

impl Order<Paid> {
    /// Ship the order. Only available on paid orders.
    pub fn ship(self, tracking_number: String) -> Order<Shipped> {
        tracing::info!(order_id = %self.id, "order shipped");
        Order {
            id: self.id,
            customer_id: self.customer_id,
            total_cents: self.total_cents,
            _state: PhantomData,
        }
    }

    /// Refund the order. Only available on paid (not yet shipped) orders.
    pub fn refund(self) -> Order<Created> {
        tracing::info!(order_id = %self.id, "order refunded");
        Order {
            id: self.id,
            customer_id: self.customer_id,
            total_cents: self.total_cents,
            _state: PhantomData,
        }
    }
}

impl Order<Shipped> {
    pub fn tracking_number(&self) -> &str {
        // In a real implementation, this would be stored in the struct.
        "tracking info"
    }
}
}

With this design, the happy path compiles just fine:

let order = Order::new(customer_id, 4999);
let paid_order = order.pay("pay_123".into());
let shipped_order = paid_order.ship("TRACK456".into());

But try to skip a step, and you won’t get past the compiler:

let order = Order::new(customer_id, 4999);
// Error: no method named `ship` found for `Order<Created>`
let shipped = order.ship("TRACK456".into());

The compiler catches the invalid transition because ship() is only defined on Order<Paid>, not on Order<Created>. You can’t ship an unpaid order, and the type system enforces this without any runtime checks. No if statements, no panics, just a compile error that tells you exactly what went wrong.

Real-world application: connection lifecycle

Where this pattern really shines is protocol implementations where a connection moves through distinct phases with different capabilities. I worked on the mozilla-services/autopush-rs project, which uses this approach for WebSocket connections: an UnidentifiedClient handles the initial handshake, and when a valid Hello message arrives, it transitions to a WebPushClient that has access to the user’s subscription data and notification channels.

Here’s a simplified version of that pattern:

// `WebSocketStream`, `Notification`, `UserId`, and the error types are the
// application's own definitions, elided here to keep the sketch focused.
use std::marker::PhantomData;
use std::net::SocketAddr;
use std::time::Duration;

pub struct Unauthenticated;
pub struct Authenticated;

pub struct Connection<State> {
    stream: WebSocketStream,
    remote_addr: SocketAddr,
    _state: PhantomData<State>,
}

impl Connection<Unauthenticated> {
    pub fn new(stream: WebSocketStream, remote_addr: SocketAddr) -> Self {
        Self {
            stream,
            remote_addr,
            _state: PhantomData,
        }
    }

    /// Perform the authentication handshake.
    /// Consumes the unauthenticated connection and returns
    /// an authenticated one, or an error if auth fails.
    pub async fn authenticate(
        mut self,
        timeout: Duration,
    ) -> Result<(Connection<Authenticated>, UserId), AuthError> {
        let hello = tokio::time::timeout(timeout, self.read_hello())
            .await
            .map_err(|_| AuthError::Timeout)?
            .map_err(AuthError::Protocol)?;

        let user_id = validate_credentials(&hello)?;

        Ok((
            Connection {
                stream: self.stream,
                remote_addr: self.remote_addr,
                _state: PhantomData,
            },
            user_id,
        ))
    }
}

impl Connection<Authenticated> {
    /// Send a push notification. Only available on authenticated connections.
    pub async fn send_notification(&mut self, notif: &Notification) -> Result<(), SendError> {
        self.stream.send(notif.to_message()).await?;
        Ok(())
    }

    /// Graceful close with notification draining.
    pub async fn shutdown(mut self, reason: CloseReason) {
        // Drain any remaining notifications before closing.
        self.drain_pending_notifications().await;
        self.stream.close(reason.into()).await.ok();
    }
}

The key insight from autopush-rs is that error handling differs between the two phases. During the unauthenticated phase, a protocol error just disconnects the client (there’s nothing to clean up). But during the authenticated phase, a protocol error needs to trigger a graceful shutdown that drains pending notifications before closing. The typestate pattern makes this distinction structural: the shutdown method with notification draining only exists on Connection<Authenticated>. You literally can’t call it in the wrong phase.

The builder pattern as a degenerate typestate

You might already be using a simplified version of typestate without realizing it: the builder pattern with mandatory fields. By using different type parameters for each stage of the builder, we can ensure that required fields are set before build() becomes available.

use std::marker::PhantomData;
use std::net::SocketAddr;

pub struct Server {
    addr: SocketAddr,
    max_connections: usize,
}

pub struct NoAddr;
pub struct HasAddr;

pub struct ServerBuilder<AddrState> {
    addr: Option<SocketAddr>,
    max_connections: usize,
    _state: PhantomData<AddrState>,
}

impl ServerBuilder<NoAddr> {
    pub fn new() -> Self {
        Self {
            addr: None,
            max_connections: 100,
            _state: PhantomData,
        }
    }

    pub fn bind(self, addr: SocketAddr) -> ServerBuilder<HasAddr> {
        ServerBuilder {
            addr: Some(addr),
            max_connections: self.max_connections,
            _state: PhantomData,
        }
    }
}

impl ServerBuilder<HasAddr> {
    /// build() is only available after bind() has been called.
    pub fn build(self) -> Server {
        Server {
            addr: self.addr.unwrap(), // Safe: guaranteed by typestate.
            max_connections: self.max_connections,
        }
    }
}

// Optional settings are available in any state.
impl<S> ServerBuilder<S> {
    pub fn max_connections(mut self, n: usize) -> Self {
        self.max_connections = n;
        self
    }
}

This guarantees at compile time that you can’t call build() without first calling bind(). If you try, the compiler error is clear: “no method named build found for ServerBuilder<NoAddr>.” I love how Rust turns what would be a runtime panic in most languages into something the compiler catches for you.

When typestate is worth the complexity

Now, I don’t want to leave you with the impression that you should reach for typestate everywhere. It adds complexity to your type signatures and makes some common operations harder (like storing a heterogeneous collection of entities in different states). In my experience, it’s worth it when:

  • Invalid transitions would cause security vulnerabilities or data corruption
  • There are relatively few states with clear, linear transitions
  • The code that manages transitions is performance-sensitive enough that you want zero runtime overhead
  • The types are used within a single module or subsystem, so the generic parameters don’t propagate widely

You’ll want to prefer enum-based state machines when:

  • You need to store entities in different states together (a Vec<Order> where some are paid and some are shipped)
  • The number of states is large or the transition graph is complex
  • The state needs to be serialized to or deserialized from a database or API response
  • The state is determined at runtime by external data that you can’t know at compile time

What I’ve found in practice is that many systems use both: typestate for the core internal logic where transitions are safety-critical, and an enum wrapper for persistence and external interfaces. The typestate enforces correctness in the code that manages transitions, while the enum gives you the flexibility you need for storage and serialization. We’ll see more of this when we get to the persistence layer.
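Here’s a minimal sketch of that hybrid, using a deliberately stripped-down Order (the u64 ids and the StoredOrder name are illustrative placeholders, not the types from earlier chapters). The typestate half keeps transitions safe; the enum half erases the type parameter so orders in mixed states can live in one collection or be handed to a serializer.

```rust
use std::marker::PhantomData;

// Typestate states: compile-time only.
pub struct Created;
pub struct Paid;

pub struct Order<State> {
    pub id: u64,
    pub total_cents: i64,
    _state: PhantomData<State>,
}

impl Order<Created> {
    pub fn new(id: u64, total_cents: i64) -> Self {
        Self { id, total_cents, _state: PhantomData }
    }

    pub fn pay(self) -> Order<Paid> {
        Order { id: self.id, total_cents: self.total_cents, _state: PhantomData }
    }
}

// Enum wrapper for storage and serialization: erases the type parameter
// so entities in different states fit in one Vec or one table row.
pub enum StoredOrder {
    Created(Order<Created>),
    Paid(Order<Paid>),
}

impl StoredOrder {
    // State-independent access goes through the wrapper.
    pub fn id(&self) -> u64 {
        match self {
            StoredOrder::Created(o) => o.id,
            StoredOrder::Paid(o) => o.id,
        }
    }
}

fn main() {
    // Mixed states in one collection: impossible with bare Order<State>.
    let orders: Vec<StoredOrder> = vec![
        StoredOrder::Created(Order::new(1, 999)),
        StoredOrder::Paid(Order::new(2, 4999).pay()),
    ];
    assert_eq!(orders.len(), 2);
    assert_eq!(orders[1].id(), 2);
}
```

The match boilerplate at the boundary is the price of admission; in return, only the wrapper ever has to ask "which state am I in" at runtime, while all transition logic stays compile-time checked.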

Connecting back to domain modeling

If you’re seeing a theme here, you’re right. The typestate pattern is a natural extension of the “parse, don’t validate” philosophy from the Domain Modeling chapter. Where newtypes make invalid values unrepresentable (Email can only hold valid emails), typestate makes invalid transitions unrepresentable (Order<Created> has no ship() method). Both patterns use Rust’s type system to move correctness guarantees from runtime checks to compile-time enforcement, and both pay off most in code where getting it wrong would hurt. In the next chapter, we’ll look at how to structure our error types so that when things do go wrong at runtime, we handle it cleanly.

AI-Assisted Development Workflow

If you’ve been building with AI coding agents, you’ve probably noticed that they’re great at cranking out code but terrible at knowing your project’s rules. This chapter is about bridging that gap. We’ll look at how to take the principles from this guide and turn them into repo-level instructions that AI agents (Claude Code, Codex, Cursor, Aider) can consume effectively, so they stop guessing and start following your architecture.

The problem with pasting the whole guide

I get it, the temptation is real. You copy the whole guide, paste it into a chat, and ask the agent to “build a Rust app.” I’ve done it. The results are… not great. The model can’t tell what’s a hard rule versus a nice-to-have suggestion, so it cherry-picks whatever seems easiest and ignores the rest.

What I’ve found works much better is distilling your non-negotiable rules into persistent repo files that the agent loads automatically at the start of every session. Then you provide procedural knowledge (how to add an endpoint, how to write a migration) as discrete skills that the agent invokes when it needs them. Let’s walk through how to set this up.

Repo instruction structure

Most AI coding tools these days support persistent, file-based instructions that get loaded automatically when the agent starts a session:

  • Claude Code reads CLAUDE.md at the repo root, plus .claude/rules/ for scoped rules
  • Codex reads AGENTS.md at the repo root, plus nested AGENTS.md files in subdirectories
  • Cursor reads .cursor/rules/ or AGENTS.md
  • Aider uses a repo map to understand structure and can be given conventions files

The file names differ, but the idea is the same: short, specific, durable instructions that live in the repo and get loaded at the start of every session. That’s a much better home for your rules than a chat message you’ll forget to paste next time.

This repo includes a templates/ directory with ready-to-use instruction files. You can copy them into your project and adapt them to your codebase.

Root instructions

Your root AGENTS.md (or CLAUDE.md) should contain only the rules that apply to every session and every file. I’d keep it under 40 lines. Think of it as the stuff you’d tell a new team member on day one:

  • The project’s layer structure and dependency rule
  • Hard code style rules (no unwrap in handlers, newtypes for domain values, etc.)
  • The verification command that defines “done”
  • Pointers to nested instruction files for layer-specific rules
templates/
  AGENTS.md                          # Root rules (copy to your repo root)
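To make that concrete, here is one possible shape for the root file. Treat every path, command, and rule below as a placeholder to adapt, not a canonical template:

```markdown
# Project rules

## Architecture
- Layers: src/http -> src/domain <- src/infra. Dependencies point at the domain only.
- The domain module imports nothing from axum or sqlx.

## Code style
- No `unwrap()` or `expect()` in request paths; propagate errors with `?`.
- Domain values use newtypes (`Email`, `UserId`), never bare `String`.
- Every database type is mapped to a domain type before it leaves the infra layer.

## Definition of done
- `cargo fmt --check && cargo clippy -- -D warnings && cargo test` passes.

## Layer-specific rules
- HTTP layer: see src/http/AGENTS.md
- Domain layer: see src/domain/AGENTS.md
- Database layer: see src/infra/db/AGENTS.md
```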

Keep this file intentionally small. Everything that only applies to a specific layer goes into nested files, which we’ll look at next.

Nested layer instructions

Each architectural layer gets its own instruction file with rules specific to that layer. This way, the agent only loads the relevant guidance when it’s editing files in that directory. No need to flood it with database rules when it’s working on a handler.

templates/
  src/http/AGENTS.md                 # Handler, DTO, middleware, routing rules
  src/domain/AGENTS.md               # Model, service, port, error rules
  src/infra/db/AGENTS.md             # Repository, query, migration rules

The HTTP layer instructions cover handler length limits, extractor conventions, middleware ordering, and DTO rules. The domain layer instructions enforce our zero-framework-dependency rule and the newtype/constructor patterns we covered earlier. The database layer instructions cover query patterns, row-to-domain mapping, and migration conventions.

Skills for procedural workflows

Rules tell the agent what to do. Skills tell it how. You might wonder why the distinction matters. In my experience, agents are surprisingly good at following step-by-step procedures but terrible at inventing them from scratch. A skill is basically a recipe for a common task that the agent can follow when it needs to.

templates/
  .agents/skills/add-endpoint/SKILL.md    # Domain -> DB -> Handler -> Test flow
  .agents/skills/add-migration/SKILL.md   # Migration creation and verification
  .agents/skills/review-pr/SKILL.md       # Architecture and security review checklist

The add-endpoint skill walks through our full vertical slice: define domain types, create the migration, implement the repository method, add the service logic, write the DTO and handler, register the route, write tests, add tracing, and run verification. This is the procedure we want the agent to follow every time it adds a new API endpoint, and it mirrors exactly how we’d do it ourselves.

The review-pr skill is a checklist that covers architecture (dependency direction, handler thickness), correctness (no unwrap, proper error mapping), database (parameterized queries, TryFrom conversions), security (input validation, no secrets in logs), testing (happy path and error path), and operations (tracing, no blocking, outbound timeouts).

Checklists for human review

Even when an AI agent writes the code, you still need a human reviewing it. Don’t skip this part. The feature checklist gives you a structured path through that review:

templates/
  docs/checklists/feature.md              # End-to-end feature implementation checklist

The three-phase chat loop

Here’s the workflow I’ve settled on after a lot of trial and error. It’s a three-phase loop, and it works because it forces the agent to think in the same order we would: scope first, then implementation, then review.

Phase 1: Plan. Ask the agent to read the instruction files, summarize the architecture constraints that apply, propose the smallest vertical slice to implement, and list the files it plans to edit. Don’t let it write code yet. This might seem tedious, but it catches bad assumptions before they turn into bad code.

Phase 2: Implement. The agent implements one slice at a time. It stops after the code compiles and the tests pass, then shows you what changed and what’s left.

Phase 3: Review. The agent reviews its own diff against the review checklist. It flags architecture drift, leaked types, missing tests, blocking calls, and security issues. It suggests minimal fixes.

You’ll be surprised how much better the output gets compared to a single open-ended prompt. The constraint of working in phases keeps the agent from going off the rails and building something that compiles but violates half your architecture rules.

Promote corrections into repo memory

When the agent makes the same mistake twice, and it will (leaking a database type into the API, putting SQL in a handler, forgetting tracing instrumentation), don’t just correct it in chat. Go update the root or nested instruction file. The next session starts smarter because the correction is now part of the project’s persistent context.

This is where the real compounding value comes from. Over time, your instruction files evolve into a precise, battle-tested encoding of your team’s standards. What I’ve found is that after a few weeks, both humans and agents end up following the same rules consistently, which is kind of the whole point.

Getting started

  1. Copy the templates/ directory contents into your Rust project
  2. Rename AGENTS.md to CLAUDE.md if you use Claude Code (or keep both, they won’t conflict)
  3. Adapt the rules to your specific project, adjusting layer paths and adding any project-specific conventions
  4. Move the nested instruction files to match your actual src/ directory structure
  5. Start a chat session and ask the agent to read its instruction files before doing any work

That last step is important. If you don’t ask the agent to read its instructions first, it’ll happily ignore them and do whatever it wants. Once you’ve got this workflow running, you’ll find that the agents become genuinely useful collaborators rather than code generators you have to babysit.

Crate Reference

Throughout this book, we pull in quite a few crates. If you ever find yourself thinking “wait, which crate was that again?”, this is the page to come back to. I’ve organized everything by concern so you can quickly find what you need and understand why it’s in our stack.

Core Framework

  • axum: HTTP routing, extractors, and handler framework. The foundation. Use with the macros feature for #[debug_handler].
  • tokio: Async runtime. Use features = ["full"] unless you need to minimize dependencies.
  • tower: Middleware abstractions and utilities. Provides ServiceBuilder, ServiceExt (for oneshot in tests), and composable layers.
  • tower-http: HTTP-specific middleware. CORS, compression, tracing, timeouts, request IDs, body limits, and more. Enable features selectively.
  • hyper: HTTP implementation. You’ll rarely interact with hyper directly when using Axum, but it’s the HTTP engine under the hood.

Serialization and Data

  • serde: Serialization framework. Use features = ["derive"] for #[derive(Serialize, Deserialize)].
  • serde_json: JSON serialization. Axum’s Json extractor uses this under the hood.
  • uuid: UUID generation and parsing. Use features = ["v4", "serde"] for random UUIDs with serde support.
  • chrono: Date and time handling. Use features = ["serde"] for serialization. Prefer DateTime<Utc> for timestamps.

Database

  • sqlx: Async SQL with compile-time checking. Our go-to for most Axum projects. Supports PostgreSQL, MySQL, and SQLite.
  • diesel: Type-safe ORM and query builder. Mature and well-tested. Use diesel-async for async support.
  • sea-orm: ActiveRecord-style async ORM. Worth considering if raw SQL feels like too much ceremony for your use case.
  • axum-sqlx-tx: Request-scoped SQLx transactions. Begins a transaction for each request automatically, then commits or rolls back based on the response status.

Error Handling

  • thiserror: Derive macros for custom error types. We use this for domain errors and our AppError enum where we need to match on variants.
  • anyhow: Flexible error type with context chaining. Great for the catch-all Internal variant and in infrastructure code. Don’t be shy with .context().

Validation

  • validator: Derive-based input validation. Supports email, url, length, range, and custom validators.
  • garde: Alternative validation with const generics. A newer take on validator with a different API style.
  • axum-valid: Pre-built validated extractors for Axum. Integrates with validator, garde, and validify. Saves you from writing your own extractors.

Authentication and Security

  • jsonwebtoken: JWT encoding and decoding. The go-to for JWT-based auth in Rust.
  • argon2: Password hashing. What I’d recommend for password hashing today. Prefer it over bcrypt for new projects.
  • axum-login: Session-based authentication. A good fit for traditional web apps with server-side sessions.
  • axum-csrf-sync-pattern: CSRF protection. Implements the OWASP Synchronizer Token Pattern.
  • tower-governor: Rate limiting. Per-IP rate limiting using the governor algorithm.
  • tower-sessions: Session middleware. Server-side session management with pluggable backends.

Configuration

  • config: Layered configuration loading. Supports files, environment variables, and multiple formats.
  • dotenvy: Load .env files. Fork of the original dotenv crate. Call .ok() instead of .unwrap() so it doesn’t blow up in production.
  • secrecy: Sensitive value protection. Wrap secrets in SecretString. It’ll redact them in debug output and zero memory on drop.

Observability

  • tracing: Structured logging and spans. If you’re doing structured logging in Rust, this is what everyone reaches for.
  • tracing-subscriber: Tracing output configuration. Use features = ["env-filter", "json"] for environment-based filtering and JSON output.
  • tracing-opentelemetry: Bridge tracing to OpenTelemetry. Converts tracing spans into OpenTelemetry spans for distributed tracing.
  • opentelemetry: OpenTelemetry API. Core API for distributed tracing and metrics.
  • opentelemetry-otlp: OTLP exporter. Exports traces to Jaeger, Tempo, Datadog, and other OTLP-compatible backends.
  • axum-tracing-opentelemetry: Axum OpenTelemetry integration. Automatic trace context propagation for Axum.

API Documentation

  • utoipa: Compile-time OpenAPI spec generation. Generates OpenAPI 3.0 specs from code annotations.
  • utoipa-swagger-ui: Swagger UI for utoipa. Serves an interactive API documentation UI at a configurable path.

Testing

  • tower (ServiceExt): Integration testing without a server. Use oneshot() to send requests through the router directly.
  • mockall: Automatic mock generation. Generates mock implementations of your traits for unit testing.
  • axum-test: Testing library for Axum. Gives you a TestClient with a nicer API than raw oneshot.

HTTP Client

  • reqwest: Async HTTP client. The HTTP client you’ll see in most Rust projects. Async-native with connection pooling.

gRPC

  • tonic: gRPC framework. Native async gRPC client and server. Integrates with Tower middleware.
  • tonic-prost: Tonic + Prost runtime. Runtime support for tonic services using prost-generated types.
  • prost: Protocol Buffers. Protobuf serialization/deserialization. Used by tonic for message types.
  • tonic-prost-build: Proto code generation. Compiles .proto files into Rust code at build time. Goes in [build-dependencies]. This replaced direct tonic-build usage as of tonic 0.14.
  • tonic-reflection: gRPC server reflection. Lets grpcurl and other tools discover your service’s API.

Async Utilities

  • tokio-util: Extended Tokio utilities. Provides CancellationToken for structured shutdown, plus codecs and other helpers.
  • futures: Future combinators. Provides FuturesUnordered, StreamExt, and other utilities for working with async streams.

Background Jobs

  • apalis: Background task processing. Type-safe, extensible, supports PostgreSQL/Redis/SQLite backends. Built-in monitoring and graceful shutdown.
  • fang: Persistent job queue. Workers in separate Tokio tasks with auto-restart on panic. Cron scheduling and retry backoff.
  • backie: Async task queue. PostgreSQL-backed, designed for horizontal scaling across multiple processes.

Anti-Patterns

Every Rust web project I’ve worked on has stumbled into the same handful of mistakes. Some of them I made myself, some I caught in code review, and a few I inherited in codebases that taught me the hard way. This chapter collects the ones I see most often, explains why they’ll hurt you, and points to the better alternative covered elsewhere in this book.

SQL in handlers

What it looks like: A handler function that contains sqlx::query! calls directly, managing transactions and mapping database errors right there in the HTTP handler body.

Why this hurts: It welds your HTTP layer to your database schema. You can’t test the handler without a database. You can’t reuse the query logic from a different entry point (a CLI tool, a background worker, another handler). And when the query changes, you’re editing a file that’s supposed to be about HTTP concerns.

What to do instead: Move database access into repository or query structs, as we describe in the Database Layer chapter. The handler calls a service or repository method and works with domain types. It never touches SQL or database-specific error types.

When this is actually fine: For small, CRUD-shaped services where each endpoint maps directly to one or two queries, and there’s no realistic prospect of a second entry point, handler-local SQL can be a legitimate simplification. The launchbadge/realworld-axum-sqlx reference implementation takes this approach deliberately, and its inline queries are well-structured and readable. The important thing is to make that choice on purpose rather than by default, and to recognize when a handler is accumulating enough query logic that it should be extracted. If you find yourself duplicating a query across handlers, or if a handler is orchestrating multiple queries that need to succeed or fail together, that’s your signal to introduce a repository or service layer.
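For the repository route, here is the extracted shape in miniature. The trait, names, and in-memory adapter are illustrative stand-ins, not the book's actual types; the point is that the handler depends on a port and never sees SQL:

```rust
// Domain port: what the handler needs, with no SQL in sight.
trait UserRepo {
    fn find_name(&self, id: u64) -> Option<String>;
}

// Infra adapter: in a real project this would wrap a PgPool and run the
// actual query; here an in-memory stand-in keeps the sketch runnable.
struct InMemoryRepo {
    users: Vec<(u64, String)>,
}

impl UserRepo for InMemoryRepo {
    fn find_name(&self, id: u64) -> Option<String> {
        self.users
            .iter()
            .find(|(uid, _)| *uid == id)
            .map(|(_, name)| name.clone())
    }
}

// The "handler": pull data from the request, call the port, map to a
// response. Testable with any UserRepo implementation, no database needed.
fn get_user_name(repo: &dyn UserRepo, id: u64) -> Result<String, &'static str> {
    repo.find_name(id).ok_or("404 Not Found")
}

fn main() {
    let repo = InMemoryRepo { users: vec![(1, String::from("alice"))] };
    assert_eq!(get_user_name(&repo, 1), Ok(String::from("alice")));
    assert!(get_user_name(&repo, 2).is_err());
}
```

Swapping the in-memory adapter for a database-backed one changes nothing about the handler's signature, which is exactly the reuse the anti-pattern forfeits.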

Fat handlers

What it looks like: A handler that validates input, hashes passwords, queries the database, sends an email notification, publishes an event, and builds the HTTP response, all in one function.

Why this hurts: It makes the handler impossible to test in isolation, difficult to understand, and painful to modify. It also means that adding a second entry point for the same business logic (say, an admin endpoint or a message queue consumer) requires duplicating all of that logic.

What to do instead: Keep your handlers thin. Extract business logic into service methods in the domain layer. The handler’s job is to pull data from the request, call a service method, and return a response. If your handler is longer than 15 to 20 lines, it’s probably doing too much.

Using unwrap() in request paths

What it looks like: Calling .unwrap() or .expect() on a Result or Option inside a handler or any code that runs during request processing.

Why this hurts: A panic in a handler brings down that request with an opaque 500 error. Worse, if the panic happens while holding a mutex, it poisons the mutex and can cascade to other requests. The client gets no useful error information, and your logs may not capture the full context.

What to do instead: Always use the ? operator to propagate errors through the Result type. If you need to convert an Option to an error, use .ok_or(AppError::NotFound)? or something similar. Reserve unwrap() for situations where you can prove the value is always present (like a hardcoded regex) and expect() for startup code where a missing value means the application can’t run.
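A small illustration of the Option-to-error conversion, assuming an AppError enum along the lines of the Error Handling chapter (the single variant here is a stand-in):

```rust
// Stand-in for the application's error enum.
#[derive(Debug, PartialEq)]
enum AppError {
    NotFound,
}

// Instead of `maybe_user.unwrap()`, convert the Option into a typed error
// and let `?` carry it to the error-response machinery.
fn require_user(maybe_user: Option<String>) -> Result<String, AppError> {
    maybe_user.ok_or(AppError::NotFound)
}

fn main() {
    assert_eq!(require_user(Some(String::from("alice"))), Ok(String::from("alice")));
    assert_eq!(require_user(None), Err(AppError::NotFound));
}
```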

Leaking database types into the API

What it looks like: Returning the same struct from the database layer and the HTTP handler, or deriving both sqlx::FromRow and serde::Serialize on the same type and returning it directly as Json<UserRow>.

Why this hurts: It means that adding a database column automatically adds it to your API response. If that column contains sensitive data (a password hash, an internal flag, a soft-delete marker), it’s now visible to every API consumer. It also means that renaming or restructuring a database column becomes a breaking API change.

What to do instead: Maintain separate types for each boundary, as we describe in the Domain Modeling chapter. UserRow maps to the database. User represents the domain entity. UserResponse defines the API contract. Explicit From implementations bridge the gaps and make the conversions easy to audit.
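A compressed sketch of that boundary, collapsing the domain User in the middle for brevity (field names are illustrative):

```rust
// Database row: mirrors the table, including fields that must never leak.
struct UserRow {
    id: u64,
    email: String,
    password_hash: String,
}

// API contract: only what consumers are meant to see.
#[derive(Debug, PartialEq)]
struct UserResponse {
    id: u64,
    email: String,
}

// The explicit conversion is the audit point: adding a column to UserRow
// does NOT change the API unless someone adds it here on purpose.
impl From<UserRow> for UserResponse {
    fn from(row: UserRow) -> Self {
        Self { id: row.id, email: row.email }
    }
}

fn main() {
    let row = UserRow {
        id: 1,
        email: String::from("a@example.com"),
        password_hash: String::from("secret-hash"),
    };
    let resp = UserResponse::from(row);
    assert_eq!(resp, UserResponse { id: 1, email: String::from("a@example.com") });
}
```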

Exposing internal errors to clients

What it looks like: Returning the raw error message from a database error, a file system error, or an internal service call directly in the HTTP response body.

Why this hurts: Internal error messages can contain database table names, file paths, SQL queries, stack traces, and other information that helps an attacker understand your system’s internals. Even without malicious intent, these messages confuse API consumers because they describe implementation details rather than the actual business problem.

What to do instead: Log the full error details server-side at the error level. Return a generic, user-facing message to the client. The AppError::Internal variant we describe in the Error Handling chapter handles this pattern cleanly.
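The split in miniature, with eprintln! standing in for the tracing::error! call and the AppError machinery from the Error Handling chapter:

```rust
use std::io;

// Log the detail server-side; hand the client a generic message.
fn to_client_message(err: &dyn std::error::Error) -> &'static str {
    // Server-side: full detail for the logs.
    eprintln!("internal error: {err}");
    // Client-side: nothing that reveals internals.
    "internal server error"
}

fn main() {
    // Simulated database error whose text would leak schema details.
    let db_err = io::Error::other("relation \"users\" does not exist");
    assert_eq!(to_client_message(&db_err), "internal server error");
}
```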

Hardcoded secrets

What it looks like: A JWT secret or database password defined as a string literal in the source code, or committed to version control in a .env file.

Why this hurts: Anyone with access to the source code (or the git history) now has your production credentials. That includes every developer on the team, every CI system that checks out the repo, and anyone who finds the repo if it ever accidentally goes public.

What to do instead: Load secrets from environment variables, as we describe in the Configuration chapter. Use SecretString from the secrecy crate to protect them in memory. Add .env to .gitignore. Provide a .env.example file that documents the required variables without containing actual values.
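A minimal sketch of the loading side using only the standard library; a real project would layer dotenvy and secrecy on top as described in the Configuration chapter, and JWT_SECRET is a hypothetical variable name:

```rust
use std::env;

// Fail fast at startup if a required secret is missing. There is no
// fallback literal baked into the binary.
fn require_env(name: &str) -> Result<String, String> {
    env::var(name).map_err(|_| format!("missing required env var: {name}"))
}

fn main() {
    match require_env("JWT_SECRET") {
        Ok(_secret) => println!("config loaded"),
        Err(e) => eprintln!("startup error: {e}"),
    }
}
```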

Cross-layer imports

What it looks like: The domain module importing something from the api module (like an Axum extractor type) or from the infra module (like a PgPool).

Why this hurts: It breaks the dependency rule that’s the foundation of clean architecture. If the domain depends on Axum, you can’t use it in a CLI tool without pulling in the entire HTTP framework. If it depends on SQLx, you can’t test it without a database.

What to do instead: Dependencies point inward. The API layer depends on the domain. The infrastructure layer depends on the domain. The domain depends on nothing external. If the domain needs a capability from the outside world, it defines a trait (a port) that the infrastructure layer implements (as an adapter). We cover this in detail in the architecture chapters.

Blocking the async runtime

What it looks like: Calling CPU-intensive functions (like Argon2 password hashing), synchronous file I/O, or blocking library calls directly inside an async handler.

Why this hurts: Tokio’s runtime uses a small number of worker threads to run many async tasks cooperatively. A task that blocks a thread prevents all other tasks on that thread from making progress. Under load, this causes latency spikes and throughput degradation that are really hard to diagnose.

What to do instead: Use tokio::task::spawn_blocking() for CPU-intensive or blocking work. Use tokio::fs instead of std::fs. Make sure your database driver is truly async (SQLx and diesel-async are; the base Diesel is not).

String-typed domain values

What it looks like: Representing emails, usernames, and other domain values as plain String types throughout the application.

Why this hurts: A function that takes (String, String, String) for name, email, and password can be called with the arguments in any order, and the compiler won’t catch the mistake. You also lose the ability to enforce invariants (like “email must contain @”) in a single place, which leads to validation logic scattered all over the codebase.

What to do instead: Use the newtype pattern we describe in the Domain Modeling chapter. Define types like Email, UserName, and UserId that enforce their invariants at construction time. It might seem like extra boilerplate at first, but it pays for itself quickly.
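A bare-bones version of such a newtype; the validation rule here is deliberately naive, just enough to show the single choke point:

```rust
// The only way to get an Email is through parse(), so every Email in the
// system has passed the same check exactly once.
#[derive(Debug, Clone, PartialEq)]
pub struct Email(String);

impl Email {
    pub fn parse(raw: &str) -> Result<Self, String> {
        // Simplistic on purpose; real validation would be stricter.
        if raw.contains('@') && !raw.starts_with('@') {
            Ok(Self(raw.to_owned()))
        } else {
            Err(format!("invalid email: {raw}"))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

fn main() {
    assert!(Email::parse("alice@example.com").is_ok());
    assert!(Email::parse("not-an-email").is_err());
    // A function taking (UserName, Email) instead of (String, String)
    // can no longer have its arguments silently swapped.
}
```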

Missing health checks

What it looks like: An application deployed to a container orchestrator with no health check endpoints, or with a liveness probe that checks external dependencies.

Why this hurts: Without health checks, the orchestrator has no way to know whether your application is actually functioning. It can’t restart a stuck process or stop routing traffic to an instance that’s lost its database connection. And a liveness probe that checks the database will cause unnecessary restarts when the database is temporarily down, which makes the situation worse, not better.

What to do instead: Implement separate liveness and readiness probes, as we describe in the Deployment chapter. The liveness probe should be unconditional. The readiness probe should check dependencies.

Over-engineering from the start

What it looks like: A new project with three workspace crates, trait-based repositories with generic services, a CQRS pattern, and an event bus, all for an application that has five CRUD endpoints.

Why this hurts: Premature abstraction adds complexity that slows you down without providing proportionate benefit. Every additional layer of indirection is another thing to understand, another place where bugs can hide, and another barrier to making changes. In my experience, the projects that ship fastest are the ones that start simple and add structure when they actually need it.

What to do instead: Start with the simplest structure that gives you clean separation, as we describe in the Project Structure chapter. Add abstraction when the complexity of the problem actually demands it, not in anticipation of complexity that may never arrive. You can always refactor from a flat structure to a layered one when the need becomes clear. Don’t worry about getting the architecture “perfect” on day one.

Resources

I wanted to put together the books, articles, repos, and community resources that shaped how I think about the topics in this guide. If you’re looking to go deeper on anything we covered, this is where I’d start.

Books

Design Patterns and Best Practices in Rust by Evan Williams (2025) is a solid walkthrough of idiomatic Rust patterns. It covers GoF patterns adapted for Rust, functional patterns, and architectural patterns you’ll actually use in real projects.

The Rust Spellbook (2026) is packed with hundreds of Rust tips and techniques. It’s great for discovering lesser-known features of the language, the standard library, and the toolchain that you might not stumble on otherwise.

Rust Web Development by yours truly (Manning). This is my earlier book on building web services in Rust, covering Warp and the broader ecosystem. If you want more context on how I think about these problems, that’s a good place to look.

Architecture and design articles

Master Hexagonal Architecture in Rust is, in my experience, the most thorough guide to implementing hexagonal architecture (ports and adapters) in Rust. It walks through domain modeling, trait-based ports, adapter implementations, dependency injection, testing strategies, and the trade-offs you’ll hit along the way. I leaned on this heavily when writing the architecture chapters in this guide.

A Rustacean Clean Architecture Approach to Web Development describes a four-layer architecture (models, persistence, routers, documentation) with a pragmatic take on framework coupling. I think it’s a nice complement to the hexagonal architecture guide, because it shows a simpler approach that works well when your app is mostly CRUD.

Rust, Axum, and Onion Architecture walks through implementing onion architecture with Axum. It covers the presentation, application, domain, and infrastructure layers with concrete code examples you can follow along with.

Building a Clean Rust Backend with Axum, Diesel, PostgreSQL and DDD shows domain-driven design patterns with Axum and Diesel. You’ll find the repository pattern, layered error handling, and custom extractors all in one place.

The Best Way to Structure Rust Web Services covers project organization patterns, from flat modules to workspace-based clean architecture. What I like about it is the practical advice on when to pick each approach.

Axum-specific guides

The Ultimate Guide to Axum (Shuttle) covers a lot of ground: handlers, routing, state management, custom extractors, middleware, testing with oneshot, and deployment. If you want one article that touches all the Axum basics, this is a good pick.

How to Build Production-Ready REST APIs in Rust with Axum takes you through the complete lifecycle of a production API: project setup, error handling, validation, authentication, middleware composition, graceful shutdown, and Docker deployment. It’s a good end-to-end reference.

Building High-Performance APIs with Axum and Rust focuses on performance, which we didn’t dig into much in this guide. It covers benchmarking and optimization strategies specific to Axum.

Error Handling in Axum (LogRocket) goes through the various approaches to error handling in Axum, from simple status code returns to custom error types with IntoResponse. If our error handling chapter left you wanting more options, start here.

Elegant Error Handling with IntoResponse (Leapcell) shows the thiserror plus IntoResponse pattern for centralized error handling. It’s a clean approach that I’ve found works well in practice.
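To give a rough feel for the shape of that pattern, here is a std-only sketch: in the real version the `Display` impl is derived with `thiserror`, and the `status_code` helper becomes an `IntoResponse` impl that builds the actual HTTP response. The `ApiError` variants are mine, chosen for illustration.

```rust
use std::fmt;

// One central error enum for the whole API surface.
#[derive(Debug)]
enum ApiError {
    NotFound(String),
    Validation(String),
    Internal,
}

// With `thiserror` this impl would be replaced by #[derive(Error)]
// plus #[error("...")] attributes on each variant.
impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::NotFound(what) => write!(f, "{what} not found"),
            ApiError::Validation(msg) => write!(f, "validation failed: {msg}"),
            ApiError::Internal => write!(f, "internal server error"),
        }
    }
}

impl ApiError {
    // The single place where error variants map to HTTP statuses;
    // in axum this logic lives inside `impl IntoResponse for ApiError`.
    fn status_code(&self) -> u16 {
        match self {
            ApiError::NotFound(_) => 404,
            ApiError::Validation(_) => 422,
            ApiError::Internal => 500,
        }
    }
}

fn main() {
    let err = ApiError::NotFound("user 42".into());
    assert_eq!(err.status_code(), 404);
    assert_eq!(err.to_string(), "user 42 not found");
}
```

The payoff is that handlers just return `Result<_, ApiError>` and the status mapping lives in exactly one place.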

Domain modeling and type-driven design

Using Types to Guarantee Domain Invariants by Luca Palmieri is one of my favorite articles on the newtype pattern and “parse, don’t validate” applied to Rust web applications. If those ideas clicked for you in our earlier chapters, this goes deeper.
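The core idea can be sketched with a hypothetical `EmailAddress` newtype; the validation rule below is deliberately simplistic, since the point is where validation happens, not how thorough it is.

```rust
// "Parse, don't validate": construct a validated EmailAddress once at
// the boundary. Every function that accepts an EmailAddress can then
// rely on the invariant without re-checking it.
#[derive(Debug, Clone, PartialEq)]
struct EmailAddress(String);

impl EmailAddress {
    fn parse(raw: &str) -> Result<Self, String> {
        // Toy check for the sketch; real code would be stricter.
        if raw.contains('@') && !raw.starts_with('@') && !raw.ends_with('@') {
            Ok(EmailAddress(raw.to_owned()))
        } else {
            Err(format!("invalid email: {raw}"))
        }
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}

// This signature makes an unvalidated email unrepresentable here.
fn send_welcome(to: &EmailAddress) -> String {
    format!("sending welcome mail to {}", to.as_str())
}

fn main() {
    let email = EmailAddress::parse("ada@example.com").expect("valid email");
    assert_eq!(send_welcome(&email), "sending welcome mail to ada@example.com");
    assert!(EmailAddress::parse("not-an-email").is_err());
}
```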

Type-Driven Design in Rust and TypeScript compares type-driven approaches across both languages. If you’re coming from TypeScript (and many of us are), this helps bridge the mental models.

Rust Design Patterns is a community-maintained catalog of Rust design patterns, anti-patterns, and idioms. It includes the newtype pattern and plenty of other patterns that come up in web development. I find myself coming back to this one regularly.

Specific topics

Secure Configuration and Secrets Management in Rust covers the secrecy crate, environment variable management, and in-memory secret protection. If you want to go further than what we covered in our configuration chapter, this is a good next step.

How to Add Structured Logging to Rust HTTP APIs (techbuddies) walks through building a complete structured logging pipeline with correlation IDs and JSON output. Very practical, and you can follow along step by step.

Instrument Rust Axum with OpenTelemetry covers the full OpenTelemetry setup for Axum, including trace propagation and exporter configuration. Getting OTel right can be fiddly, so having a complete reference helps.

JWT Authentication in Rust (Shuttle) walks you through implementing JWT authentication with custom extractors, step by step.

Graceful Shutdown for Axum Servers covers signal handling, connection draining, and cleanup patterns. Graceful shutdown is one of those things you don’t think about until you need it, and then you really need it.

Health Checks and Readiness Probes in Rust for Kubernetes covers implementing liveness and readiness probes for container orchestrators. If you’re deploying to Kubernetes, you’ll want these.

Axum Backend Series: Models, Migrations, DTOs and Repository Pattern is a practical walkthrough of setting up the database layer with SQLx. It’s hands-on and gets you to working code quickly.

An Ergonomic Pattern for SQLx Queries in Axum describes the FromRef pattern for clean database access in handlers. Short read, but the pattern it shows is something I use all the time.
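The substate idea behind that pattern can be shown in std-only form; in axum you would write `impl FromRef<AppState> for DbPool` so a handler can take `State<DbPool>` even though the router holds the full `AppState`. `DbPool`, `AppState`, and `list_users` are illustrative names, with `DbPool` standing in for something like `sqlx::PgPool`.

```rust
// Stand-in for a real connection pool such as sqlx::PgPool.
#[derive(Clone, Debug, PartialEq)]
struct DbPool(String);

// The full application state the router would own.
#[derive(Clone)]
struct AppState {
    db: DbPool,
    api_key: String,
}

// In axum this conversion is `impl FromRef<AppState> for DbPool`;
// it lets handlers ask for just the narrow piece of state they need.
impl From<&AppState> for DbPool {
    fn from(state: &AppState) -> Self {
        state.db.clone()
    }
}

// A "handler" that only ever sees the pool, never the whole state.
fn list_users(pool: DbPool) -> String {
    format!("querying users via {}", pool.0)
}

fn main() {
    let state = AppState {
        db: DbPool("postgres://localhost/app".into()),
        api_key: "secret".into(),
    };
    let pool = DbPool::from(&state);
    assert_eq!(list_users(pool), "querying users via postgres://localhost/app");
    let _ = state.api_key; // unused in this sketch
}
```

Keeping handlers dependent on `DbPool` rather than `AppState` also makes them easier to test: you construct only the pool, not the whole state tree.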

Example repositories

rust10x/rust-web-app is a production blueprint for Axum web applications. It’s maintained as a reference implementation and includes things like multi-tenancy support that you won’t find in most example repos.

launchbadge/realworld-axum-sqlx implements the RealWorld spec (a Medium clone API) using Axum and SQLx. I like it because it’s a complete, working example of a non-trivial API, which is hard to find.

JoeyMckenzie/realworld-rust-axum-sqlx is another RealWorld implementation, but with a different architectural approach. Comparing the two side by side is a great way to see how different design decisions play out in practice.

jeremychone-channel/rust-axum-course has the code from a thorough Axum video course. It covers everything from basics to production patterns, and the commit history is nicely structured if you want to follow along.

tokio-rs/axum/examples has over 30 official examples maintained by the Axum team. When I’m unsure how something is supposed to work, this is usually the first place I check. It covers everything from basic routing to WebSockets, testing, and graceful shutdown.

Ecosystem references

Axum ECOSYSTEM.md is the official catalog of over 70 community-maintained crates for Axum. Before you build something yourself, check here first. It covers authentication, middleware, validation, observability, and all sorts of specialized features.

Idiomatic Rust (mre) is a peer-reviewed collection of articles, talks, and repositories that teach concise, idiomatic Rust. It’s a rabbit hole, but a worthwhile one.

Rust API Guidelines has an extensive list of recommendations for designing idiomatic Rust APIs. It’s useful for both your internal module APIs and your public HTTP API, and it’s where I’d point anyone who asks “how should I design this interface?”