= Caddy: The Edge Proxy and TLS Terminator =

== The Role of a Reverse Proxy ==

In your architecture, Caddy serves as the edge proxy—the only service that directly faces the internet. Every HTTP request to your server hits Caddy first. This design pattern provides several crucial benefits:

*Single Point of TLS Termination:* Caddy handles all TLS certificate management. Your backend services speak plain HTTP on localhost. They don't need to know about certificates, domain names, or TLS configuration. You configure TLS once, in Caddy, which greatly simplifies every other service.

*Unified Access Control:* Rate limiting, IP filtering, CORS headers, and security policies are all implemented in one place. You don't have to configure these in every service.

*Service Isolation:* Backend services bind only to localhost (127.0.0.1), so they are not reachable from the internet directly. Even if someone learns which port a service listens on, they cannot connect to it from outside and bypass Caddy's protections.

*Easy Service Swapping:* Want to replace a backend service? Just change where Caddy proxies to. DNS and public-facing URLs stay the same, so users never notice that you switched implementations.

== How Caddy Fits into NixOS ==

Caddy in NixOS follows the standard service module pattern. The NixOS Caddy module exposes options that let you configure:

- Which Caddy package to use (important for plugins)
- Global configuration (settings that apply to the entire server)
- Virtual hosts (per-domain configuration)
- Log settings
- Administrative endpoints

The module translates your Nix configuration into a Caddyfile—the native text format Caddy uses. When you run nixos-rebuild switch, NixOS generates this file and tells Caddy to reload its configuration.
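
The options listed above map onto the NixOS `services.caddy` module roughly as follows. This is a minimal sketch rather than your actual configuration: the domain, e-mail address, and backend port are placeholders chosen for illustration.

{{{nix
{ pkgs, ... }:
{
  services.caddy = {
    enable = true;
    # Which Caddy package to run; swap in a custom build to add plugins.
    package = pkgs.caddy;
    # Contact address for the ACME account (placeholder).
    email = "admin@example.com";
    # Global options, rendered at the top of the generated Caddyfile.
    globalConfig = ''
      admin localhost:2019
    '';
    # One attribute per site block; the attribute name is the site address.
    virtualHosts."example.com".extraConfig = ''
      reverse_proxy 127.0.0.1:8080
    '';
  };
}
}}}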

== The Caddyfile Structure ==

Understanding Caddy's native configuration format helps you understand what the NixOS module is doing. A Caddyfile has a simple structure:

*Global options* appear at the top of the file in their own block, outside any site block. These apply server-wide.

*Site blocks* start with the domain name (or names) they handle. Everything inside the braces of a site block configures that site.

*Directives* are instructions like `reverse_proxy`, `file_server`, or `header`. They can have parameters and sub-blocks.

*Matchers* are conditions that determine when a directive applies. A matcher can be a path (starting with `/`), the wildcard `*`, or a named matcher, which starts with `@`.
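
To make those four pieces concrete, here is a small sketch written the way the NixOS module expects it: global options go in `globalConfig`, and the body of each site block goes in `extraConfig`. The domain, paths, and port are illustrative assumptions.

{{{nix
{
  services.caddy = {
    # Global options: rendered once, before any site block.
    globalConfig = ''
      email admin@example.com
    '';
    # Site block for example.com; the string becomes the block's body.
    virtualHosts."example.com".extraConfig = ''
      # Named matcher: requests whose path starts with /api/
      @api path /api/*
      # Directive with a matcher token and one argument
      reverse_proxy @api 127.0.0.1:8080
      # Directives without a matcher apply to every remaining request
      root * /var/www/example.com
      file_server
    '';
  };
}
}}}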

== Virtual Hosts and Domain Mapping ==

A virtual host is a domain (or set of domains) that Caddy handles. When a request comes in, Caddy looks at the Host header and matches it against configured virtual hosts.

You can configure multiple domains to point to the same virtual host configuration. This is useful for apex domains and www subdomains, or for short aliases.

Caddy automatically obtains certificates for all domains in a virtual host. When it loads a configuration that names a domain it has no certificate for, Caddy contacts an ACME certificate authority (Let's Encrypt by default), proves you control the domain (via the HTTP or TLS-ALPN challenge), and obtains a certificate. You never configure certificates manually.
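
In the NixOS module, additional names for the same site can be listed with `serverAliases`. A minimal sketch, assuming a `www` alias for the apex domain (the alias itself is an assumption for illustration):

{{{nix
{
  services.caddy.virtualHosts."snek.cc" = {
    # Extra names handled by the same site block; each is covered by
    # automatic HTTPS just like the primary name.
    serverAliases = [ "www.snek.cc" ];
    extraConfig = ''
      root * /var/www/snek.cc
      file_server
    '';
  };
}
}}}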

== TLS and Certificate Management ==

Caddy's automatic HTTPS is one of its standout features. Here's what happens:

*Issuance:* When Caddy starts or reloads with a domain it doesn't yet have a certificate for, it runs the ACME protocol against Let's Encrypt. For the HTTP challenge it serves a token under /.well-known/acme-challenge/ to prove domain control. Once verified, Let's Encrypt issues a certificate, which Caddy stores and begins using. Only on-demand TLS, described below, defers issuance until the first request.

*Renewal:* Caddy tracks certificate expiration. Before a certificate expires, it automatically initiates renewal. This happens in the background without downtime.

*Storage:* Certificates are stored in Caddy's data directory (in your configuration, /var/lib/caddy). This persists across reboots but is machine-specific.

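*On-Demand TLS:* Standard automatic HTTPS works for domains you know in advance. But your PDS needs certificates for user subdomains (`*.pds.snek.cc`) where you don't know all the subdomains ahead of time. A true wildcard certificate would require the DNS challenge; on-demand TLS avoids that by issuing one certificate per hostname the first time it is requested.

On-demand TLS solves this. When a TLS handshake arrives for a subdomain Caddy has no certificate for, Caddy asks your backend service "is this a valid subdomain?" via an HTTP request that carries the hostname in a `domain` query parameter. If the backend confirms (returns a 2xx status), Caddy obtains a certificate on the spot. If the backend denies (any other status), Caddy does not request a certificate and the handshake fails.

This is configured in the global options block with `on_demand_tls` and its `ask` option pointing to your PDS's validation endpoint. The site block then enables it with `on_demand` inside its `tls` directive.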
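
A sketch of how the two halves fit together. The backend port `3000` and the `/tls-check` path are assumptions for illustration; substitute whatever validation endpoint your PDS actually exposes.

{{{nix
{
  services.caddy = {
    globalConfig = ''
      on_demand_tls {
        # Caddy appends ?domain=<hostname> to this URL; a 2xx reply
        # permits issuance. Port and path are assumed here.
        ask http://127.0.0.1:3000/tls-check
      }
    '';
    virtualHosts."*.pds.snek.cc".extraConfig = ''
      tls {
        # Obtain a certificate per hostname at handshake time.
        on_demand
      }
      reverse_proxy 127.0.0.1:3000
    '';
  };
}
}}}

Without a working `ask` endpoint, anyone could point arbitrary hostnames at the server and trigger certificate requests, so the check is what keeps on-demand issuance bounded.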

== The Reverse Proxy Directive ==

The reverse_proxy directive is the workhorse of your configuration. It forwards incoming requests to a backend service.

In its simplest form, you just specify the backend URL:

{{{
reverse_proxy http://127.0.0.1:8080
}}}

But it supports many options:

*Transport configuration* lets you customize how Caddy connects to the backend:
- Timeouts (how long to wait for connections and responses)
- Keepalive settings (how long to keep connections open for reuse)
- TLS settings (if the backend requires HTTPS)

*Header manipulation* lets you modify headers before sending to the backend or after receiving the response:
- `header_up` modifies headers going to the backend
- `header_down` modifies headers coming from the backend

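This is crucial for things like stripping the Origin header (which can cause WebSocket issues). Caddy already passes X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Host to the backend by default, so the backend knows about the original request; you only need `header_up` when it expects something different.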

*WebSocket support* is automatic. If a request has WebSocket upgrade headers, Caddy handles the protocol upgrade and maintains the connection. For long-lived WebSocket connections the backend transport must not time out: Caddy's HTTP transport applies no read or write timeout by default, and setting `read_timeout 0` and `write_timeout 0` makes that explicit.
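
The options above combine inside a single `reverse_proxy` block. The following is a sketch with placeholder values: the domain, port, durations, and the choice of which headers to strip are assumptions, not your actual settings.

{{{nix
{
  services.caddy.virtualHosts."api.example.com".extraConfig = ''
    reverse_proxy 127.0.0.1:8080 {
      # Remove the Origin header before the request reaches the backend.
      header_up -Origin
      # Remove a response header coming back from the backend.
      header_down -Server
      transport http {
        # Give up if the backend does not accept a connection in time.
        dial_timeout 5s
        # Give up if the backend sends no response headers in time.
        response_header_timeout 30s
        # Keep idle backend connections open for reuse.
        keepalive 90s
      }
    }
  '';
}
}}}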

== Rate Limiting with Plugins ==

Standard Caddy doesn't include rate limiting. You need the `caddy-ratelimit` plugin, which is why your configuration uses `pkgs.caddy.withPlugins` to build a custom Caddy binary that includes this plugin.

Rate limiting configuration uses several concepts:

*Matchers* define when the rate limit applies. Your configuration uses `@api_limit` which matches requests NOT from localhost. This means local services can make unlimited requests, but external requests are limited.

*Zones* group requests for rate limiting purposes. In your case, all matched requests share the same limit counter.

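*Rate* specifies the limit: `10r/s` means 10 requests per second. The `burst 20` allows temporary spikes: up to 20 requests can arrive in quick succession, after which the sustained rate must drop back to 10 per second. The exact syntax depends on the plugin. mholt's `caddy-ratelimit`, for example, expresses a limit as a number of `events` per `window` and has no separate burst setting.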

*Key insight:* Rate limiting happens at Caddy, before requests reach your backend services. This protects your services from being overwhelmed by traffic.
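
As an illustration, here is how a limit of 10 requests per second for non-local clients might look, assuming the plugin is mholt's `caddy-ratelimit`. That plugin expresses the limit as `events` per `window` rather than as an `r/s` rate with a burst, so this is a sketch of the idea and not a copy of your configuration. The zone name, domain, and port are placeholders.

{{{nix
{
  services.caddy.virtualHosts."api.example.com".extraConfig = ''
    # route keeps handlers in the order written, so the limiter runs
    # before the proxy without needing a global order option.
    route {
      rate_limit {
        zone api_limit {
          # Only count requests that do not come from localhost.
          match {
            not remote_ip 127.0.0.1 ::1
          }
          # A static key means all matched requests share one counter.
          key static
          events 10
          window 1s
        }
      }
      reverse_proxy 127.0.0.1:8080
    }
  '';
}
}}}

Replacing the static key with a placeholder such as `{remote_host}` would give each client its own counter instead of one shared limit.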

== Headers and Security ==

HTTP headers are a critical security mechanism. Your configuration sets several security headers:

*X-Frame-Options: SAMEORIGIN* prevents your site from being embedded in iframes on other domains. This stops clickjacking attacks.

*X-Content-Type-Options: nosniff* tells browsers not to guess the content type. If you say something is text/plain, the browser won't try to execute it as JavaScript even if it looks like code.

*X-XSS-Protection: 1; mode=block* enables the built-in XSS filter in older browsers that still support it. Current browsers have removed that filter and ignore the header.

*Referrer-Policy: strict-origin-when-cross-origin* controls what information is sent in the Referer header when navigating from your site to others. This prevents leaking sensitive URL information.

These headers are added to every response, providing baseline security for all your services.
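
In Caddyfile terms these are set with one `header` block per site. A minimal sketch with a placeholder domain and backend:

{{{nix
{
  services.caddy.virtualHosts."example.com".extraConfig = ''
    # Added to every response served by this site.
    header {
      X-Frame-Options "SAMEORIGIN"
      X-Content-Type-Options "nosniff"
      X-XSS-Protection "1; mode=block"
      Referrer-Policy "strict-origin-when-cross-origin"
    }
    reverse_proxy 127.0.0.1:8080
  '';
}
}}}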

== Static File Serving ==

For domains that serve static content (your main website, pdsls, atproto-nix.org), you use the `file_server` directive. This serves files from a directory.

The `root` directive sets the base directory. Requests for `/foo/bar` look for files at `/var/www/snek.cc/foo/bar`.

*Cache headers* are important for static assets. You set `Cache-Control: public, max-age=31536000, immutable` for assets that never change (fonts, hashed JavaScript bundles). This tells browsers and CDNs to cache these for a year.

For content that changes occasionally (like your headers page), you use a shorter cache time (86400 seconds = 1 day).
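
A sketch of a static site with both cache lifetimes. The matcher patterns (`*.woff2`, `/assets/*`, `*.html`) are assumptions about how the files are laid out; adjust them to your own directory structure.

{{{nix
{
  services.caddy.virtualHosts."snek.cc".extraConfig = ''
    root * /var/www/snek.cc

    # Assets that never change: cache for a year.
    @immutable path *.woff2 /assets/*
    header @immutable Cache-Control "public, max-age=31536000, immutable"

    # Content that changes occasionally: cache for a day.
    @daily path *.html
    header @daily Cache-Control "public, max-age=86400"

    file_server
  '';
}
}}}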

== The Helper Function Pattern ==

Your configuration uses helper functions (`mkRateLimitedProxy`, `mkCommonCaddyHeaders`) to generate Caddyfile snippets. This is a powerful pattern that reduces duplication and makes changes easier.

When you write a function like `mkRateLimitedProxy`, you're writing Nix code that returns a string of Caddyfile syntax. At build time, Nix calls these functions and substitutes the results into your configuration.

This is configuration generation—you're not writing the Caddyfile directly, you're writing a program that generates it. The benefit is that common patterns are defined once and reused. If you need to change your rate limiting approach (different limits, different matchers), you change it in one place.

== WebSocket Proxying ==

Several of your services use WebSockets (the relay, PDS, some ATProto services). WebSocket connections are different from regular HTTP:

- They're long-lived (stay open for minutes or hours)
- They're bidirectional (both client and server can send data)
- They start as HTTP requests but "upgrade" to a different protocol

Caddy handles all of this automatically. When it sees WebSocket upgrade headers, it manages the protocol switch and maintains the connection.

*Transport settings for WebSockets:*
- `read_timeout 0` and `write_timeout 0` disable timeouts. For long-lived connections, you don't want Caddy closing them for inactivity.
- `keepalive` settings control how long idle connections to the backend stay open for reuse
- `flush_interval -1` makes Caddy flush data to the client immediately after each write instead of buffering, which matters for streamed, real-time responses
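
Put together, a proxy block tuned for long-lived connections might look like this sketch. The domain, port, and keepalive duration are placeholders.

{{{nix
{
  services.caddy.virtualHosts."relay.example.com".extraConfig = ''
    reverse_proxy 127.0.0.1:2470 {
      # Flush to the client after every write instead of buffering.
      flush_interval -1
      transport http {
        # 0 means no timeout while reading from or writing to the backend.
        read_timeout 0
        write_timeout 0
        # How long idle backend connections are kept for reuse.
        keepalive 5m
      }
    }
  '';
}
}}}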

== Redirects and Rewrites ==

Your configuration includes a redirect from `ss.snek.cc` to `slingshot.snek.cc`. This uses the `redir` directive.

The `permanent` flag means Caddy sends a 301 (permanent) redirect. Browsers remember it and send future requests directly to the new URL.

The `{uri}` placeholder includes the original request path, so `ss.snek.cc/foo/bar` redirects to `slingshot.snek.cc/foo/bar`.
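
As a sketch, the whole redirect site is a single directive (the `https` scheme on the target is an assumption):

{{{nix
{
  services.caddy.virtualHosts."ss.snek.cc".extraConfig = ''
    # {uri} carries over the original path and query string.
    redir https://slingshot.snek.cc{uri} permanent
  '';
}
}}}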

== Caddy Plugins and Custom Builds ==

Caddy's plugin system allows extending its functionality. The rate limiting plugin adds the `rate_limit` directive.

Since plugins aren't in the standard Caddy binary, you need to build a custom one. The `pkgs.caddy.withPlugins` function does this:

1. It takes a list of plugins (specified as Go import paths with versions)
2. It downloads the Caddy source code
3. It builds Caddy with those plugins included
4. It caches the resulting binary

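The `hash` parameter is a content hash that Nix uses for reproducibility. It pins the exact sources fetched for the build. If you change the plugin list or a plugin version, the hash no longer matches and the build fails until you update it.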
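
A sketch of such a custom build. The plugin version shown is an assumption; pin whichever release or pseudo-version you actually use. `lib.fakeHash` is a deliberate placeholder: the first build fails and prints the real hash, which you then paste in.

{{{nix
{ pkgs, lib, ... }:
{
  services.caddy.package = pkgs.caddy.withPlugins {
    # Go module path plus version for each plugin (version assumed here).
    plugins = [ "github.com/mholt/caddy-ratelimit@v0.1.0" ];
    # Placeholder; replace with the hash reported by the failed build.
    hash = lib.fakeHash;
  };
}
}}}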

== Logging and Observability ==

Caddy can log requests to files or other destinations. Your configuration logs to `/var/log/caddy/access.log`.

These logs are invaluable for debugging and monitoring. They show:
- What domains are being requested
- Response times
- Status codes
- Client IPs (as seen by Caddy; if another proxy sits in front, this is that proxy's address unless trusted proxies are configured)

You can import these logs into monitoring systems or analyze them for traffic patterns.
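
In the NixOS module the per-site log settings live in the `logFormat` option, whose contents become the body of that site's `log` directive. A minimal sketch, assuming JSON output is wanted for easier ingestion:

{{{nix
{
  services.caddy.virtualHosts."snek.cc".logFormat = ''
    # Write access logs for this site to a single file.
    output file /var/log/caddy/access.log
    # Structured output is easier to feed into monitoring tools.
    format json
  '';
}
}}}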

== Why This Architecture Scales ==

The edge proxy pattern scales well because:

*Horizontal scaling:* You can run multiple backend instances and use Caddy's load balancing (adding `lb_policy` to the reverse_proxy directive).

*Service independence:* Each backend service is a separate concern. They can be written in different languages, deployed independently, and scaled separately.

*Security boundary:* Caddy is your security boundary. Everything behind it is trusted (on localhost). You can focus security efforts on Caddy configuration.

*Operational simplicity:* Want to take a service down for maintenance? Just stop the backend. Caddy keeps answering for the domain, so users get a 502 error instead of connection refused, and you can replace that error with a nice maintenance page using `handle_errors`.
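
Both the load-balancing and the maintenance-page ideas can be sketched in a few lines. The two upstream ports and the response text are placeholders, and the error handler here is deliberately blunt: it answers every error on the site the same way.

{{{nix
{
  services.caddy.virtualHosts."api.example.com".extraConfig = ''
    # Two instances of the same backend; pick the one with the
    # fewest active connections.
    reverse_proxy 127.0.0.1:8080 127.0.0.1:8081 {
      lb_policy least_conn
    }
    # Runs when a handler fails, for example when no backend answers.
    handle_errors {
      respond "Down for maintenance" 503
    }
  '';
}
}}}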

== Common Pitfalls ==

*Port conflicts:* Services often use the same default ports (3000, 8080). You must ensure each service uses a unique port that Caddy proxies to.

*Localhost vs 0.0.0.0:* Services should bind to localhost (127.0.0.1) so they're only accessible through Caddy. If a service binds to 0.0.0.0, it's accessible directly, bypassing Caddy's protections.

*Certificate limits:* Let's Encrypt has rate limits (50 certificates per registered domain per week). On-demand TLS with many subdomains can hit these limits, since every subdomain needs its own certificate. Production deployments often use wildcard certificates (which require the DNS challenge) or commercial certificates instead.

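*Host matching:* Caddy matches requests to virtual hosts based on the Host header. If you have overlapping domains (like `*.snek.cc` and `foo.snek.cc`), the more specific name should win. Caddy sorts site blocks by specificity when it adapts the Caddyfile, so `foo.snek.cc` takes precedence over the wildcard, but listing the specific host first keeps the intent obvious to readers.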
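
As a second line of defence against the binding pitfall, the firewall can be limited to the ports Caddy itself listens on. A minimal sketch using the standard NixOS firewall options (opening UDP 443 assumes you want HTTP/3):

{{{nix
{
  networking.firewall = {
    enable = true;
    # HTTP (for redirects and ACME challenges) and HTTPS.
    allowedTCPPorts = [ 80 443 ];
    # HTTP/3 runs over QUIC on UDP 443.
    allowedUDPPorts = [ 443 ];
  };
}
}}}

With this in place, a backend that accidentally binds to 0.0.0.0 is still unreachable from outside, although binding to 127.0.0.1 remains the primary protection.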

== Integration with NixOS ==

The NixOS Caddy module handles:
- Installing the Caddy binary
- Creating the systemd service
- Generating and updating the Caddyfile
- Managing the data directory for certificates
- Setting up log directories

When you change Caddy configuration and run `nixos-rebuild switch`, NixOS:
1. Generates a new Caddyfile
2. Validates it (catches syntax errors before applying)
3. Reloads Caddy (graceful, doesn't drop connections)

This is the power of declarative configuration: you describe the desired state, and NixOS figures out how to get there safely.

== References ==

- [[https://caddyserver.com/docs/|Caddy Documentation]] - Official Caddy documentation
- [[https://caddyserver.com/docs/quick-starts/reverse-proxy|Reverse Proxy Quick Start]] - Caddy reverse proxy guide
- [[https://caddyserver.com/docs/caddyfile/directives/reverse_proxy|reverse_proxy directive]] - Complete directive reference