
Lycan: Custom Feed Generation for Bluesky

What is a Feed Generator?

On Bluesky, your home timeline isn't the only way to see content. Feeds are customizable algorithms that select and order posts. When you subscribe to a feed in a Bluesky client, you're asking a feed generator to provide you with a curated list of posts.

The default feeds ("Following", "Discover") are provided by Bluesky's official AppView. But anyone can build and host custom feeds. Want a feed of only cat photos? Posts from people within 10 miles of you? Content in a specific language? All possible with custom feed generators.

Lycan is one such feed generator—it listens to the ATProto firehose, indexes posts that match certain criteria, and provides an API that Bluesky clients can query.

The Feed Generator Architecture

Feed generators operate on a publisher-consumer model:

Firehose (Publisher): Streams every post on the network

Feed Generator (Consumer/Processor): Listens to the firehose and builds indexes

Bluesky AppView (Consumer): Queries the feed generator for posts to display

Client (Display): Shows the feed to users

This separation means:

How Lycan Works

Lycan follows a three-stage pipeline:

Stage 1: Firehose Consumption

Like Spacedust, Lycan connects to a Jetstream endpoint and consumes the firehose. But unlike Spacedust, which indexes everything, Lycan is selective—it only cares about posts (not likes, follows, profile updates, etc.).

Lycan maintains a persistent connection and uses cursors to resume if disconnected. The jetstreamHost setting specifies which regional endpoint to connect to.
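As a rough illustration of the connection and cursor logic, the sketch below builds a Jetstream subscription URL restricted to post records and tracks the cursor to persist. The wantedCollections and cursor query parameters and the time_us event field come from Jetstream itself; the function names and the example host are hypothetical, and Lycan's own implementation may look quite different.

```python
from urllib.parse import urlencode

# Collection NSID for Bluesky posts.
POST_COLLECTION = "app.bsky.feed.post"


def build_subscribe_url(jetstream_host, cursor=None):
    """Build a Jetstream WebSocket URL that only asks for post records."""
    params = [("wantedCollections", POST_COLLECTION)]
    if cursor is not None:
        # Jetstream cursors are Unix timestamps in microseconds.
        params.append(("cursor", str(cursor)))
    return f"wss://{jetstream_host}/subscribe?{urlencode(params)}"


def next_cursor(event, current_cursor):
    """Return the cursor to persist after processing one event."""
    time_us = event.get("time_us")
    if time_us is None:
        return current_cursor
    # Never move the cursor backwards if events arrive slightly out of order.
    return time_us if current_cursor is None else max(current_cursor, time_us)
```

On reconnect, passing the last persisted cursor to build_subscribe_url asks Jetstream to replay events from that point, so a short outage does not leave a gap in the index.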

Stage 2: Filtering and Storage

When Lycan receives a post, it applies filters to decide whether to include it in feeds. Filters might check:

Posts that pass the filter are stored in PostgreSQL. The database schema is optimized for the types of queries feed algorithms need—often involving time ranges, author lookups, and full-text search.
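A minimal sketch of the filtering step might look like the following. It assumes Jetstream-style commit events and a feed defined by languages and keywords; the filter criteria and all function names are illustrative assumptions, not Lycan's actual rules.

```python
def is_new_post(event):
    """True for Jetstream commit events that create a post record."""
    commit = event.get("commit") or {}
    return (
        event.get("kind") == "commit"
        and commit.get("operation") == "create"
        and commit.get("collection") == "app.bsky.feed.post"
    )


def matches_feed(record, keywords, languages):
    """Hypothetical filter: language tag plus case-insensitive keyword match."""
    langs = record.get("langs") or []
    # An empty `languages` set means "accept any language".
    if languages and not any(lang in languages for lang in langs):
        return False
    text = (record.get("text") or "").lower()
    return any(keyword.lower() in text for keyword in keywords)


def post_uri(event):
    """AT URI of the record carried by a commit event."""
    commit = event["commit"]
    return f"at://{event['did']}/{commit['collection']}/{commit['rkey']}"
```

Only events that pass both is_new_post and matches_feed would be written to the database, keyed by the URI from post_uri.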

Stage 3: Feed API

Lycan exposes HTTP endpoints that the Bluesky AppView calls when a user requests a feed. The endpoint:

1. Authenticates the request (verifies the user is who they claim to be)

2. Looks up the feed configuration (what algorithm to apply)

3. Queries the database for matching posts

4. Sorts according to the algorithm (chronological, engagement-based, etc.)

5. Returns a list of post URIs

Important: The feed generator doesn't return full post content, just URIs (references to posts). The AppView then hydrates those URIs into full post views from its own index of the network. This separation spares feed generators from having to store and serve full post data.
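To make the response shape concrete, here is a small sketch of the last step. The feed and cursor keys follow the app.bsky.feed.getFeedSkeleton response format; the row layout and the timestamp-based cursor are assumptions made for illustration.

```python
def build_skeleton(rows, limit):
    """Turn (uri, indexed_at) rows, newest first, into a skeleton response.

    `rows` should hold up to limit + 1 entries so the function can tell
    whether another page exists.
    """
    page = rows[:limit]
    response = {"feed": [{"post": uri} for uri, _ in page]}
    if len(rows) > limit and page:
        # Hypothetical cursor format: the timestamp of the last item returned.
        response["cursor"] = str(page[-1][1])
    return response
```

The client sends the cursor back on its next request, and the generator uses it to fetch the following page (for example, rows with a timestamp older than the cursor value).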

Feed Algorithms

The "algorithm" is the heart of a feed generator. It's the logic that decides what appears in the feed and in what order.

Chronological Feed

The simplest algorithm: posts appear in the order they were published, newest first. This is what users typically expect from a "Following" feed.

Implementation: Query posts by timestamp, return most recent N.

Engagement-Based Feed

Posts are ranked by predicted engagement (likes, replies, reposts). More engaging posts appear higher.

Implementation: Calculate engagement scores, sort by score, return top N.
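One simple way to approximate this is a weighted interaction count that decays with age. The sketch below uses observed counts rather than a prediction model, and the weights, the gravity exponent, and the field names are all arbitrary assumptions chosen for illustration.

```python
def engagement_score(likes, reposts, replies, age_hours, gravity=1.5):
    """Weighted interaction count divided by a power of the post's age."""
    weighted = likes + 2 * reposts + 3 * replies
    # The +2 offset keeps brand-new posts from getting an unbounded score.
    return weighted / (age_hours + 2) ** gravity


def top_posts(posts, n):
    """Return the URIs of the n highest-scoring posts."""
    ranked = sorted(
        posts,
        key=lambda p: engagement_score(
            p["likes"], p["reposts"], p["replies"], p["age_hours"]
        ),
        reverse=True,
    )
    return [p["uri"] for p in ranked[:n]]
```

With these defaults a one-hour-old post with 10 likes outranks a two-day-old post with 100 likes, which keeps the feed from being dominated by old viral posts.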

Challenges:

Reverse-Chronological with Quality Filter

Like chronological, but filters out low-quality posts (spam, duplicates, very short posts).

Implementation: Apply heuristics to filter, then sort by time.

Geographic Feed

Posts from users near a specific location.

Implementation: Filter by geolocation data (if available), sort by time.

Challenges:

Interest-Based Feed

Posts matching topics the user has shown interest in.

Implementation: Track user interactions, build interest profile, filter posts by relevance.

Challenges:

Composite Feeds

Combine multiple algorithms—some posts by time, some by engagement, mix in trending topics.

Implementation: Run multiple queries, merge and interleave results.
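The merge step can be sketched as a round-robin interleave with de-duplication. This is only one possible merging policy, and the function name is hypothetical.

```python
from itertools import zip_longest


def interleave(*ranked_lists, limit=None):
    """Round-robin merge of ranked URI lists, dropping duplicates."""
    seen = set()
    merged = []
    # zip_longest pads shorter lists with None, which is skipped below.
    for group in zip_longest(*ranked_lists):
        for uri in group:
            if uri is None or uri in seen:
                continue
            seen.add(uri)
            merged.append(uri)
    return merged if limit is None else merged[:limit]
```

For example, interleaving a chronological list with an engagement-ranked list alternates between the two sources while ensuring a post that appears in both is shown only once.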

The Database Layer

Lycan uses PostgreSQL for data storage. The schema typically includes:

posts table:

authors table:

interactions table:

feed_cursors table:

This schema supports the queries feed algorithms need while being compact enough to handle millions of posts.

Indexing Strategy

Efficient indexing is crucial for feed performance:

Time-based queries: Index on created_at for chronological feeds

Author queries: Index on author_did for author-specific feeds

Full-text search: GIN index on text for keyword matching

Geographic queries: PostGIS extension and spatial indexes (if using location)

Without proper indexes, feed queries become slow as the database grows.

PostgreSQL Integration with NixOS

The NixOS module for Lycan can manage PostgreSQL automatically:

When database.createLocally = true:

1. The Lycan module ensures PostgreSQL is enabled

2. It creates a database named "lycan"

3. It creates a user named "lycan" with ownership

4. It configures Lycan to connect via Unix socket

This is convenient for single-server setups. For production or high-scale deployments, you'd manage PostgreSQL separately and provide connection details to Lycan.

Connection management: Lycan maintains a connection pool to PostgreSQL. The pool size is configured based on expected concurrency. Too few connections and requests queue up; too many and you waste resources.

Authentication and Authorization

When the Bluesky AppView queries a feed, it includes authentication proving which user is requesting the feed. Lycan validates this:

1. Extract JWT from the request

2. Validate the JWT signature (using the signing key published in the requesting user's DID document)

3. Extract the user's DID from the JWT

4. Use this DID to personalize the feed (if the algorithm supports personalization)

Why personalization matters: Some feeds show the same content to everyone (global feeds). Others show different content based on who you are (your personalized "Following" feed). The authentication lets Lycan know who you are for personalized feeds.
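The claim-handling part of these steps can be sketched with the standard library alone. Note the large assumption: this sketch deliberately omits signature verification, which a real deployment must perform with a proper JWT or cryptography library before trusting any claim. The iss, aud, and exp claim names are standard JWT claims; the function names are hypothetical.

```python
import base64
import json
import time


def decode_claims(token):
    """Decode the payload segment of a compact JWT WITHOUT verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def requester_did(token, service_did, now=None):
    """Return the issuer DID if the audience and expiry checks pass.

    Signature verification is NOT done here and must happen first.
    """
    claims = decode_claims(token)
    now = time.time() if now is None else now
    if claims.get("aud") != service_did:
        raise ValueError("token was issued for a different service")
    if claims.get("exp", 0) <= now:
        raise ValueError("token has expired")
    return claims["iss"]
```

Checking the audience matters because it prevents a token minted for some other service from being replayed against your feed generator.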

Feed Registration

For a feed to appear in Bluesky clients, it must be registered:

1. You create a record on your PDS announcing the feed

2. The record includes the feed's name, description, and the DID of the feed generator service (typically a did:web identifier that resolves to the generator's URL)

3. Bluesky's AppView indexes this record

4. The feed appears in clients for users to subscribe to

This is a one-time setup. After registration, the feed is discoverable.
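For reference, the announcement is a record in the app.bsky.feed.generator collection, and it points at the generator by its service DID. The sketch below builds such a record body and the resulting feed URI; the helper names and example values are hypothetical, and actually writing the record to a PDS is out of scope here.

```python
from datetime import datetime, timezone


def feed_generator_record(service_did, display_name, description):
    """Build the body of an app.bsky.feed.generator record."""
    created_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    return {
        "$type": "app.bsky.feed.generator",
        "did": service_did,  # DID of the service that answers feed requests
        "displayName": display_name,
        "description": description,
        "createdAt": created_at,
    }


def feed_uri(publisher_did, rkey):
    """AT URI that clients use to refer to the published feed."""
    return f"at://{publisher_did}/app.bsky.feed.generator/{rkey}"
```

The record key (rkey) acts as the feed's short name, so one account can publish several feeds served by the same generator.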

Scaling Considerations

Feed generators face interesting scaling challenges:

Write scaling: Consuming the firehose is a write-heavy workload. As the network grows, more posts per second need to be indexed.

Read scaling: Popular feeds get many queries. A feed with 100,000 subscribers might get thousands of queries per second.

Computational scaling: Complex algorithms (machine learning models, large graph traversals) are CPU-intensive.

Strategies:

The User Agent Header

Lycan identifies itself to Jetstream with a user agent string. This is useful for:

Your configuration sets this to "Lycan (@snek.cc)" which clearly identifies your instance.

Allowed Hosts

The allowedHosts setting specifies which domains Lycan should accept requests for. This is a security measure—if someone tries to query your feed generator with a spoofed Host header, it will be rejected unless the host is in the allowed list.

Your configuration includes both "lycan.snek.cc" and "ly.snek.cc" (the short alias).
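Conceptually, the check amounts to comparing the request's Host header against the configured list. The following is a hypothetical sketch of that comparison, not Lycan's actual code.

```python
def is_allowed_host(host_header, allowed_hosts):
    """Check a Host header value against the allowed list, ignoring case and port."""
    if not host_header:
        return False
    hostname = host_header.strip().lower()
    # Drop an optional ":port" suffix (IPv6 literals are not handled here).
    if ":" in hostname:
        hostname = hostname.rsplit(":", 1)[0]
    return hostname in {h.lower() for h in allowed_hosts}
```

With the two hosts above in the list, a request for "ly.snek.cc:443" would be accepted and a request for any other hostname would be rejected.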

Monitoring Feed Health

Key metrics to monitor:

Ingestion lag: How far behind real-time is indexing? Should be near zero.

Query latency: How long do feed API calls take? Should be under 100ms for good user experience.

Database size: How much disk is the index using? Plan for growth.

Error rates: Failed queries, database connection issues, authentication failures.

Feed freshness: How recent are the posts being returned? Stale feeds indicate indexing problems.
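Ingestion lag is straightforward to compute if the consumer records the timestamp of the last event it processed. The sketch below assumes that timestamp is a Jetstream time_us value (Unix microseconds); the 30-second threshold and function names are arbitrary choices for illustration.

```python
import time


def ingestion_lag_seconds(last_time_us, now=None):
    """Seconds between the last processed event's timestamp and now."""
    now = time.time() if now is None else now
    # Clamp at zero in case of small clock differences between hosts.
    return max(0.0, now - last_time_us / 1_000_000)


def is_healthy(last_time_us, max_lag_seconds=30.0, now=None):
    """True while the indexer is within the allowed lag."""
    return ingestion_lag_seconds(last_time_us, now) <= max_lag_seconds
```

Exporting this value as a metric and alerting when it keeps growing catches both a stalled consumer and one that cannot keep up with the firehose.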

Integration with NixOS

The NixOS Lycan module:

The service runs as a dedicated user with limited privileges.

When you change Lycan configuration and rebuild:

1. NixOS updates the service definition

2. If database settings changed, those are updated

3. Lycan restarts with new configuration

4. It resumes consuming the firehose from its last cursor position

Common Issues and Debugging

"Feed not updating": Usually means Lycan has fallen behind on the firehose or stopped consuming. Check logs for connection errors.

"Slow feed loading": Database query performance issue. Check indexes, query plans, and database load.

"Empty feed": Either the filters are too restrictive (no posts match) or indexing has stopped (check firehose connection).

"Authentication errors": JWT validation failing. Check that Lycan can resolve the requesting user's DID (for example, that it can reach the PLC directory) to fetch the signing key.

"Database connection pool exhausted": Too many concurrent requests or connections not being released. Check connection pool size and for connection leaks.

Feed Generator as a Service

Running a feed generator is providing a service to the Bluesky ecosystem:

But it also comes with responsibilities:

Lycan makes it relatively easy to run a feed generator, but the algorithm design and ongoing maintenance require thought and attention.