Bluesky (AT Protocol): System Design

Requirements & API: Bluesky (AT Protocol)

What an interviewer expects you to nail down before drawing a single box.

Functional

• Let a user post, follow, and like, writing those records to their own Personal Data Server, signed with their keys.
• Aggregate every PDS's event stream into a single public firehose any consumer can subscribe to.
• Build per-user timelines and pluggable custom feeds from the firehose (third parties can run their own).
• Resolve portable identities (DID → current PDS) so following survives a PDS migration.

Non-functional

• Federation: no single central data store. Data lives in independently operated PDSes that interoperate through the AT Protocol.
• Account portability: a user can move to another PDS and keep their handle and follow graph (following is by stable DID, not by server).
• Open and pluggable: anyone can build a client, relay, AppView, or feed generator against the same protocol.
• Fast reads via precomputed timeline indexes (feed reads under 100 ms) despite the source of truth being distributed.

API contract

com.atproto.repo.createRecord { repo, collection, record } → { uri, cid }

Writes a signed record (post, like, follow) to the user's own PDS.
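
As a rough illustration of the request shape, the sketch below builds the JSON body for a post. The DID `did:plc:example123` and the helper name `build_create_record_request` are made up for this example; the `app.bsky.feed.post` collection with `text` and `createdAt` fields follows the public Bluesky lexicon. It only constructs the payload and does not authenticate or send anything.

```python
from datetime import datetime, timezone


def build_create_record_request(did: str, text: str) -> dict:
    """Build a request body for com.atproto.repo.createRecord (post record).

    Hypothetical helper: it only assembles the payload; an authenticated
    client would POST this body to the user's own PDS.
    """
    created_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    return {
        "repo": did,                         # whose repository to write to
        "collection": "app.bsky.feed.post",  # record type (lexicon NSID)
        "record": {
            "$type": "app.bsky.feed.post",
            "text": text,
            "createdAt": created_at,
        },
    }


# Placeholder DID, for illustration only.
body = build_create_record_request("did:plc:example123", "hello world")
```

A like or a follow would use the same call with a different `collection` and record body, which is why the API takes three generic fields rather than one endpoint per action.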

com.atproto.sync.subscribeRepos (WS) → stream of repo events

The firehose: every public event across all PDSes, consumed by AppViews/feeds.

app.bsky.feed.getTimeline { cursor?, limit? } → { feed[], cursor? }

Served by an AppView from its precomputed per-user index; content hydrated from PDSes.
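
To make the cursor contract concrete, here is a minimal sketch of paging over a precomputed index. Everything in it is an assumption for illustration: the in-memory `TIMELINE_INDEX`, the DIDs, and especially the use of a numeric offset as the cursor. A real AppView would treat the cursor as an opaque token, and an offset would shift as new posts arrive.

```python
from typing import Optional

# Hypothetical precomputed index: viewer DID -> post URIs, newest first.
TIMELINE_INDEX = {
    "did:plc:alice": [
        f"at://did:plc:bob/app.bsky.feed.post/{i}" for i in range(5, 0, -1)
    ],
}


def get_timeline(viewer: str, limit: int = 2, cursor: Optional[str] = None) -> dict:
    """Return one page of the viewer's timeline plus a cursor for the next page."""
    entries = TIMELINE_INDEX.get(viewer, [])
    start = int(cursor) if cursor else 0      # assumption: cursor is an offset
    page = entries[start:start + limit]
    next_start = start + len(page)
    result = {"feed": page}
    if next_start < len(entries):             # omit cursor on the last page
        result["cursor"] = str(next_start)
    return result


first = get_timeline("did:plc:alice")
second = get_timeline("did:plc:alice", cursor=first["cursor"])
third = get_timeline("did:plc:alice", cursor=second["cursor"])
```

The client keeps calling with the returned cursor until a response arrives without one.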

com.atproto.identity.resolveHandle { handle } → { did }

Handle → DID; the DID then resolves to the user's current PDS endpoint.
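
The two-step lookup can be sketched as below. The handle, DID, and endpoint are placeholders, and the dictionaries stand in for real handle resolution and DID document fetches. The `#atproto_pds` service entry of type `AtprotoPersonalDataServer` is how atproto DID documents advertise the PDS.

```python
# Hypothetical stand-ins for handle resolution and DID document lookup.
HANDLE_TO_DID = {"alice.example.com": "did:plc:alice"}
DID_DOCUMENTS = {
    "did:plc:alice": {
        "id": "did:plc:alice",
        "alsoKnownAs": ["at://alice.example.com"],
        "service": [
            {
                "id": "#atproto_pds",
                "type": "AtprotoPersonalDataServer",
                "serviceEndpoint": "https://pds.example.com",
            }
        ],
    }
}


def resolve_handle(handle: str) -> str:
    """Step 1: handle -> DID (handles are case-insensitive)."""
    return HANDLE_TO_DID[handle.lower()]


def pds_endpoint(did_doc: dict) -> str:
    """Step 2: pick the PDS service entry out of a DID document."""
    for service in did_doc.get("service", []):
        if (service.get("id", "").endswith("#atproto_pds")
                and service.get("type") == "AtprotoPersonalDataServer"):
            return service["serviceEndpoint"]
    raise LookupError("DID document has no PDS service entry")


did = resolve_handle("Alice.Example.com")
endpoint = pds_endpoint(DID_DOCUMENTS[did])
```

A careful client would also confirm that the DID document lists the handle under `alsoKnownAs`, so that neither side can claim the other unilaterally.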

About Bluesky (AT Protocol)

Bluesky looks and scrolls just like Twitter, but underneath, your posts don't live in one company's database. They live in your own Personal Data Server, signed with your keys. The point is portability: you can move to a different server later and keep your handle and all your followers. The hard part is making a normal-feeling social network on top of storage that is spread across many independent servers.

Here is how it holds together. When you post, the record is written to your own PDS, which emits that event on its own stream. A Relay subscribes to every PDS in the network and merges those streams into one public firehose of 'every event that just happened.' Read-side services called AppViews subscribe to that firehose, build a precomputed timeline index for each user, and answer your client's request for a home feed in under 100 ms.
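
The write path described above can be modeled as a toy in-process pipeline. This is a sketch under heavy assumptions: the `Relay` and `AppView` classes, the flat event dictionaries, and the follower map are invented for illustration, while the real firehose carries binary-encoded repo commits over a WebSocket.

```python
from collections import defaultdict


class Relay:
    """Merges events from many PDSes into one sequenced stream."""

    def __init__(self):
        self.subscribers = []
        self.seq = 0

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def ingest(self, event: dict):
        # Stamp a global sequence number, then fan out to every consumer.
        self.seq += 1
        stamped = {**event, "seq": self.seq}
        for callback in self.subscribers:
            callback(stamped)


class AppView:
    """Consumes the firehose and maintains per-user timeline indexes."""

    def __init__(self, followers: dict):
        self.followers = followers            # author DID -> follower DIDs
        self.timelines = defaultdict(list)    # viewer DID -> post URIs

    def on_event(self, event: dict):
        if event["collection"] != "app.bsky.feed.post":
            return                            # likes and follows skip the timeline
        for follower in self.followers.get(event["did"], ()):
            self.timelines[follower].insert(0, event["uri"])  # newest first


relay = Relay()
appview = AppView({"did:plc:bob": {"did:plc:alice"}})
relay.subscribe(appview.on_event)

# Bob's PDS emits a post and then a like; only the post reaches Alice's timeline.
relay.ingest({"did": "did:plc:bob", "collection": "app.bsky.feed.post",
              "uri": "at://did:plc:bob/app.bsky.feed.post/1"})
relay.ingest({"did": "did:plc:bob", "collection": "app.bsky.feed.like",
              "uri": "at://did:plc:bob/app.bsky.feed.like/1"})
```

Because the timeline is written at ingest time, serving a home feed becomes a lookup in `timelines` rather than a query across servers.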

The reason the Relay exists is worth a simple picture. Without it, every feed builder and search index would have to keep a live connection to every single PDS. With N PDSes and M consumers, that is a tangled mess of N × M connections. The Relay is like a newswire: thousands of reporters file to one wire service, and any newspaper subscribes to that single wire instead of calling every reporter. It turns an N × M problem into N + M.
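
A quick back-of-the-envelope calculation shows the difference. The counts below (1,000 PDSes, 20 consumers) are illustrative numbers, not measurements of the real network.

```python
def direct_connections(pds_count: int, consumer_count: int) -> int:
    """Every consumer connects to every PDS."""
    return pds_count * consumer_count


def relayed_connections(pds_count: int, consumer_count: int) -> int:
    """Every PDS and every consumer holds one connection to the Relay."""
    return pds_count + consumer_count


# Illustrative sizes only.
without_relay = direct_connections(1000, 20)   # 20000
with_relay = relayed_connections(1000, 20)     # 1020
```

Adding one more consumer costs 1,000 new connections in the first model and a single connection in the second.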

There is one more layer of indirection that makes portability work. You follow people by their DID, a stable decentralized identifier, not by their server address. A DID Resolver maps that DID to whichever PDS currently hosts them, so when someone migrates servers, your follow doesn't break. The cost of all this decentralization shows up at read time: hydrating a timeline can mean fetching post content back from many different PDSes, which AppView caches paper over until the cache goes cold.
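
The effect of that indirection on a migration can be sketched in a few lines. The `DidResolver` class, DIDs, and endpoints are hypothetical stand-ins; the point is only that the follow graph stores DIDs, so moving servers changes the resolver entry and nothing else.

```python
class DidResolver:
    """Toy in-memory resolver: DID -> current PDS endpoint."""

    def __init__(self):
        self._pds_by_did = {}

    def set_pds(self, did: str, endpoint: str):
        self._pds_by_did[did] = endpoint

    def resolve(self, did: str) -> str:
        return self._pds_by_did[did]


resolver = DidResolver()
resolver.set_pds("did:plc:bob", "https://old-pds.example")

# The follow graph is keyed by DID, never by server address.
follows = {"did:plc:alice": {"did:plc:bob"}}


def pds_hosts_for(follower: str) -> dict:
    """Resolve where each followed account currently lives."""
    return {did: resolver.resolve(did) for did in follows[follower]}


before = pds_hosts_for("did:plc:alice")

# Bob migrates: only the resolver entry changes; Alice's follow is untouched.
resolver.set_pds("did:plc:bob", "https://new-pds.example")
after = pds_hosts_for("did:plc:alice")
```

Had the follow been stored as a server address, every follower's record would need rewriting on each migration.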

This system teaches federation versus centralization, the PDS-Relay-AppView split, why a fan-out reducer like the Relay is necessary, identity indirection through stable DIDs for account portability, and the fundamental tradeoff of paying read-time fan-out to buy write-time data ownership.