Requirements & API: Instagram Feed

What an interviewer expects you to nail down before drawing a single box.

Functional

  • Let a user post photos/videos with captions; media stored durably and served fast.
  • Render a personalized, ranked home feed of posts from accounts a user follows.
  • Support infinite scroll with stable pagination as new posts arrive.
  • Propagate a new post to followers' feeds within seconds (not necessarily instantly).
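
One way to keep pagination stable while new posts arrive is to have the cursor name the last item the client saw rather than a page offset. The sketch below assumes a purely chronological timeline of (timestamp, post_id) pairs, which is simpler than a ranked feed; the helper names and the base64-JSON cursor encoding are illustrative choices, not part of the original contract.

```python
import base64
import json
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (timestamp_ms, post_id); timelines are sorted newest first


def encode_cursor(entry: Entry) -> str:
    """Pack the last-seen entry into an opaque, URL-safe string."""
    raw = json.dumps(list(entry)).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> Entry:
    ts, post_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return (ts, post_id)


def page(timeline: List[Entry], cursor: Optional[str], limit: int = 20):
    """Return (entries, next_cursor) for one page of a newest-first timeline."""
    if cursor is None:
        older = timeline
    else:
        last_seen = decode_cursor(cursor)
        # Keep only entries strictly older than the last one the client saw,
        # so posts that arrive later never shift or duplicate a page.
        older = [entry for entry in timeline if entry < last_seen]
    items = older[:limit]
    next_cursor = encode_cursor(items[-1]) if len(older) > limit else None
    return items, next_cursor
```

Because the cursor is a position in time and not an index, a post inserted at the top of the timeline between two requests does not push an already-seen post onto the next page.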

Non-functional

  • Reads dominate writes ~100:1. Opening the feed must be a near-instant cache read.
  • Feed open p99 in the low hundreds of ms; media first-byte sub-100ms worldwide via CDN.
  • Eventual consistency is fine: a post appearing in followers' feeds a few seconds late is acceptable.
  • Must absorb the celebrity fan-out problem: one post can target 100M+ follower timelines.
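
A quick back-of-envelope check shows why that last point matters. The write throughput below is an assumed figure picked for illustration, not a measured one, and the function name is made up for this sketch.

```python
def fanout_seconds(follower_count: int, timeline_writes_per_second: int) -> float:
    """Time to push one post into every follower's timeline at a given rate."""
    return follower_count / timeline_writes_per_second


# Hypothetical cluster-wide timeline write throughput (an assumption).
ASSUMED_WRITES_PER_SECOND = 1_000_000

typical = fanout_seconds(500, ASSUMED_WRITES_PER_SECOND)
celebrity = fanout_seconds(100_000_000, ASSUMED_WRITES_PER_SECOND)

print(f"typical account:   {typical:.4f} s")
print(f"celebrity account: {celebrity:.0f} s")
```

Under that assumption, a single celebrity post would consume 100 seconds of the entire cluster's write capacity, which breaks the "within seconds" target and starves everyone else's fan-out while it runs.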

API contract

GET /api/v1/feed?cursor={cursor}&limit=20 → { posts[], next_cursor }
The hot path. Post IDs come from the precomputed Redis timeline; post metadata is hydrated from Postgres.
POST /api/v1/posts { media_upload_id, caption } → { post_id }
Returns immediately; fan-out happens async via Kafka.
POST /api/v1/media { content_type } → { upload_url, media_upload_id }
Pre-signed S3 upload; bytes never transit the API.
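
The write side of this contract can be sketched in a few lines. Everything here is a stand-in: dictionaries play the role of the database, a queue.Queue plays the role of the Kafka topic, the upload URL is a placeholder (a real service would ask the object store to sign one), and author_id is assumed to come from the authentication layer, which the contract above does not show.

```python
import itertools
import queue
import uuid
from typing import Dict

UPLOADS: Dict[str, str] = {}   # media_upload_id -> content_type
POSTS: Dict[int, dict] = {}    # post_id -> lightweight post metadata
FANOUT_QUEUE = queue.Queue()   # stand-in for a Kafka topic
_post_ids = itertools.count(1)


def create_media_upload(content_type: str) -> dict:
    """POST /api/v1/media: hand back an upload target, never accept the bytes."""
    media_upload_id = uuid.uuid4().hex
    UPLOADS[media_upload_id] = content_type
    # Placeholder URL only; the client would upload directly to object storage.
    upload_url = f"https://uploads.example.invalid/{media_upload_id}"
    return {"upload_url": upload_url, "media_upload_id": media_upload_id}


def create_post(author_id: int, media_upload_id: str, caption: str) -> dict:
    """POST /api/v1/posts: store metadata, enqueue fan-out, return right away."""
    if media_upload_id not in UPLOADS:
        raise ValueError("unknown media_upload_id")
    post_id = next(_post_ids)
    POSTS[post_id] = {
        "post_id": post_id,
        "author_id": author_id,
        "media_upload_id": media_upload_id,
        "caption": caption,
    }
    # The request does not wait for followers' timelines to be updated;
    # a separate worker consumes this event and does the fan-out.
    FANOUT_QUEUE.put({"post_id": post_id, "author_id": author_id})
    return {"post_id": post_id}
```

The point of the split is that the request handler only does constant work per post, no matter how many followers the author has.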

About Instagram Feed

Open Instagram and your feed is just there, instantly, even though it is built from posts by everyone you follow. How it gets there is the classic feed design question, and it comes down to one fork: do you build the feed when someone opens the app (fan-out on read), or do you build it ahead of time when posts are made (fan-out on write)?

Instagram mostly does the second. When you post, the system writes your post ID into the precomputed timeline of each of your followers, kept in a Redis cache. Opening the feed is then just a fast cache read. This trade makes sense because people open their feed far more often than they post. You do the expensive work on the rare event so the common event stays cheap.
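
Fan-out on write can be sketched with in-memory structures standing in for the cache and the database. The timeline cap of 500 entries and all names below are assumptions made for illustration; in Redis the push-and-trim step would look roughly like an LPUSH followed by an LTRIM on a per-user list.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set

TIMELINE_CAP = 500  # assumed maximum number of post IDs kept per user

FOLLOWERS: Dict[int, Set[int]] = defaultdict(set)  # author_id -> follower IDs
TIMELINES: Dict[int, Deque[int]] = defaultdict(
    lambda: deque(maxlen=TIMELINE_CAP)             # newest post ID on the left
)
POSTS: Dict[int, dict] = {}                        # post_id -> post metadata


def fan_out_on_write(author_id: int, post_id: int) -> int:
    """Push a new post ID into every follower's timeline; return write count."""
    followers = FOLLOWERS[author_id]
    for follower_id in followers:
        # The bounded deque silently drops the oldest ID once the cap is hit.
        TIMELINES[follower_id].appendleft(post_id)
    return len(followers)


def read_feed(user_id: int, limit: int = 20) -> List[dict]:
    """Read precomputed IDs, then hydrate them into post metadata."""
    post_ids = list(TIMELINES[user_id])[:limit]
    # Skip IDs whose post no longer exists (for example, deleted after fan-out).
    return [POSTS[pid] for pid in post_ids if pid in POSTS]
```

The cost asymmetry is visible in the code: fan_out_on_write loops over every follower, while read_feed touches one list and a handful of lookups.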

There is one famous catch: the celebrity problem. A user with 100M followers would trigger 100M cache writes for a single post. So real systems go hybrid: fan out on write for normal accounts, and fan out on read for the handful of accounts with enormous followings. Posts from those large accounts are not pushed; they are pulled in and merged with the precomputed timeline when a follower opens the app.
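
A minimal sketch of the hybrid approach, assuming a fixed follower-count threshold decides who counts as a celebrity and that the feed is ordered purely by timestamp. The threshold value and every name below are hypothetical; real systems would tune the cutoff and apply ranking on top.

```python
import heapq
from typing import Dict, List, Set, Tuple

CELEBRITY_THRESHOLD = 1_000_000  # assumed cutoff, not a known production value

Entry = Tuple[int, int]  # (timestamp, post_id)

FOLLOWER_COUNT: Dict[int, int] = {}        # author_id -> number of followers
FOLLOWERS: Dict[int, Set[int]] = {}        # author_id -> follower IDs
FOLLOWING: Dict[int, Set[int]] = {}        # user_id -> accounts they follow
TIMELINES: Dict[int, List[Entry]] = {}     # user_id -> entries pushed on write
AUTHOR_POSTS: Dict[int, List[Entry]] = {}  # author_id -> that author's own posts


def is_celebrity(author_id: int) -> bool:
    return FOLLOWER_COUNT.get(author_id, 0) >= CELEBRITY_THRESHOLD


def publish(author_id: int, timestamp: int, post_id: int) -> int:
    """Record the post; push to followers only for normal accounts."""
    AUTHOR_POSTS.setdefault(author_id, []).append((timestamp, post_id))
    if is_celebrity(author_id):
        return 0  # no fan-out; followers pull this post at read time
    followers = FOLLOWERS.get(author_id, set())
    for follower_id in followers:
        TIMELINES.setdefault(follower_id, []).append((timestamp, post_id))
    return len(followers)


def read_feed(user_id: int, limit: int = 20) -> List[int]:
    """Merge the pushed timeline with posts pulled from followed celebrities."""
    candidates = list(TIMELINES.get(user_id, []))
    for author_id in FOLLOWING.get(user_id, set()):
        if is_celebrity(author_id):
            candidates.extend(AUTHOR_POSTS.get(author_id, []))
    newest_first = heapq.nlargest(limit, candidates)
    return [post_id for _, post_id in newest_first]
```

The read path gets slightly more expensive, but only by the number of celebrity accounts a user follows, which is small compared with the 100M writes it avoids.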

The heavy media (photos and video) does not live in the timeline at all. It sits in object storage like S3 and is delivered through a CDN, while a database holds the lightweight post metadata. This design problem covers fan-out strategies, caching, CDN usage, and the read/write ratio reasoning that sits underneath every feed at scale.