What an interviewer expects you to nail down before drawing a single box.
GET /api/v1/foryou?cursor={cursor}&limit=10 → { videos[], next_cursor }
POST /api/v1/videos { upload_id, caption } → { video_id, status: 'processing' }
POST /api/v1/events { video_id, type, watch_ms } → 202

Open TikTok and the very first video already feels chosen for you, and so does the next one, and the one after that. There is no follow graph deciding what you see. A ranking model picks each clip from the entire catalog based on what you have watched, liked, and skipped. The surprising part is that almost none of that machine learning happens while you scroll. The scroll itself is a fast lookup, and that split is what makes the feed feel instant.
Here is the whole thing in plain steps. When you open the app, the Feed Service grabs a list of video ids that were already picked for you and stored in Cassandra, then fills in the metadata and hands back a batch. Your client immediately starts pulling those video bytes from a CDN, and it prefetches the next two or three clips so the next swipe never shows a spinner. Meanwhile every view, like, and watch-time signal you produce gets fired off to Kafka.
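The read path described above can be sketched in a few lines. This is a minimal illustration, not the real service: the dictionaries `CANDIDATES` and `METADATA` are hypothetical in-memory stand-ins for the Cassandra candidate store and the metadata store, and the cursor is assumed to be a plain offset into the precomputed list.

```python
# Hypothetical stand-ins: precomputed candidate ids per user (Cassandra in the
# text) and per-video metadata. Real stores would be remote lookups.
CANDIDATES = {"user_1": [f"v{i}" for i in range(25)]}
METADATA = {f"v{i}": {"video_id": f"v{i}", "caption": f"clip {i}"} for i in range(25)}


def get_for_you(user_id, cursor=None, limit=10):
    """Serve one page of the feed from the precomputed candidate list."""
    ids = CANDIDATES.get(user_id, [])
    # Assumption: the cursor is just an offset encoded as a string.
    start = int(cursor) if cursor else 0
    page = ids[start:start + limit]
    # Hydrate ids into metadata; no ranking happens on this path.
    videos = [METADATA[v] for v in page if v in METADATA]
    end = start + len(page)
    next_cursor = str(end) if end < len(ids) else None
    return {"videos": videos, "next_cursor": next_cursor}
```

Calling `get_for_you("user_1")` returns the first ten clips and a `next_cursor` the client passes back on the next request. Nothing in this function touches a model, which is why the path can stay fast.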
The heavy work runs on its own clock. A Ranking Worker wakes up every few minutes, reads your recent activity off the Kafka stream, runs the ML model, and writes a fresh candidate list back to the store. By the time you scroll again, new picks are waiting. This is the same idea as a kitchen that preps ingredients before the dinner rush instead of starting from raw vegetables when each order arrives, so the line stays fast even when it is busy.
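One pass of the offline worker might look roughly like the sketch below. The scoring here is a toy topic-affinity heuristic standing in for the ML model, which the text does not describe; the `catalog` mapping, the like bonus of 5.0, and the function name `rank_for_user` are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_for_user(events, catalog, top_k=3):
    """Toy ranking pass: score unseen videos by the user's topic affinity.

    events:  list of {video_id, type, watch_ms} dicts, as posted to /events
    catalog: hypothetical mapping of video_id -> topic
    """
    affinity = defaultdict(float)
    seen = set()
    for e in events:
        topic = catalog[e["video_id"]]
        seen.add(e["video_id"])
        weight = e["watch_ms"] / 1000.0  # seconds watched
        if e["type"] == "like":
            weight += 5.0  # assumed bonus for an explicit like
        affinity[topic] += weight
    # Only recommend clips the user has not already watched.
    unseen = [v for v in catalog if v not in seen]
    # Highest affinity first; video id breaks ties deterministically.
    unseen.sort(key=lambda v: (-affinity[catalog[v]], v))
    return unseen[:top_k]
```

In a real deployment the worker would write the returned list back to the candidate store on its schedule, so the serving path only ever reads a finished result.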
Uploads follow the same decoupling. When a creator posts, the Upload Service stores the raw file, drops a job on Kafka, and returns right away, so the upload feels instant even though a Transcode Worker spends minutes turning the video into about five bitrate variants for adaptive playback. The lesson TikTok teaches is the split between heavy offline ranking and a light online serving path, plus how a single event log can feed both a recommendation pipeline and a transcoding pipeline at billions of events a day.
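The upload decoupling can be sketched the same way. In this illustration a `queue.Queue` stands in for the Kafka topic and a dictionary for the video table; the bitrate ladder is an assumed example, since the text only says there are about five variants, and no real transcoding is performed.

```python
import queue
import uuid

jobs = queue.Queue()  # stand-in for the Kafka topic carrying transcode jobs
videos = {}           # stand-in for the video metadata table

# Assumed ladder for illustration; the text only says "about five" variants.
BITRATES_KBPS = [400, 800, 1500, 3000, 6000]


def post_video(upload_id, caption):
    """Handle POST /api/v1/videos: record the video, enqueue a job, return."""
    video_id = uuid.uuid4().hex
    videos[video_id] = {
        "upload_id": upload_id,
        "caption": caption,
        "status": "processing",
        "variants": [],
    }
    jobs.put(video_id)
    return {"video_id": video_id, "status": "processing"}


def transcode_worker_step():
    """Process one queued job, or return None if the queue is empty."""
    try:
        video_id = jobs.get_nowait()
    except queue.Empty:
        return None
    # Placeholder for the minutes of real transcoding work.
    videos[video_id]["variants"] = [f"{video_id}_{b}k" for b in BITRATES_KBPS]
    videos[video_id]["status"] = "ready"
    return video_id
```

The creator gets a response as soon as `post_video` returns, while the status flips to `ready` only after a worker has drained the job, which is the same split between a light request path and heavy background work that the feed uses.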