What an interviewer expects you to nail down before drawing a single box.
GET /api/v1/home → { rows: [{ title, items[] }] }
POST /api/v1/playback { title_id, device_profile } → { manifest_url, token, expires_in }
GET {cdn_edge}/chunk/{id} → video bytes
When you press play on Netflix, the video has to start in a second or two and never stall, even though hundreds of millions of people are watching at the same moment. That one requirement, bandwidth, drives the entire design. With ~200M concurrent viewers each pulling several Mbps, total outbound traffic reaches the petabit-per-second range, far more than any single origin could ever serve.
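The bandwidth claim is easy to check with back-of-the-envelope arithmetic. The sketch below assumes an average of 5 Mbps per viewer (one reading of "several Mbps") and a hypothetical 100 Gbps of egress per server; both figures are illustrative, not published Netflix numbers.

```python
import math

CONCURRENT_VIEWERS = 200_000_000   # ~200M, the figure used in the text
AVG_BITRATE_MBPS = 5               # assumption: one reading of "several Mbps"
SERVER_EGRESS_GBPS = 100           # assumption: hypothetical per-server capacity


def total_egress_bps(viewers: int, mbps_per_viewer: float) -> float:
    """Aggregate outbound traffic in bits per second."""
    return viewers * mbps_per_viewer * 1_000_000


def servers_needed(total_bps: float, server_gbps: float) -> int:
    """Minimum number of servers if each one were fully saturated."""
    return math.ceil(total_bps / (server_gbps * 1e9))


total = total_egress_bps(CONCURRENT_VIEWERS, AVG_BITRATE_MBPS)
print(f"{total / 1e15:.1f} Pbps")                                # 1.0 Pbps
print(servers_needed(total, SERVER_EGRESS_GBPS), "servers")      # 10000 servers
```

Even under these generous per-server assumptions, the load only makes sense when spread across thousands of machines, which is why the bytes cannot all come from one place.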
The answer is to push the video as close to viewers as possible with a CDN. Netflix took this to the extreme and built its own, called Open Connect: caching boxes installed directly inside ISP networks, so most bytes come from a server a few miles from your house instead of a distant data center.
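To make "serve from the closest box" concrete, here is a minimal sketch of edge selection: pick the nearest healthy cache that already holds the title, and fall back to a more distant origin otherwise. The `CacheServer` type, its fields, and `pick_server` are hypothetical names for illustration; real steering would likely weigh network topology and load rather than raw distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheServer:
    name: str
    distance_km: float      # stand-in for network proximity (assumption)
    healthy: bool
    titles: frozenset       # title IDs currently stored on this box


def pick_server(title_id: str, candidates: list, origin: CacheServer) -> CacheServer:
    """Return the nearest healthy cache holding the title, else the origin."""
    eligible = [s for s in candidates if s.healthy and title_id in s.titles]
    return min(eligible, key=lambda s: s.distance_km, default=origin)


origin = CacheServer("origin", 3000.0, True, frozenset({"t1", "t2"}))
isp_box = CacheServer("isp-box", 8.0, True, frozenset({"t1"}))
regional = CacheServer("regional", 400.0, True, frozenset({"t1", "t2"}))

print(pick_server("t1", [isp_box, regional], origin).name)  # isp-box
print(pick_server("t2", [isp_box, regional], origin).name)  # regional
```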
The video itself is prepared ahead of time. Each title is transcoded once into a ladder of bitrates, from low quality up to 4K. While you watch, the player uses adaptive bitrate streaming to switch up or down depending on your network, the way a car shifts gears, so playback stays smooth instead of freezing.
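The gear-shifting behavior can be sketched as a simple throughput-based rule: choose the highest rung of the ladder that fits within a safety margin of the measured bandwidth. The ladder values and the 0.8 safety factor below are illustrative assumptions, and production players typically also factor in buffer level, which this sketch omits.

```python
# Illustrative bitrate ladder in kbps, lowest to highest (assumed values).
LADDER_KBPS = [235, 750, 1750, 3000, 5800, 16000]


def choose_bitrate(throughput_kbps: float,
                   ladder=LADDER_KBPS,
                   safety: float = 0.8) -> int:
    """Pick the highest rung that fits within a fraction of measured throughput.

    Falls back to the lowest rung when nothing fits, so playback continues
    at reduced quality instead of stalling.
    """
    budget = throughput_kbps * safety
    affordable = [b for b in ladder if b <= budget]
    return max(affordable) if affordable else min(ladder)


print(choose_bitrate(4000))    # 3000: the network can sustain a mid rung
print(choose_bitrate(100))     # 235: drop to the lowest rung
print(choose_bitrate(25000))   # 16000: enough headroom for the top rung
```

The player would re-run a rule like this as it fetches each chunk, so quality follows the network up and down.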
It helps to split the system in two. The control plane (browse, search, recommendations, playback authorization) is tiny in traffic volume compared to the data plane (the actual video bytes). Recommendations also shape what most people will watch, which makes demand predictable and lets Netflix pre-load popular titles into each region's caches before anyone asks for them. This system teaches CDN strategy, the build-vs-buy economics of running your own CDN, adaptive bitrate streaming, transcoding pipelines, and the control-plane/data-plane split.
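The cache pre-loading idea can be sketched as a greedy fill: rank titles by predicted demand for a region and copy them into a cache until its disk is full. The function name, the inputs, and the greedy policy are assumptions made for illustration, not a description of Netflix's actual placement algorithm.

```python
def plan_prefill(predicted_views: dict[str, int],
                 size_gb: dict[str, float],
                 capacity_gb: float) -> list[str]:
    """Greedily choose the most-demanded titles that still fit in the cache."""
    chosen: list[str] = []
    used = 0.0
    # Visit titles from most to least predicted views.
    for title in sorted(predicted_views, key=predicted_views.get, reverse=True):
        if used + size_gb[title] <= capacity_gb:
            chosen.append(title)
            used += size_gb[title]
    return chosen


views = {"drama": 900, "docu": 500, "kids": 100}   # hypothetical regional forecast
sizes = {"drama": 40.0, "docu": 70.0, "kids": 30.0}

print(plan_prefill(views, sizes, capacity_gb=80.0))   # ['drama', 'kids']
```

Here "docu" is skipped because it no longer fits once "drama" is placed, while the smaller "kids" title still does.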