Requirements & API: WhatsApp

What an interviewer expects you to nail down before drawing a single box.

Functional

  • Deliver a 1:1 message to an online recipient in real time over their persistent connection.
  • Reliably deliver to offline recipients: queue, wake the device via push, let it pull on reconnect.
  • Persist every message durably before attempting delivery. A crash must never lose a message.
  • Track presence (online/last-seen) and delivery/read receipts.

Non-functional

  • Sub-second delivery to online users; reliable (not instant) delivery to offline ones.
  • Durability before delivery: the Cassandra write must commit before any delivery is attempted, whether a WebSocket push or a push notification.
  • Hold billions of long-lived connections across the gateway fleet, at roughly 1M WebSockets per tuned gateway box.
  • Per-conversation message ordering must be preserved; losing a gateway box can't drop messages.
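
A quick back-of-envelope check on the connection numbers above. This is a minimal sketch: the 2 billion concurrent connections and the 50% headroom target are illustrative assumptions, and gateways_needed is a hypothetical helper. Only the roughly 1M connections per box comes from the list.

```python
import math


def gateways_needed(concurrent_connections: int,
                    conns_per_box: int,
                    target_utilization: float) -> int:
    """Return how many gateway boxes are needed to hold the connections.

    target_utilization is the fraction of each box's capacity we plan to
    use, leaving the rest as headroom for failover and reconnect storms.
    """
    usable_per_box = conns_per_box * target_utilization
    return math.ceil(concurrent_connections / usable_per_box)


# Assumed inputs: 2B concurrent connections, 1M per box, run boxes half full.
print(gateways_needed(2_000_000_000, 1_000_000, 0.5))  # prints 4000
```

Running boxes half full is one way to make sure that losing a gateway does not overload its neighbours when clients reconnect; the right fraction depends on how failures are expected to cluster.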

API contract

WS send: { conversation_id, client_msg_id, body } → ack { server_msg_id, ts }
  client_msg_id makes retries idempotent; the ack confirms durable persistence, not delivery.

WS recv: server pushes { conversation_id, server_msg_id, sender, body, ts }
  Delivered over the recipient's live WebSocket when presence shows them online.

GET /api/v1/messages?since={cursor} → { messages[], next_cursor }
  Offline catch-up: the device pulls pending messages after a push wakes it.
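
To make the idempotency point concrete, here is a minimal sketch of the send path, assuming an in-memory stand-in for the durable store. ChatStore, Ack, and the choice to key deduplication on (sender, client_msg_id) are assumptions for illustration; the contract above only says that client_msg_id makes retries idempotent.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Ack:
    server_msg_id: int
    ts: float


@dataclass
class ChatStore:
    """In-memory stand-in for the durable message store (hypothetical)."""
    next_id: int = 1
    acks_by_client_id: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def send(self, sender: str, conversation_id: str,
             client_msg_id: str, body: str) -> Ack:
        # sender is assumed to come from the authenticated connection.
        key = (sender, client_msg_id)
        if key in self.acks_by_client_id:
            # Retry of a message we already persisted: return the same ack
            # instead of writing a duplicate.
            return self.acks_by_client_id[key]
        ack = Ack(server_msg_id=self.next_id, ts=time.time())
        self.next_id += 1
        self.log.append((conversation_id, ack.server_msg_id, sender, body))
        self.acks_by_client_id[key] = ack
        return ack
```

A client that times out waiting for the ack can resend with the same client_msg_id and will get back the original server_msg_id, so the recipient never sees the message twice.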

About WhatsApp

Send a WhatsApp message and it lands on your friend's phone in well under a second, or waits patiently if their phone is off. Doing that for billions of accounts is the core real-time messaging problem.

Unlike a normal website that answers a request and then hangs up, messaging needs the connection to stay open. Each online phone holds a long-lived WebSocket to a gateway server, and a single well-tuned Linux box can keep about a million of these open at once.

Here is the path of a message. The chat service writes it to durable storage first, usually Cassandra, before it even tries to deliver it. Durability comes before delivery, so a crash can never lose a message the server has already acknowledged. Then it asks a presence service (Redis) which gateway your friend is connected to and pushes the message down that WebSocket.
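
The ordering in that path (persist, then look up presence, then push) can be sketched as below. This is a toy model under stated assumptions: MessageRouter is a hypothetical name, a Python list stands in for Cassandra, a dict stands in for the Redis presence map, and each gateway is just a callable.

```python
class MessageRouter:
    """Toy chat service: persist first, then route by presence."""

    def __init__(self):
        self.durable_log = []   # stand-in for the durable store
        self.presence = {}      # user -> gateway_id (stand-in for Redis)
        self.gateways = {}      # gateway_id -> callable(user, message)

    def handle_send(self, recipient: str, message: str) -> str:
        # 1. Durability before delivery: write the message before anything else.
        self.durable_log.append((recipient, message))

        # 2. Ask presence which gateway holds the recipient's connection.
        gateway_id = self.presence.get(recipient)
        if gateway_id is None:
            # Recipient is offline; the message is already safe in the log.
            return "stored_for_offline"

        # 3. Push down the recipient's connection via that gateway.
        self.gateways[gateway_id](recipient, message)
        return "pushed"
```

Because step 1 always runs first, a failure in step 2 or 3 leaves the message in the log, where a later retry or an offline catch-up can still find it.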

What if they are offline? The message goes onto a queue (Kafka), and a push notification through APNs or FCM nudges their phone. When the phone wakes and reconnects, it pulls down everything waiting for it.

This system teaches persistent-connection architecture, the gateway tier, durable message logs, presence tracking, and offline delivery, which are the building blocks of any real-time system.
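
The offline path described above (queue, nudge, pull on reconnect) can be sketched as follows. Everything here is a hypothetical stand-in: OfflineInbox replaces the Kafka queue with a dict, notify replaces the APNs/FCM call, and the cursor is assumed to be the last server_msg_id the device has seen, with ids increasing in enqueue order.

```python
class OfflineInbox:
    """Toy pending-message queue with cursor-based catch-up."""

    def __init__(self, notify):
        self.pending = {}      # user -> list of (server_msg_id, body)
        self.notify = notify   # stand-in for a push notification call

    def enqueue(self, user: str, server_msg_id: int, body: str) -> None:
        # Queue the message, then nudge the device to wake up.
        self.pending.setdefault(user, []).append((server_msg_id, body))
        self.notify(user)

    def pull(self, user: str, since: int = 0, limit: int = 100) -> dict:
        # Return messages newer than the cursor, one page at a time.
        newer = [m for m in self.pending.get(user, []) if m[0] > since]
        page = newer[:limit]
        next_cursor = page[-1][0] if page else since
        return {"messages": page, "next_cursor": next_cursor}
```

The device keeps calling pull with the returned next_cursor until messages comes back empty, which mirrors the shape of GET /api/v1/messages?since={cursor} without claiming to be its real implementation.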