What an interviewer expects you to nail down before drawing a single box.
POST /v1/notifications { user_id, type, payload, channels? } → { notification_id, status: "queued" }
GET /v1/preferences/{user_id} → { email, sms, push, quiet_hours }
PUT /v1/preferences/{user_id} { email, sms, push, quiet_hours } → 200

When your order ships, a phone buzzes. Maybe an email lands too, but no text, because you turned SMS off months ago. Behind that simple moment is a whole platform whose job is to take one event from some internal service and turn it into the right messages, on the right channels, for the right person. A notification system looks easy until you realize every team in the company wants to send something, and every user wants different rules.
Here is the whole thing in plain steps. An internal producer, say the order service, POSTs a request to the API gateway, which authenticates it and rate-limits per producer. The Notification Service then looks up that user's preferences (email on, push on, SMS off) and creates one job per enabled channel, dropping each into Kafka and returning 200 immediately. Per-channel workers consume their own topics: the email worker calls SendGrid, the SMS worker calls Twilio, the push worker calls APNs or FCM, each retrying with backoff when a provider fails.
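The fan-out step above can be sketched in a few lines. This is only an illustration under stated assumptions: the `PREFERENCES` dict stands in for the preferences service, the `TOPICS` deques stand in for per-channel Kafka topics, and the topic names and the `enqueue_notification` function are hypothetical, not part of the design described here.

```python
import uuid
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins: a real system would read preferences
# from a (cached) preferences service and publish to one Kafka topic per channel.
PREFERENCES = {
    "user-1": {"email": True, "sms": False, "push": True},
}
TOPICS = defaultdict(deque)  # topic name -> queued jobs


def enqueue_notification(user_id, type_, payload, channels=None):
    """Fan one request out into one job per enabled channel, then return."""
    prefs = PREFERENCES.get(user_id, {})
    requested = channels if channels is not None else ["email", "sms", "push"]
    notification_id = str(uuid.uuid4())
    for channel in requested:
        if not prefs.get(channel, False):
            continue  # the user turned this channel off
        TOPICS[f"notifications.{channel}"].append({
            "notification_id": notification_id,
            "user_id": user_id,
            "channel": channel,
            "type": type_,
            "payload": payload,
        })
    # Respond as soon as the jobs are queued; delivery happens asynchronously.
    return {"notification_id": notification_id, "status": "queued"}


result = enqueue_notification("user-1", "order_shipped", {"order_id": "A123"})
queued_topics = sorted(topic for topic, jobs in TOPICS.items() if jobs)
print(result["status"], queued_topics)
```

With SMS turned off for this user, only the email and push topics receive a job, and the caller gets its response without waiting on any provider.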
The reason for separate queues per channel is best seen with a traffic analogy. Imagine email, SMS, and push all sharing one lane on a highway. The moment the SMS provider slows to a crawl, every email and push behind it is stuck too. Giving each channel its own lane means a backed-up SMS provider never delays a password-reset email. Each channel scales and fails on its own.
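The highway analogy can be made concrete with a toy simulation. The sketch below assumes a strict FIFO consumer that stops at the first job it cannot deliver; the `drain` helper and the job shapes are hypothetical and exist only to contrast one shared queue with one queue per channel.

```python
from collections import deque


def drain(queue, provider_up):
    """Deliver jobs in FIFO order; stop at the first job whose provider is down."""
    delivered = []
    while queue:
        job = queue[0]
        if not provider_up[job["channel"]]:
            break  # head-of-line blocking: everything behind this job waits
        delivered.append(queue.popleft()["id"])
    return delivered


jobs = [
    {"id": "sms-1", "channel": "sms"},
    {"id": "email-1", "channel": "email"},
    {"id": "push-1", "channel": "push"},
]
provider_up = {"sms": False, "email": True, "push": True}  # SMS provider is stalled

# One shared lane: the stalled SMS job blocks the email and push behind it.
shared_delivered = drain(deque(jobs), provider_up)

# One lane per channel: only the SMS lane is stuck.
per_channel = {}
for job in jobs:
    per_channel.setdefault(job["channel"], deque()).append(job)
isolated_delivered = []
for channel in ("email", "sms", "push"):
    isolated_delivered += drain(per_channel[channel], provider_up)

print(shared_delivered, isolated_delivered)
```

In the shared case nothing is delivered while SMS is down. With separate queues the email and push jobs go out and only the SMS job waits.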
Two decisions carry the design. First, preferences live in one service, not in each producer, so a user's "no SMS at night" rule is honored no matter which team triggered the message, and those read-heavy prefs get cached in Redis to keep the hot path fast. Second, the producer's call returns the instant the jobs are queued, so its latency is never tied to Apple's push servers. Delivery is at-least-once with idempotency keys, because a rare duplicate is far better than a missing password reset.

This system teaches asynchronous fan-out through a queue, per-channel isolation and independent scaling, centralized user preferences, and the build-vs-buy case for outsourcing delivery to providers like SendGrid and Twilio.
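At-least-once delivery with idempotency keys and retry with backoff might look like the following for a channel worker. It is a minimal sketch under stated assumptions: `deliver_once`, `ProviderError`, the key format, and the in-memory `sent_keys` set are all hypothetical, and a real worker would keep seen keys in a shared store and call an actual provider client instead of the fake `flaky_send`.

```python
import time


class ProviderError(Exception):
    """Raised by the (hypothetical) provider client on a transient failure."""


def deliver_once(job, send, sent_keys, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Send a job unless its idempotency key was already handled; retry with backoff."""
    key = job["idempotency_key"]
    if key in sent_keys:
        return "duplicate"  # the queue redelivered a job we already sent
    for attempt in range(max_attempts):
        try:
            send(job)
        except ProviderError:
            if attempt == max_attempts - 1:
                return "failed"  # give up, e.g. hand off to a dead-letter queue
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1x, 2x, 4x, ...
            continue
        sent_keys.add(key)
        return "sent"
    return "failed"


calls = []


def flaky_send(job):
    """Fake provider that fails twice, then succeeds."""
    calls.append(job["idempotency_key"])
    if len(calls) < 3:
        raise ProviderError("provider timeout")


delays = []  # record backoff delays instead of actually sleeping
sent_keys = set()
job = {"idempotency_key": "notif-42:email", "to": "user@example.com"}
first = deliver_once(job, flaky_send, sent_keys, sleep=delays.append)
second = deliver_once(job, flaky_send, sent_keys, sleep=delays.append)
print(first, second, delays)
```

The first call succeeds on its third attempt, and the redelivered copy is recognized and skipped. A crash between the send and recording the key could still produce a duplicate, which is the trade-off at-least-once delivery accepts.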