Search & Discovery

How users find things: from the instant suggestions in a search box to the inverted index behind full-text search, location-based ranking, and the crawler that feeds it all.

After this path you will be able to

Design a complete search system end to end: from crawling and indexing content to serving ranked results with sub-100ms latency, including typeahead suggestions and geo-filtered queries.

Interview approach for this path

1. Clarify the search domain: full-text (inverted index), geo (spatial index), or structured faceted search? Different problems, different data structures.
2. Explain how data gets into the index: is there a crawler, a pipeline, or do writes flow directly? Address the indexing lag.
3. Describe your query pipeline: tokenization, normalization, and ranking signals. Interviewers want to hear that relevance is not just keyword matching.
4. For typeahead, explain the data structure: a trie or prefix index in Redis, served from a dedicated low-latency tier, not the main search index.
5. Address scale: a single Elasticsearch node won't handle billions of documents, so explain how you'd shard the index and why.
6. Mention caching: popular queries are cacheable for seconds or minutes, and caching at the search tier dramatically reduces index load.
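
To make steps 3 and 4 concrete, here is a minimal in-memory sketch, not a production design. It assumes a naive tokenizer, a simple TF-IDF sum as the only ranking signal, and a sorted term list standing in for the trie or Redis prefix index. The names `tokenize`, `TinyIndex`, and `suggest` are hypothetical and exist only for this illustration.

```python
import bisect
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a deliberately naive normalizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        self.doc_count += 1
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf

    def search(self, query):
        """Rank documents by a simple TF-IDF sum over the query terms."""
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Rarer terms get a higher weight than common ones.
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        # Highest score first; doc_id breaks ties deterministically.
        return sorted(scores, key=lambda d: (-scores[d], d))


def suggest(sorted_terms, prefix, limit=5):
    """Prefix lookup over a sorted list, standing in for a trie or prefix index."""
    prefix = prefix.lower()
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:start + limit]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


index = TinyIndex()
index.add(1, "Cheap pizza near the station")
index.add(2, "Pizza, pasta and more pizza")
index.add(3, "Station parking")

print(index.search("pizza"))                  # [2, 1]
print(suggest(sorted(index.postings), "p"))   # ['parking', 'pasta', 'pizza']
```

Document 2 outranks document 1 for "pizza" because the term appears twice there. In a real system the suggestion list would live in its own low-latency tier and be ranked by query popularity rather than alphabetically; the sorted list here only shows the prefix-range lookup.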

Systems in this path

4 total

Concepts reinforced throughout

Bloom Filter, Caching, Consistent Hashing, Database Indexes, Sharding / Data Partitioning

Up next

Reliability & Resilience

Systems where failure is not an option: schedulers that never drop a job, payment flows that cannot charge twice, and a digital wallet that never loses or doubles money mid-transfer.
