Learning path: Search & Discovery
Intermediate · ~79 min · 4 systems
How users find things: from the instant suggestions in a search box to the inverted index behind full-text search, location-based ranking, and the crawler that feeds it all.
After this path you will be able to
Design a complete search system end to end: from crawling and indexing content to serving ranked results with sub-100ms latency, including typeahead suggestions and geo-filtered queries.
Interview approach for this path
1. Clarify the search domain: full-text (inverted index), geo (spatial index), or structured faceted search? Different problems, different data structures.
2. Explain how data gets into the index: is there a crawler, an ingestion pipeline, or do writes flow directly? Address indexing lag, i.e. how long new content takes to become searchable.
3. Describe your query pipeline: tokenization, normalization, and ranking signals. Interviewers want to hear that relevance is not just keyword matching.
4. For typeahead, explain the data structure: a trie or prefix index in Redis, served from a dedicated low-latency tier, not the main search index.
5. Address scale: a single Elasticsearch node won't handle billions of documents, so explain how you'd shard the index and why.
6. Mention caching: popular queries can be cached for seconds or minutes, so repeated queries are answered at the search tier without hitting the index.
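To make steps 1 and 3 concrete, here is a minimal sketch of an in-memory inverted index with a tokenize, normalize, rank pipeline. It is an illustration under simplifying assumptions, not how a production engine works: the class name `InvertedIndex`, the regex tokenizer, and the plain TF-IDF scoring are all choices made for this example, and real systems add stemming, field weights, and many more ranking signals.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Normalize by lowercasing, then split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical single-node index: term -> {doc_id: term frequency}."""

    def __init__(self) -> None:
        self.postings: dict[str, dict[int, int]] = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id: int, text: str) -> None:
        # Record how often each term occurs in this document.
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf
        self.doc_count += 1

    def search(self, query: str, k: int = 10) -> list[tuple[int, float]]:
        scores: dict[int, float] = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue  # term appears in no document
            # Rarer terms get a higher weight (a simple IDF variant).
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        # Highest score first; ties broken by doc_id for deterministic output.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


index = InvertedIndex()
index.add(1, "Cheap pizza near the station")
index.add(2, "Best pizza in town")
index.add(3, "Train station timetable")
index.add(4, "Museum opening hours")
print(index.search("Pizza station"))
```

Document 1 ranks first because it matches both query terms; documents 2 and 3 each match one. In a sharded deployment (step 5), each shard would run a `search` like this over its own postings and a coordinator would merge the per-shard top-k lists.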
Systems in this path
4 total
1. Typeahead / Autocomplete (Intermediate · 18 min)
Precomputed trie cache in Redis, debounced prefix queries, batch + streaming frequency pipeline.
2. Search Engine (Advanced · 25 min)
Web crawl, inverted index, scatter-gather query, ranking.
3. Yelp (Location-Based Search) (Intermediate · 18 min)
Geohashing, nearby search, review writes, hot-query caching.
4. Web Crawler (Intermediate · 18 min)
URL frontier, politeness delays, Bloom-filter dedup, distributed fetching at billions of pages.
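As a taste of the Bloom-filter dedup mentioned in the Web Crawler entry, the sketch below shows how a crawler might decide whether a URL has already been queued. The names `BloomFilter` and `should_fetch`, the bit-array size, and the number of hash functions are assumptions picked for illustration; a real crawler would size the filter from its expected URL count and target false-positive rate, and would canonicalize URLs before checking them.

```python
import hashlib


class BloomFilter:
    """Hypothetical fixed-size Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all positions from two 64-bit slices of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so the stride is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Present only if every one of the item's bits is set.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def should_fetch(url: str, seen: BloomFilter) -> bool:
    """Return True the first time a URL is offered, False if it was probably seen before."""
    if url in seen:
        return False
    seen.add(url)
    return True


seen = BloomFilter()
for url in ["https://example.com/a", "https://example.com/b", "https://example.com/a"]:
    print(url, should_fetch(url, seen))
```

The trade-off is memory for certainty: the filter uses a fixed 128 KiB here regardless of how many URLs pass through, but a false positive means a never-seen page is occasionally skipped.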
Concepts reinforced throughout