Reqflow
Learning path
Intermediate · ~79 min · 4 systems

Search & Discovery

How users find things: from the instant suggestions in a search box to the inverted index behind full-text search, location-based ranking, and the crawler that feeds it all.

After this path you will be able to

Design a complete search system end to end: from crawling and indexing content to serving ranked results with sub-100ms latency, including typeahead suggestions and geo-filtered queries.

Interview approach for this path
  1. Clarify the search domain: full-text (inverted index), geo (spatial index), or structured faceted search? Different problems, different data structures.
  2. Explain how data gets into the index: is there a crawler, a pipeline, or do writes flow directly? Address the indexing lag.
  3. Describe your query pipeline: tokenization, normalization, and ranking signals. Interviewers want to hear that relevance is not just keyword matching.
  4. For typeahead, explain the data structure: a trie or prefix index in Redis, served from a dedicated low-latency tier, not the main search index.
  5. Address scale: a single Elasticsearch node won't handle billions of documents, so explain how you'd shard the index and why.
  6. Mention caching: popular queries are cacheable for seconds or minutes, and caching at the search tier dramatically reduces index load.
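To make the query pipeline in steps 1 and 3 concrete, here is a minimal sketch of an in-memory inverted index with naive tokenization, normalization, and TF-IDF-style scoring. The names `tokenize` and `InvertedIndex` are illustrative assumptions, not part of any system in this path, and a production ranker would combine many more signals than term frequency.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Normalize to lowercase and split on non-alphanumeric characters.

    This is a deliberately naive analyzer (assumption): no stemming,
    no stop-word removal, ASCII letters and digits only.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical single-node index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        # Assumes each doc_id is added once; re-indexing is out of scope here.
        self.doc_count += 1
        for term, tf in Counter(tokenize(text)).items():
            self.postings[term][doc_id] = tf

    def search(self, query, k=10):
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Rarer terms get a higher weight (a simple IDF variant).
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        # Highest score first; ties broken by doc_id for determinism.
        return sorted(scores, key=lambda d: (-scores[d], d))[:k]


index = InvertedIndex()
index.add(1, "Redis trie cache")
index.add(2, "Inverted index for full-text search")
index.add(3, "Search ranking, search relevance")
top = index.search("SEARCH")
```

Because the query goes through the same `tokenize` function as the documents, `"SEARCH"` matches `"search"`. In a sharded deployment, each shard would hold a structure like this for its slice of the documents, and a coordinator would merge the per-shard top-k lists.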

Systems in this path

4 total
  1. Typeahead / Autocomplete
    Intermediate · 18 min

    Precomputed trie cache in Redis, debounced prefix queries, batch + streaming frequency pipeline.

  2. Search Engine
    Advanced · 25 min

    Web crawl, inverted index, scatter-gather query, ranking.

  3. Yelp (Location-Based Search)
    Intermediate · 18 min

    Geohashing, nearby search, review writes, hot-query caching.

  4. Web Crawler
    Intermediate · 18 min

    URL frontier, politeness delays, Bloom-filter dedup, distributed fetching at billions of pages.
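
As one illustration of the techniques listed above, the Bloom-filter dedup mentioned for the Web Crawler might look roughly like the sketch below. The class `BloomFilter`, the helper `should_fetch`, and the default sizes are assumptions chosen for readability. A real crawler would size the filter from its expected URL count and target false-positive rate, and would canonicalize URLs before checking them.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two 64-bit halves
        # of one SHA-256 digest instead of running k separate hashes.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def should_fetch(url, seen):
    """Return True the first time a URL is offered, False afterwards.

    A false positive means a never-seen URL is occasionally skipped,
    which is the trade-off a crawler accepts for constant memory.
    """
    if url in seen:
        return False
    seen.add(url)
    return True


seen = BloomFilter()
first = should_fetch("https://example.com/a", seen)
second = should_fetch("https://example.com/a", seen)
```

Here `first` is `True` and `second` is `False`: the second offer of the same URL is rejected without storing the URL string itself, which is what makes the approach viable at billions of pages.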

Up next

Reliability & Resilience

Systems where failure is not an option: rate limiters that protect your service, schedulers that never drop a job, and payment flows that cannot charge twice.