Store the answer to expensive questions so you don't pay to compute it again.
Plain English: when an answer is expensive to compute, save it somewhere fast (like memory) so the next person asking gets it instantly. The hard part isn't saving; it's knowing when to throw the saved copy away.
Keeping frequently-accessed data in a faster store (RAM, edge, in-process) so subsequent requests don't have to recompute or refetch from the slow source of truth.
Reading from disk or computing a complex query is orders of magnitude slower than reading from memory. For read-heavy workloads, most reads can be served from cache, cutting load on the slow store and improving p99 latency. The reason it works: a few items get requested constantly while most sit untouched, the way a few songs get played millions of times while the rest are barely touched. Keep the popular ones in memory and the slow store barely sees the traffic.
Choose a cache pattern: cache-aside (app reads cache, falls through to DB on miss, populates cache), write-through (every write goes through the cache to the DB), write-behind (writes hit cache; DB is updated async), refresh-ahead (proactively refresh hot entries before they expire). Set a TTL appropriate to staleness tolerance.
Cache-aside pattern: Read Service checks Redis, falls through to Postgres on miss
Precomputed timelines in Redis, so readers almost never hit the DB
Hot (geohash, query) pairs cached with short TTL
The whole system IS a cache