Learning path
Database & Storage Deep Dive
Intermediate · ~93 min · 4 systems
From a distributed cache to an object store to a global event log: how data is stored, replicated, and retrieved at scale across fundamentally different storage engines.
After this path you will be able to
Explain the differences among a cache, a log, an object store, and a relational database, and choose the right one for a workload based on its access patterns, consistency requirements, and scale.
Interview approach for this path
1. Start by characterizing the workload: read-heavy vs write-heavy, random vs sequential access, hot spots or uniform distribution.
2. Pick the right storage engine and explain why: relational for joins and transactions, key-value for sub-millisecond lookups, wide-column for time-series writes, object store for large blobs.
3. Address replication upfront: how many copies, sync or async, and what is the RPO if the primary dies?
4. Explain your sharding strategy: what is the partition key, why does it distribute load evenly, and how do you handle hot keys?
5. Discuss consistency requirements: does every read need the latest write, or is eventual consistency acceptable for this workload?
6. Address indexes: which columns, what type (B-tree vs LSM), and what is the write amplification cost?
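The sharding question in the list above (does the partition key distribute load evenly, and what happens with a hot key?) can be made concrete with a small measurement. The following is a minimal sketch, not a production design: the function names (`partition_for`, `skew`, `salt_hot_key`), the partition count of 8, the fanout of 16, and the synthetic `user-N` keys are all assumptions made for illustration. It hashes each request key to a partition, reports how far the busiest partition sits above the mean, and shows one common mitigation, key salting.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    digest is used instead to keep the mapping reproducible.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def skew(keys, num_partitions: int) -> float:
    """Busiest partition's request count divided by the mean count.

    1.0 means perfectly even; larger values mean one partition is hot.
    """
    load = Counter(partition_for(k, num_partitions) for k in keys)
    mean = len(keys) / num_partitions
    return max(load.values()) / mean


def salt_hot_key(requests, hot_key: str, fanout: int):
    """Rewrite each request for hot_key to one of `fanout` sub-keys, round-robin."""
    salted, seen = [], 0
    for key in requests:
        if key == hot_key:
            salted.append(f"{key}#{seen % fanout}")
            seen += 1
        else:
            salted.append(key)
    return salted


NUM_PARTITIONS = 8  # assumed cluster size for the example

# 10,000 distinct keys, one request each: roughly uniform load.
uniform = [f"user-{i}" for i in range(10_000)]
# The same traffic plus 10,000 extra requests for a single hot key.
hot = uniform + ["user-42"] * 10_000
# The hot key's requests spread across 16 salted sub-keys.
salted = salt_hot_key(hot, "user-42", fanout=16)

print(f"uniform skew: {skew(uniform, NUM_PARTITIONS):.2f}")
print(f"hot-key skew: {skew(hot, NUM_PARTITIONS):.2f}")
print(f"salted skew:  {skew(salted, NUM_PARTITIONS):.2f}")
```

Salting trades read complexity for write distribution: a reader that needs the full value for the hot key has to fetch and merge all of its sub-keys, so it suits counters and append-style data better than single-value lookups.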
Systems in this path
4 total
1. Distributed Cache · Advanced · 25 min
   Consistent hashing ring, replication, hot key handling.
2. Real-time Gaming Leaderboard · Intermediate · 18 min
   Redis sorted sets for O(log n) rank queries; durable points in MySQL; top-N, your rank, players around you.
3. Amazon S3 (Object Storage) · Advanced · 25 min
   Multi-AZ erasure coding, sharded metadata, strong read-after-write.
4. Apache Kafka · Advanced · 25 min
   Partitioned, replicated log. Brokers, ISR, consumer groups, leader failover.
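The consistent hashing ring named in the Distributed Cache entry can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used by any of the systems in this path: the `HashRing` class, the 100 virtual nodes per server, the MD5-based position function, and the `cache-a` style node names are all hypothetical. The property it demonstrates is that adding a node moves only the keys that the new node takes over, while every other key stays where it was.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Place a string on the ring using the first 64 bits of its MD5 digest."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hashing ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Insert `vnodes` positions for the node, skipping any collisions."""
        for i in range(self.vnodes):
            point = _position(f"{node}#{i}")
            if point not in self._owner:
                self._owner[point] = node
                bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        """Drop every position owned by the node."""
        for i in range(self.vnodes):
            point = _position(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def node_for(self, key: str) -> str:
        """Return the owner of the first position clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"session-{i}" for i in range(1_000)]

before = {k: ring.node_for(k) for k in keys}
ring.add("cache-d")
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding cache-d")
```

With plain modulo hashing (`hash(key) % num_nodes`), going from three nodes to four would remap most keys; on the ring, the expected fraction that moves is about one in four, and every moved key lands on the new node.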
Concepts reinforced throughout