Reqflow
← All concepts
Sharding·3 min read

Sharding / Data Partitioning

Split one big dataset across N smaller stores so each one stays manageable.

Try it

Add users. Each is routed to a shard by the first letter of the name.

Shard 1A–H
0 rows
Shard 2I–Q
0 rows
Shard 3R–Z
0 rows

Sharding splits one giant table across machines by a shard key (here, the name's first letter). Each shard holds a slice of the data, so reads and writes spread out instead of hammering one box. The catch: queries that span shards (or a badly chosen key) get painful.

First time reading this? Start here

Plain English: when one database gets too big to fit on one machine, you slice the data into chunks (by user, by region, by some key) and put each chunk on its own machine. Each shard handles just its slice.

What it is

Horizontal partitioning of a single logical dataset across multiple physical stores. Each shard holds a slice of the keyspace; queries are routed to the right shard based on a partition key.

The problem it solves

A single database eventually hits a wall on IOPS, RAM, CPU, or storage. Sharding lets you go past that wall by adding more boxes, each handling a fraction of the data and load.

How it works

Pick a partition key (user_id, account_id, geo region) and a partition strategy: hash-based (key % N or consistent hashing, for even distribution), range-based (alphabetical, date ranges, giving locality but risk of hot ranges), or directory-based (a lookup table, flexible but adds a hop). All reads/writes route to the right shard.

Why use it

  • Scales beyond a single box's limits
  • Failure of one shard affects only its slice, not the whole dataset
  • Per-shard backups and migrations are smaller and faster

What it costs you

  • Cross-shard joins / transactions are painful and usually avoided by design
  • Resharding (changing N) is operationally expensive without consistent hashing
  • Hot shards happen: a celebrity user can saturate the shard that owns them

Where it shows up in our architectures

  • Instagram Feed

    Posts sharded by user_id; timeline cache sharded by user_id too

  • WhatsApp

    Message store (Cassandra) partitioned by conversation_id

  • Uber

    Geo index sharded by H3 cell; trips sharded by city

Gotchas

  • Pick the partition key carefully; it's the hardest decision to reverse. The key should distribute load evenly AND match how you query.
  • Avoid cross-shard queries. If you need them often, you picked the wrong key.
  • Plan for resharding from day one. Even consistent hashing has some cost; range-based sharding is a nightmare to rebalance.
When this went wrong in production

AWS S3 us-east-1 melts the internet · 2017

Postmortem ↗

One typo in a routine S3 maintenance command took down half the internet for 4 hours.

An engineer ran a debug subcommand to remove a small number of capacity servers from S3 us-east-1. A typo expanded the scope to a much larger set, including servers running the index subsystem and placement subsystem. S3 lost the index → every read started failing. Cascading failure: every AWS service that depended on S3 (which was most of them: Lambda, ECS, CloudWatch, even the AWS Console) degraded. Took 4+ hours to restart the index subsystem because it hadn't been restarted at scale in years; the cold-start path itself was the bottleneck. The lesson: capacity-management commands need scope validation, AND your critical recovery paths need to be exercised regularly so they don't atrophy.

Interview angle

Sharding questions test whether you understand partition key selection, because the key is essentially irreversible once data is live. Say upfront that you'd pick the key based on the dominant query pattern, then immediately address hotspots (celebrity users, popular items) and how you'd handle them. Candidates who propose sharding without naming a key and explaining why it distributes evenly are raising more red flags than they're answering.

Your notes

Private to you