← Concepts
Sharding·3 min read

Consistent Hashing

Distribute keys across N nodes such that adding or removing a node only reshuffles ~1/N of the keys.

First time reading this? Start here

Plain English: a way of dividing data across servers so that adding or removing a server only shuffles a small slice of the data, instead of forcing every server to rebuild its cache.

Used in:Distributed CacheAPI Rate LimiterInstagram Feed
What it is

A hashing scheme that maps both keys and nodes onto the same ring. Picture servers as seats around a round table. A key is owned by the next seat clockwise from where it lands. Adding or removing a seat only changes ownership of the keys near it, and everything else stays put.

The problem it solves

Regular modular hashing (key % N) seems fine until you add a node. Suddenly N becomes N+1 and almost every key gets remapped to a different node. For a cache, that means a near-total cache miss. For a sharded DB, it means a large rebalance. Both are operational nightmares. Consistent hashing reduces the churn from ~100% to ~1/N.

How it works

Hash each node id onto a fixed ring (typically 0 to 2^32). To find the owner of a key, hash the key the same way and walk clockwise until you hit a node. Adding a node means inserting it on the ring, so only keys between the new node and its anticlockwise neighbor move. In practice, each physical node is mapped to many 'virtual nodes' so the load spreads more evenly.

Why use it
What it costs you
Where it shows up in our architectures
Gotchas

Your notes

Private to you