Quorum

Require a majority of nodes (floor(N/2) + 1 out of N) to agree on every operation, so the system stays consistent even when a minority of nodes are down.

Plain English: when you have 5 copies of the data, don't trust any single one. Require at least 3 of them to agree on every read and every write. That way you can lose any 2 machines and the survivors still tell the truth.

Used in: WhatsApp, Distributed Cache
What it is

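A voting rule for replicated systems, and a building block of consensus protocols: an operation counts as successful only once enough replicas acknowledge it, typically a majority. Three numbers describe it: total replicas (N), read quorum (R), and write quorum (W). The classic rule R+W>N guarantees that every read quorum overlaps every write quorum, so a read always contacts at least one replica that holds the latest acknowledged write.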
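The overlap claim behind R+W>N can be checked by brute force for small clusters. The sketch below is only an illustration: the function name `quorums_always_overlap` is made up for this example, and it simply enumerates every possible read set and write set and tests whether they share a replica.

```python
from itertools import combinations


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every R-sized read set shares a replica with every W-sized write set."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


for n, r, w in [(5, 3, 3), (5, 2, 3), (3, 1, 3), (3, 1, 1)]:
    print(
        f"N={n} R={r} W={w}: "
        f"R+W>N is {r + w > n}, "
        f"overlap guaranteed is {quorums_always_overlap(n, r, w)}"
    )
```

For each configuration the two booleans agree: overlap is guaranteed exactly when R+W>N. With N=5, R=2, W=3, for instance, a read of replicas {0, 1} can miss a write that landed on {2, 3, 4}.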

The problem it solves

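In a distributed system, nodes can fail or be partitioned. If you require all nodes to agree, any single failure halts you. If you require only one, you risk split-brain (two halves of a partition both claiming to be authoritative). A majority quorum strikes a balance: it stays consistent as long as fewer than half of the nodes are lost, that is, at most floor((N-1)/2) failures. Because two disjoint groups cannot both contain a majority, at most one side of a partition can keep operating.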
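The arithmetic is small enough to write down. The helper names below (`majority`, `tolerated_failures`, `sides_with_quorum`) are hypothetical and exist only for this sketch; it assumes a plain majority quorum over N equal-weight nodes.

```python
def majority(n: int) -> int:
    """Smallest group size that is more than half of n nodes."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can be lost while a majority is still reachable."""
    return n - majority(n)


def sides_with_quorum(partition_sizes: list[int]) -> list[int]:
    """Sizes of the partition sides that still hold a majority of the whole cluster."""
    n = sum(partition_sizes)
    return [size for size in partition_sizes if size >= majority(n)]


for n in (3, 4, 5, 6):
    print(f"N={n}: majority={majority(n)}, tolerated failures={tolerated_failures(n)}")

print(sides_with_quorum([3, 2]))  # [3]: only the 3-node side may proceed
print(sides_with_quorum([2, 2]))  # []: an even split leaves no side with a majority
```

Note that N=4 tolerates only one failure, the same as N=3, which is one reason odd cluster sizes are the usual choice.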

How it works

On write, send the update to all N replicas and wait for W acks before returning success. On read, query R replicas; the value with the latest version wins. If R+W>N, every read quorum shares at least one replica with every write quorum, so a read always sees the latest successfully acknowledged write. Raft and Paxos rely on majority quorums; Dynamo and Cassandra expose R and W as tunable consistency levels.
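A toy in-memory model makes the write and read paths concrete. Everything here is an assumption made for illustration: the `Replica` and `QuorumStore` classes are hypothetical, a single coordinator hands out version numbers, there is no networking or concurrency, and a write that misses its quorum is not rolled back. It is a sketch of the idea, not how any particular database implements it.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    version: int = 0
    value: Optional[str] = None
    up: bool = True


class QuorumStore:
    def __init__(self, n: int, r: int, w: int) -> None:
        self.replicas = [Replica() for _ in range(n)]
        self.r, self.w = r, w
        self.next_version = 1  # assumption: one coordinator assigns versions

    def write(self, value: str) -> bool:
        """Send to every replica; succeed only if at least W of them ack."""
        version = self.next_version
        self.next_version += 1
        acks = 0
        for replica in self.replicas:
            if replica.up:  # replicas that are down never see the write
                replica.version, replica.value = version, value
                acks += 1
        return acks >= self.w

    def read(self) -> Optional[str]:
        """Query R live replicas; the value with the highest version wins."""
        live = [replica for replica in self.replicas if replica.up]
        if len(live) < self.r:
            raise RuntimeError("read quorum unavailable")
        sample = random.sample(live, self.r)
        return max(sample, key=lambda replica: replica.version).value


store = QuorumStore(n=5, r=3, w=3)
store.write("a")                       # all five replicas hold "a"
store.replicas[0].up = False
store.replicas[1].up = False
second_write_ok = store.write("b")     # three acks, so the write succeeds
store.replicas[0].up = True            # replicas 0 and 1 return, still holding "a"
store.replicas[1].up = True
print(second_write_ok, store.read())   # True b
```

Only two replicas are stale, so any three chosen for the read include at least one that holds "b". With R=2 the same read could land on the two stale replicas and return "a", which is the case R+W>N rules out.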

When this went wrong in production

GitHub 24-hour partition · 2018

A 43-second network partition triggered 24 hours of data inconsistency.

A 43-second loss of network connectivity at GitHub's US East Coast site led the automated failover tooling to promote MySQL primaries in the US West data center while the US East databases still held writes that had never replicated west. When connectivity returned, each side had accepted writes the other lacked, leaving the divergent state that split-brain produces. GitHub chose consistency over availability: they ran the service in a degraded state for 24+ hours while they reconciled the diverged writes across clusters. The lesson: CAP isn't a textbook curiosity. By the time the partition heals, you've already made the C-vs-A choice. Your reconciliation strategy is your CAP choice expressed in code.
