Pick exactly one node to coordinate, and re-pick safely when it dies, without ever ending up with two leaders.
Kill the leader. The remaining nodes elect a new one.
N1 is the leader for term 1.
Many systems need exactly one node in charge (to coordinate writes or assign work). When that leader dies, the survivors run an election and pick a new one in a higher term, so the cluster keeps a single coordinator without manual intervention. Algorithms like Raft formalize this.
Plain English: some jobs need exactly one node in charge (one writer, one scheduler). Leader election is how a cluster agrees on who that is, and how it picks a new boss when the current one dies, while guaranteeing you never accidentally get two bosses (split-brain), which corrupts data.
A coordination mechanism by which a set of nodes agree on a single 'leader' responsible for some exclusive role: accepting writes, assigning work, or coordinating others. It's a core building block implemented by consensus protocols (Raft, Paxos, ZAB) and coordination services (ZooKeeper, etcd).
Many tasks must be done by exactly one node: a single write primary, a single job scheduler, a single sequence generator. Hard-coding the leader makes it a single point of failure. Leader election lets the cluster choose a leader dynamically and, critically, elect a new one when the leader fails, without two nodes both believing they're in charge (split-brain), which causes divergent writes and data corruption.
Nodes detect leader failure via heartbeats. When the leader's heartbeats stop for longer than a timeout, one or more followers become candidates and start an election in a new, higher term. Consensus protocols require a candidate to win votes from a majority quorum (floor(N/2) + 1 of N nodes) before becoming leader. This is what prevents two leaders in the same term: each node votes at most once per term, and any two majorities share at least one node, so two candidates can't both collect a majority. Every leadership period is tagged with a monotonically increasing 'term'/'epoch' number; stale leaders that come back are fenced off because their lower epoch is rejected. Coordination services expose this as ephemeral nodes or leases: hold the lease (renewed via heartbeat) and you're leader; lose it and someone else takes over.
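The majority rule can be sketched in a few lines. This is a minimal illustration, not Raft itself: the Node class, request_vote, quorum, and run_election are hypothetical names, and real protocols add further checks (for example, Raft also compares log freshness before granting a vote).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A voter that grants at most one vote per term (hypothetical model)."""
    name: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def request_vote(self, candidate: str, term: int) -> bool:
        # Reject candidates campaigning in a stale term.
        if term < self.current_term:
            return False
        # Seeing a newer term resets this node's vote.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # Grant the vote only if it is still free (or already given to this candidate).
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False


def quorum(cluster_size: int) -> int:
    """Smallest majority: floor(N/2) + 1."""
    return cluster_size // 2 + 1


def run_election(candidate: str, term: int, reachable: list, cluster_size: int) -> bool:
    """Return True if the candidate wins votes from a majority of the whole cluster."""
    votes = sum(1 for node in reachable if node.request_vote(candidate, term))
    return votes >= quorum(cluster_size)


nodes = [Node(f"N{i}") for i in range(1, 6)]  # N1..N5; assume N1 led term 1 and died
# N2 reaches N2, N3, N4: three votes out of five, so it wins term 2.
n2_wins = run_election("N2", 2, nodes[1:4], cluster_size=5)
# N5 competes in the same term but reaches only N5, N1, and N2 (who already voted).
n5_wins = run_election("N5", 2, [nodes[4], nodes[0], nodes[1]], cluster_size=5)
```

Because N2 already spent its term-2 vote, N5 collects only two votes and cannot reach the quorum of three. Note that the quorum is computed from the full cluster size, not from the number of nodes a candidate happens to reach; that is what stops a minority partition from electing its own leader.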
In Kafka, each partition has a leader replica that handles all writes and, by default, all reads; when its broker fails, a new leader is chosen from the in-sync replicas.
In a distributed cache, nodes use ZooKeeper or etcd to elect a coordinator for ring membership and config; the other cache nodes follow the elected leader's view.
In an exchange, the single active matching engine per symbol is the leader; a hot standby is promoted via election on failure so that there is always exactly one authoritative order book.
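Systems like these often lean on the lease pattern that coordination services provide. A minimal sketch of that idea, assuming a hypothetical in-memory LeaseTable and a caller-supplied clock value; it does not use the real ZooKeeper or etcd APIs, and the node names are illustrative.

```python
from typing import Optional


class LeaseTable:
    """Hypothetical single-key lease: whoever holds an unexpired lease is leader."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def acquire_or_renew(self, node: str, now: float) -> bool:
        # The lease is free if nobody holds it or the holder stopped renewing in time.
        expired = self.holder is None or now >= self.expires_at
        if expired or self.holder == node:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False


lease = LeaseTable(ttl=5.0)
a_leads = lease.acquire_or_renew("broker-a", now=0.0)       # broker-a becomes leader
b_blocked = lease.acquire_or_renew("broker-b", now=3.0)     # lease still held by broker-a
a_renews = lease.acquire_or_renew("broker-a", now=4.0)      # heartbeat extends the lease to t=9
b_takes_over = lease.acquire_or_renew("broker-b", now=9.5)  # broker-a went silent; lease expired
```

A lease on its own only decides who may lead. A holder that is paused or partitioned may not notice its lease has expired, which is why the lease is usually paired with an epoch or fencing token that downstream resources check.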
Leader election comes up whenever you have a single-writer system or a distributed scheduler. The key insight to convey is that timeouts alone can't give you safe leader election: a partitioned node that can't hear from anyone may still believe it's the leader while the rest of the cluster promotes a replacement. You must use a quorum-based protocol (Raft) or a coordination service (etcd, ZooKeeper). Mention fencing tokens to show you know how to handle a 'zombie leader' that comes back from a partition. Candidates who say 'just use a heartbeat and promote on timeout' should expect immediate follow-up questions about split-brain that this design can't answer.
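To make the fencing-token point concrete, here is a minimal sketch. It assumes a hypothetical FencedStore that remembers the highest token it has seen, and that each newly elected leader is issued a larger token (for instance, its term number).

```python
class FencedStore:
    """Hypothetical storage service that rejects writes carrying a stale fencing token."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.data = {}

    def write(self, token: int, key: str, value: str) -> bool:
        # A token lower than one already seen belongs to a deposed leader.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
old_leader_token = 1  # issued when the first leader was elected
new_leader_token = 2  # issued by the election that replaced it

new_ok = store.write(new_leader_token, "balance", "100")    # accepted
zombie_ok = store.write(old_leader_token, "balance", "0")   # rejected: stale token
```

The important design choice is that the check lives in the resource being protected, not in the leader. The zombie leader cannot be trusted to know it has been replaced, so the store enforces the ordering itself.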