A distributed lock seems simple: one process holds it, others wait. In practice, clocks drift, processes pause, and networks lie, which makes every simple lock scheme subtly broken.
Every engineer eventually needs to ensure only one process does something at a time across a distributed system: only one cron job fires, only one payment processes, only one user modifies a resource. The instinct is to reach for a distributed lock. Redis has Redlock. ZooKeeper has ephemeral nodes. etcd has leases. They all work, until they don't, in ways that are genuinely hard to reason about.
Here is a common pattern: acquire a Redis lock with a TTL, do some work, release the lock. If the process crashes, the TTL ensures the lock expires and another process can proceed. This works fine under normal conditions. It breaks in three specific ways that are easy to dismiss as unlikely and are actually guaranteed to happen at scale.
A process acquires a lock with a 30-second TTL. It does some work. Then the JVM stops the world for garbage collection, or the OS swaps the process out, or the network call blocks longer than expected. When the process resumes, it checks: do I still hold the lock? The TTL has not expired yet (from its perspective), so it proceeds. But it has been paused for 31 seconds. The lock expired, another process acquired it, and now two processes are inside the critical section simultaneously. The TTL-expiry approach provides no protection against arbitrary process pauses. This is not theoretical: JVM GC pauses of 20-60 seconds have been observed in production under memory pressure.
Redlock (Redis's recommended distributed lock algorithm) works by acquiring locks on a majority of Redis nodes. It uses wall-clock time to determine whether a lock is still valid. If the clocks on two Redis nodes drift by more than a few seconds (which happens with NTP synchronization lag, VM clock jumps, or leap second handling), a lock that has expired from one node's perspective has not expired from another's. The result is two lock holders simultaneously, which is exactly what the lock is supposed to prevent. This is why Leslie Lamport's Paxos and Google's Chubby use logical clocks, not wall clocks.
A process holds a lock on a Redis primary. The primary crashes. Redis fails over to a replica. The replica has not yet replicated the lock entry (async replication). The new primary has no record of the existing lock, and a second process acquires it. Two lock holders, again. This is a consequence of Redis's default async replication: you get high availability OR lock safety, not both. Using WAIT to enforce synchronous replication before acknowledging the lock acquisition is the mitigation, at the cost of higher write latency.
The only mechanism that correctly handles process pauses is a fencing token: a monotonically increasing number returned with every lock acquisition. The lock server increments the token on every grant. When the lock holder performs a write (to a database, a file, an external API), it passes the token. The recipient rejects any write with a token lower than the highest token it has seen. Even if a paused process wakes up and tries to proceed, its stale token is rejected. The actual current lock holder's writes go through. This is how Chubby and etcd leases work in production-grade distributed systems.
Distributed locks are a synchronization primitive borrowed from single-machine programming. In distributed systems, they're often a sign that the data model needs rethinking. If you need a lock to prevent duplicate processing, use an idempotency key on the operation instead. If you need a lock to prevent conflicting writes, use optimistic concurrency (compare-and-swap) in the database instead. If you need a lock to elect a leader, use Raft or ZooKeeper, which are built specifically for this with the correct semantics. Reach for a distributed lock only after you've ruled out these alternatives.