Idempotency

An operation you can safely apply more than once and end up in the same state as applying it once: the foundation of every retryable system.

Clients retry when a response goes missing. Without an idempotency key, a retried "charge card" request runs twice. With one, the server remembers the key, so the duplicate returns the original result instead of doing the work again. This is essential for anything that moves money or sends messages.
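A minimal sketch of the server side, assuming an in-memory dictionary as the key store. `PaymentServer`, `charge`, and the response fields are hypothetical names for illustration, not any real payment API; a production version would need durable storage and protection against two concurrent requests carrying the same key.

```python
import uuid


class PaymentServer:
    """Toy in-memory server; a real one would persist keys durably."""

    def __init__(self):
        self.responses = {}      # idempotency key -> response already returned
        self.charge_count = 0    # how many times the card was actually charged

    def charge(self, amount_cents, idempotency_key=None):
        # A retry carrying a known key gets the stored response back.
        if idempotency_key is not None and idempotency_key in self.responses:
            return self.responses[idempotency_key]

        # First time we see this key (or no key at all): do the real work.
        self.charge_count += 1
        response = {"charge_id": f"ch_{self.charge_count}", "amount_cents": amount_cents}

        if idempotency_key is not None:
            self.responses[idempotency_key] = response
        return response


server = PaymentServer()

# Without a key, the retry is indistinguishable from a new charge.
server.charge(1000)
server.charge(1000)
print(server.charge_count)  # 2

# With a key, the retry returns the original response.
key = str(uuid.uuid4())
first = server.charge(1000, idempotency_key=key)
retry = server.charge(1000, idempotency_key=key)
print(first == retry, server.charge_count)  # True 3
```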

First time reading this? Start here

Plain English: 'set my balance to $100' is idempotent, so running it twice still leaves the balance at $100. 'Add $10 to my balance' is NOT, so running it twice adds $20 instead of $10. Networks lose responses constantly, so any operation that matters has to be designed so a retry doesn't break things.

Used in: Payment Gateway, Apache Kafka, Notification System

What it is

A property of an operation: F(x) == F(F(x)) == F(F(F(x))). Applying it once or many times leaves the same state. 'Set balance to $100' is idempotent. 'Add $10 to balance' is not: retry it and the balance moves by $20 instead of $10. Idempotency is what makes networks tolerable. Networks lose responses, timeouts lie, retries happen, and without idempotency, every retry is a potential bug.
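The definition can be checked directly. This is a small sketch with hypothetical function names (`set_balance`, `add_to_balance`, `is_idempotent_at`); it tests the property at a single starting value, which can illustrate idempotency but not prove it for every input.

```python
def set_balance(balance, amount=100):
    """Idempotent: the result ignores the previous balance."""
    return amount


def add_to_balance(balance, amount=10):
    """Not idempotent: every application moves the balance again."""
    return balance + amount


def is_idempotent_at(f, x):
    """Check F(x) == F(F(x)) for one starting value x."""
    return f(x) == f(f(x))


print(is_idempotent_at(set_balance, 40))     # True
print(is_idempotent_at(add_to_balance, 40))  # False
```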

The problem it solves

Every distributed system has at-least-once delivery somewhere. A request times out, the client retries, the server processes both, and now what? Without idempotency, you get duplicate charges, doubled inventory adjustments, duplicate emails. With idempotency, the second request is a no-op or returns the cached result.

How it works

Three common patterns: (1) Idempotency keys, where the client sends a unique key with each request; the server records it and returns the cached response on retry (Stripe's model). (2) Natural idempotence, where you design the operation so it's inherently safe to repeat (HTTP PUT, upserts, 'set' instead of 'add'). (3) Versioned writes, where you include the version you expect to overwrite; the first write bumps the version, so a retry fails the optimistic-lock check and changes nothing.
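Versioned writes are the least obvious of the three patterns, so here is a rough sketch of the idea. The `Account` class, its `add` method, and the `VersionConflict` exception are assumptions made for illustration; a real system would typically enforce the check inside the database, for example with a conditional update.

```python
class VersionConflict(Exception):
    """Raised when the expected version no longer matches the stored one."""


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.version = 0

    def add(self, amount, expected_version):
        # Apply the write only if nobody has written since the caller read.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{self.version}"
            )
        self.balance += amount
        self.version += 1


account = Account(balance=100)
seen = account.version           # client reads version 0

account.add(10, expected_version=seen)      # first attempt applies
try:
    account.add(10, expected_version=seen)  # retry carries the stale version
except VersionConflict:
    pass                                    # rejected, balance untouched

print(account.balance, account.version)  # 110 1
```

Note that the version check turns a non-idempotent 'add' into something safe to retry, but the client cannot tell its own retry apart from a conflict caused by another writer. It has to re-read the record before deciding what to do next.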

Why use it

Makes retries safe, so the entire failure-handling story gets simpler
Lets you use at-least-once delivery (cheap) instead of exactly-once (very expensive)
Standard pattern in payments, queues, webhooks, every reliable API

What it costs you

Storage cost: idempotency keys have to be remembered for some window (24h, 7d, forever)
Easy to get wrong: partial idempotency (some side effects deduplicated, others not) is worse than none, because the system looks retry-safe while still duplicating work
Requires discipline on the client: keys must be unique per logical operation and reused across every retry of it, not regenerated per attempt
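The client-side discipline is easiest to see by comparing the two behaviours. The sketch below assumes a hypothetical `FlakyServer` that performs the charge but loses its first response, and a hypothetical `charge_with_retries` helper; neither corresponds to a real client library.

```python
import uuid


class FlakyServer:
    """Hypothetical server that does the work but loses the first response."""

    def __init__(self):
        self.responses = {}
        self.charge_count = 0
        self.calls = 0

    def charge(self, amount_cents, idempotency_key):
        self.calls += 1
        if idempotency_key not in self.responses:
            self.charge_count += 1
            self.responses[idempotency_key] = {"charge_id": f"ch_{self.charge_count}"}
        if self.calls == 1:
            # The charge happened, but the client never hears about it.
            raise TimeoutError("response lost")
        return self.responses[idempotency_key]


def charge_with_retries(server, amount_cents, new_key_per_attempt, attempts=3):
    key = str(uuid.uuid4())              # one key per logical operation
    for _ in range(attempts):
        if new_key_per_attempt:
            key = str(uuid.uuid4())      # the bug: a fresh key on every attempt
        try:
            return server.charge(amount_cents, idempotency_key=key)
        except TimeoutError:
            continue
    raise RuntimeError("all attempts failed")


good = FlakyServer()
charge_with_retries(good, 1000, new_key_per_attempt=False)
print(good.charge_count)  # 1

bad = FlakyServer()
charge_with_retries(bad, 1000, new_key_per_attempt=True)
print(bad.charge_count)   # 2
```

Generating the key outside the retry loop is the whole fix: the server can only deduplicate attempts that present the same key.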

Where it shows up in our architectures

Payment Gateway
Idempotency keys on every charge request, Stripe-style. Duplicate POSTs return the original response, so there is no double-charge
Apache Kafka
At-least-once delivery is the default. Consumers must be idempotent: processing the same message twice must have no additional effect
Notification System
Send-notification accepts an idempotency key so a flaky publisher's retries don't spam the user
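An idempotent consumer under at-least-once delivery might look roughly like this. The message shape (`id`, `sku`, `delta`) and the `InventoryConsumer` class are assumptions for illustration, and the sketch uses plain Python rather than any real Kafka client API.

```python
class InventoryConsumer:
    """Consumer that tolerates redelivery by remembering processed message IDs."""

    def __init__(self):
        self.stock = {}
        self.processed_ids = set()

    def handle(self, message):
        # Redelivered message: acknowledge it, but do not apply it again.
        if message["id"] in self.processed_ids:
            return False
        sku, delta = message["sku"], message["delta"]
        self.stock[sku] = self.stock.get(sku, 0) + delta
        self.processed_ids.add(message["id"])
        return True


consumer = InventoryConsumer()
deliveries = [
    {"id": "m1", "sku": "widget", "delta": -2},
    {"id": "m2", "sku": "widget", "delta": 5},
    {"id": "m1", "sku": "widget", "delta": -2},  # redelivery of m1
]
for message in deliveries:
    consumer.handle(message)

print(consumer.stock)  # {'widget': 3}
```

In a real consumer, the stock update and the record of the processed ID would need to commit together (for example in one database transaction); otherwise a crash between the two steps reintroduces the duplicate.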

Gotchas

'Idempotent' means the *end state* is the same on retry, not necessarily that the second call returns the same response. (POST with an idempotency key usually returns the cached response; PUT just overwrites.)
Side effects beyond the database (emails, webhooks, queue publishes) need their own deduplication; DB idempotency alone doesn't cover them.
Idempotency keys must outlive the retry window. If your client retries for 24h but you only remember keys for 1h, you'll process a 'duplicate' as a new request.
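Time-based keys are a bug magnet: two operations in the same clock tick collide, and a retry that regenerates the timestamp looks like a new request. Use a UUID per logical operation, not a timestamp.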
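The key-lifetime gotcha can be reproduced with a small sketch. `ExpiringKeyStore` is a hypothetical class, and the current time is passed in explicitly (an assumption made so the example is deterministic) instead of being read from a clock.

```python
class ExpiringKeyStore:
    """Remembers idempotency keys for ttl_seconds, then forgets them."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.seen = {}  # key -> time the key was first recorded

    def is_duplicate(self, key, now):
        first_seen = self.seen.get(key)
        if first_seen is not None and now - first_seen < self.ttl_seconds:
            return True
        # Unknown or expired key: record it and treat the request as new.
        self.seen[key] = now
        return False


HOUR = 3600
store = ExpiringKeyStore(ttl_seconds=1 * HOUR)

print(store.is_duplicate("op-123", now=0))           # False: first attempt
print(store.is_duplicate("op-123", now=0.5 * HOUR))  # True: retry inside the window
print(store.is_duplicate("op-123", now=2 * HOUR))    # False: key expired, processed again
```

The last call is the failure mode: the client is still retrying the same logical operation, but the server has forgotten the key and does the work a second time.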

When this went wrong in production

Stripe double-charges thousands of customers · 2016

A race condition in charge creation caused duplicate charges when clients retried on a slow response.

Stripe's charge API occasionally returned a timeout to clients. The HTTP connection dropped before the response arrived, even though the charge had already been created on Stripe's side. Well-behaved clients, following Stripe's own retry guidance, retried the request. Without idempotency keys, Stripe's backend treated the retry as a new charge and created a second one. Thousands of customers were double-billed before the incident was caught.

Stripe rolled out idempotency key enforcement as a first-class API primitive: clients send a unique key per intended charge, and the backend deduplicates on that key no matter how many times the request arrives.

The lesson: any operation that charges money, sends a message, or has real-world side effects must be idempotent end-to-end. Timeouts aren't errors; they're ambiguous. Design your API for that ambiguity.

Interview angle

Idempotency comes up in any payment or critical-write scenario. The interviewer wants you to say 'at-least-once delivery is fine as long as the consumer is idempotent' rather than trying to build exactly-once delivery (which is extremely expensive). Name the pattern: idempotency key sent by the client, stored by the server, with the result cached for the retry window. Candidates lose points by treating retries as a network problem to solve rather than a data design problem to absorb.
