Big Tech · January 24, 2026 · 7 min read · Original: Uber Engineering Blog

How Uber Computes Surge Pricing in Real Time Across Every City

Uber's dispatch and pricing systems need sub-second latency while reading live supply/demand across millions of driver and rider events per minute.

Tags: real-time, geospatial, pricing, stream processing

Surge pricing is one of the most computationally and politically interesting systems Uber runs. For every hex cell where Uber operates, it has to compute the current supply/demand ratio and translate that into a price multiplier in under a second, continuously. Here's the architecture that makes it work.

The unit of computation: geohex cells

Uber divides every city into hexagonal grid cells (using H3, their open-source geo-indexing library). Each cell is roughly the size of a city block. Surge is computed per cell: if cell X has 3 idle drivers and 20 open ride requests, demand far outstrips supply and that cell is in surge. Adjacent cells with different ratios may have different surge multipliers, which is why two pickup points a short walk apart can show different prices: they may fall on opposite sides of a cell boundary. The hex grid is the key insight: because each cell's multiplier depends only on that cell's own aggregates, Uber can parallelize surge computation across millions of cells independently.
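As a rough illustration of why per-cell state parallelizes so easily, here is a minimal Python sketch. The cell IDs, the `CellState` type, and the rule of treating zero drivers as one are all assumptions made for the example; a real system would derive cell IDs from H3 and would not necessarily use this exact ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Hypothetical per-cell snapshot; field names are illustrative."""
    idle_drivers: int
    open_requests: int


def demand_supply_ratio(state: CellState) -> float:
    # Assumption: treat zero idle drivers as one so the ratio stays finite.
    return state.open_requests / max(state.idle_drivers, 1)


# Placeholder cell IDs; real ones would be H3 indexes.
cells = {
    "cell_x": CellState(idle_drivers=3, open_requests=20),
    "cell_y": CellState(idle_drivers=10, open_requests=4),
}

# Each cell is evaluated using only its own state, so this loop could be
# split across any number of workers without coordination.
ratios = {cell_id: demand_supply_ratio(s) for cell_id, s in cells.items()}
```

Here `cell_x` (the 3 drivers, 20 requests case from the text) comes out heavily demand-constrained while its neighbor `cell_y` does not, even though neither computation looks at the other cell.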

The event stream

Every driver location ping (every 4 seconds), every rider app-open event, and every trip request and trip end flows through Kafka topics partitioned by geographic region. Uber's stream processing tier (Flink jobs) consumes these streams and maintains a rolling aggregate per hex cell: the number of available drivers, the number of open requests, and rolling averages over the last N minutes. The window size matters. Too short and you get noisy surge spikes that feel manipulative. Too long and the system is slow to respond to real demand.
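The rolling aggregate can be pictured with a small in-memory sliding window. This is a plain-Python sketch, not Flink code: the class name, the choice to count only request events, and the assumption that timestamps arrive in order per cell are all illustrative.

```python
from collections import defaultdict, deque


class RollingRequestCounter:
    """Counts request events per cell over the last `window_seconds`.

    Hypothetical helper; assumes timestamps arrive in order for each cell.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: dict[str, deque] = defaultdict(deque)

    def record(self, cell_id: str, timestamp: float) -> None:
        self._events[cell_id].append(timestamp)

    def count(self, cell_id: str, now: float) -> int:
        events = self._events[cell_id]
        cutoff = now - self.window_seconds
        # Evict events that have fallen out of the window.
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events)


counter = RollingRequestCounter(window_seconds=60)
for ts in (0, 30, 50):
    counter.record("cell_x", ts)

recent = counter.count("cell_x", now=55)   # all three events are in window
later = counter.count("cell_x", now=70)    # the event at t=0 has expired
```

Changing `window_seconds` is the knob the text describes: a small value makes the count jump with every burst of requests, while a large one keeps old demand in the total long after it has gone.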

Computing the multiplier

The surge multiplier is not just the supply/demand ratio. Uber applies a piecewise function tuned per city and time of day, capped at regulatory limits in some markets, and smoothed with hysteresis so prices don't oscillate wildly from second to second. The output (a float per hex cell) is written to a geospatially indexed read store (a variant of consistent hashing maps cell IDs to read replicas) so that any pricing query for a coordinate can resolve to the correct cell and read the latest multiplier with a single lookup.
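To make the three ingredients concrete (piecewise mapping, cap, hysteresis), here is a minimal sketch. The breakpoints, the cap, and the dead-band width are invented values, and a dead band is only one simple way to get hysteresis; the article does not say which function or smoothing Uber uses.

```python
import bisect

# Hypothetical (ratio, multiplier) breakpoints, linearly interpolated.
BREAKPOINTS = [(1.0, 1.0), (2.0, 1.5), (4.0, 2.5)]


def raw_multiplier(ratio: float, breakpoints=BREAKPOINTS) -> float:
    """Piecewise-linear map from demand/supply ratio to a multiplier."""
    xs = [x for x, _ in breakpoints]
    if ratio <= xs[0]:
        return breakpoints[0][1]
    if ratio >= xs[-1]:
        return breakpoints[-1][1]
    i = bisect.bisect_right(xs, ratio)
    (x0, y0), (x1, y1) = breakpoints[i - 1], breakpoints[i]
    return y0 + (y1 - y0) * (ratio - x0) / (x1 - x0)


def next_multiplier(ratio: float, previous: float,
                    cap: float = 2.0, deadband: float = 0.25) -> float:
    """Apply the cap, then keep the previous value unless the target
    has moved by at least `deadband` (a simple form of hysteresis)."""
    target = min(raw_multiplier(ratio), cap)
    if abs(target - previous) < deadband:
        return previous
    return target


m1 = next_multiplier(ratio=10.0, previous=1.0)  # jumps up, limited by cap
m2 = next_multiplier(ratio=2.8, previous=m1)    # small change is ignored
m3 = next_multiplier(ratio=1.0, previous=m2)    # large drop goes through
```

In this run the multiplier rises to the cap, holds steady when the ratio dips only slightly, and falls back once demand has clearly eased, which is the no-oscillation behavior the text describes.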

Consistency vs. latency trade-off

Uber explicitly accepts stale surge data in the pricing display layer. When a rider opens the app, they see a surge multiplier that was computed up to a few seconds ago. This is a deliberate availability-over-consistency choice: waiting for the freshest possible surge read would add latency to every app open, and the difference in multiplier is rarely meaningful. The surge shown at request time is guaranteed, though: it's locked at dispatch, so the rider can't be surprised by a higher charge.
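One way to picture the "stale display, guaranteed charge" split is to copy the multiplier into an immutable quote when the trip is created, so later updates to the store cannot change the fare. The `SurgeStore`, `Quote`, and `lock_quote` names below are hypothetical and stand in for whatever Uber's read store and trip records actually look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Immutable snapshot of the multiplier a rider was shown."""
    cell_id: str
    multiplier: float


class SurgeStore:
    """Hypothetical read store; values may lag the stream by seconds."""

    def __init__(self) -> None:
        self._multipliers: dict[str, float] = {}

    def publish(self, cell_id: str, multiplier: float) -> None:
        self._multipliers[cell_id] = multiplier

    def read(self, cell_id: str) -> float:
        # Assumption: cells with no published value are not surging.
        return self._multipliers.get(cell_id, 1.0)


def lock_quote(store: SurgeStore, cell_id: str) -> Quote:
    # Copy the value out of the store so later publishes cannot alter it.
    return Quote(cell_id, store.read(cell_id))


def fare(base_fare: float, quote: Quote) -> float:
    return round(base_fare * quote.multiplier, 2)


store = SurgeStore()
store.publish("cell_x", 1.5)
quote = lock_quote(store, "cell_x")   # rider requests at 1.5x
store.publish("cell_x", 2.0)          # surge rises afterwards
charged = fare(10.0, quote)           # still priced at the locked 1.5x
```

The display path can read `store.read(...)` as often and as loosely as it likes; only the locked `Quote` feeds the charge.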

Lessons for your systems

Surge pricing is a beautiful example of a system where the right data model (hex cells as independent units) enables massive parallelism, and where explicit consistency trade-offs (stale display, guaranteed charge) make the system tractable. When you design real-time aggregation: pick your unit of computation carefully, partition your event stream by that unit, and write out the results in a way that reads cheaply. The consistency questions aren't implementation details. They're the core design decision.
