Reqflow
Performance·3 min read

Vertical vs Horizontal Scaling

Buy a bigger machine (simple, but capped and a single point of failure) or add more machines (no single-machine ceiling, but now you're running a distributed system).

There is no universal winner here, only the right fit for a given situation. Different workloads push the decision different ways, which is exactly how this tradeoff shows up in real design questions.

First time reading this? Start here

Plain English: vertical scaling means making one server bigger (more CPU/RAM). Horizontal scaling means adding more servers. Vertical is dead simple but hits a ceiling and is a single point of failure; horizontal has no fixed ceiling but forces you to handle distribution.

What it is

Two ways to add capacity. Vertical scaling (scaling up) means giving a single node more resources: more CPU cores, RAM, faster disks. Horizontal scaling (scaling out) means adding more nodes and spreading load across them via a load balancer or partitioning scheme.
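
As a minimal sketch of the partitioning option, assuming string keys and a fixed set of nodes, a stable hash can map each key to exactly one node. The function name `node_for` and the node names are hypothetical, and hash-modulo is only one possible scheme: it remaps most keys whenever the node count changes.

```python
import hashlib


def node_for(key: str, nodes: list[str]) -> str:
    """Pick the node that owns `key` using a stable hash modulo the node count."""
    if not nodes:
        raise ValueError("at least one node is required")
    # hashlib produces the same digest in every process, unlike Python's
    # built-in hash(), which is salted per process for str keys.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
owner = node_for("user:42", nodes)      # the same key always maps to the same node
```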

The problem it solves

Your service is out of capacity. Vertical scaling is the answer that needs no code change (resize the box, done), and it's the right first move because it requires zero architectural change. But every machine has a ceiling, and the biggest boxes cost disproportionately more per unit of compute. Horizontal scaling removes the single-machine ceiling, at the price of making your app stateless and distribution-aware.

How it works

Vertical: stop the instance, pick a bigger instance type, and restart, or hot-add resources where supported. The limit is the largest available hardware. Horizontal: run N identical stateless instances behind a load balancer; for stateful systems, shard the data across nodes (see sharding) and replicate for durability. Auto-scaling automates horizontal scaling by adding or removing instances in response to load.
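
The horizontal path can be sketched roughly as follows. This is an illustration under stated assumptions, not a production load balancer: `Instance`, `RoundRobinBalancer`, and `desired_instances` are hypothetical names, round-robin is just one balancing policy, and the proportional sizing rule is one common way an auto-scaler might pick an instance count from average utilization.

```python
import math
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    name: str             # hypothetical instance identifier
    healthy: bool = True  # a real system would set this from health checks


class RoundRobinBalancer:
    """Cycles requests across whichever instances are currently healthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._requests = count()  # monotonically increasing request counter

    def pick(self) -> Instance:
        healthy = [i for i in self.instances if i.healthy]
        if not healthy:
            raise RuntimeError("no healthy instances available")
        return healthy[next(self._requests) % len(healthy)]


def desired_instances(current, observed_util, target_util, min_n=1, max_n=100):
    """Proportional rule: size the pool so average utilization nears the target."""
    if target_util <= 0:
        raise ValueError("target_util must be positive")
    wanted = math.ceil(current * observed_util / target_util)
    return max(min_n, min(max_n, wanted))  # clamp to the allowed pool size


pool = [Instance("app-1"), Instance("app-2"), Instance("app-3")]
lb = RoundRobinBalancer(pool)
first_three = [lb.pick().name for _ in range(3)]

pool[1].healthy = False  # simulate one node failing its health check
after_failure = {lb.pick().name for _ in range(4)}  # traffic skips app-2

scale_out = desired_instances(current=4, observed_util=0.75, target_util=0.5)
scale_in = desired_instances(current=6, observed_util=0.25, target_util=0.5)
```

Because the instances hold no state of their own, any healthy one can serve any request, which is what lets the pool grow or shrink without coordination between nodes.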

Why use it

  • Vertical: zero code change, no distribution problems, simplest possible operation
  • Horizontal: effectively unlimited capacity, just keep adding nodes
  • Horizontal: redundancy comes with the design, since one of N nodes dying isn't an outage as long as the rest can absorb its load
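
To put a rough number on the redundancy point, here is a back-of-the-envelope sketch. It assumes node failures are independent and that any single surviving node can carry the load; both are simplifications, since real outages are often correlated. The function name `pool_availability` is hypothetical.

```python
def pool_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent nodes is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The pool is down only if every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


single = pool_availability(0.99, 1)  # one big box: same as the node itself
triple = pool_availability(0.99, 3)  # three nodes: all three must fail at once
```

Under these assumptions, going from one node to three moves the chance of total outage from 1 in 100 to 1 in 1,000,000, which is why a single box cannot match a pool on redundancy no matter how large it is.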

What it costs you

  • Vertical: hard hardware ceiling, super-linear cost at the top end, and the big box is a single point of failure
  • Horizontal: requires stateless app design or sharding, so you've now opted into a distributed system
  • Horizontal: needs a load balancer, health checks, and coordination, so operational complexity goes up

Where it shows up in our architectures

  • Stock Exchange (Matching Engine)

    Scales vertically by design: the per-symbol matching engine is single-threaded and in-memory, so a bigger, faster box is the only lever for that hot path

  • URL Shortener

    Stateless read/write services scale horizontally behind a load balancer; Postgres scales vertically first, then with read replicas

  • Instagram Feed

    Feed-serving tier scales out horizontally; the Redis timeline cache scales out via sharding

Gotchas

  • Scale vertically first. It buys time with zero architectural change. Only go horizontal when you hit the box ceiling or need redundancy you can't get from one machine.
  • A single big box is a single point of failure no matter how reliable the hardware. Vertical scaling alone gives you zero redundancy.
  • Horizontal scaling only works if the tier is stateless (or sharded). If instances hold per-user state in memory, you've imported sticky-session problems.
  • Cost is super-linear at the top of the vertical curve: the largest machines can cost disproportionately more per unit of compute than mid-tier ones.
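
The statelessness gotcha can be illustrated with a small sketch. All class names here are hypothetical, and a plain dict stands in for an external shared store (such as a cache or database) that a real deployment would use.

```python
class StatefulInstance:
    """Anti-pattern: per-user state lives in this one process's memory."""

    def __init__(self):
        self.carts = {}

    def add_item(self, user, item):
        self.carts.setdefault(user, []).append(item)

    def get_cart(self, user):
        return list(self.carts.get(user, []))


class StatelessInstance:
    """Keeps no state of its own; reads and writes go to a shared store."""

    def __init__(self, store):
        self.store = store  # stand-in for an external cache or database

    def add_item(self, user, item):
        self.store.setdefault(user, []).append(item)

    def get_cart(self, user):
        return list(self.store.get(user, []))


# Two stateful instances: the write lands on one, the read on the other.
a, b = StatefulInstance(), StatefulInstance()
a.add_item("alice", "book")
lost_cart = b.get_cart("alice")  # empty: b never saw alice's write

# Two stateless instances sharing one store: any instance can serve any request.
shared_store = {}
c, d = StatelessInstance(shared_store), StatelessInstance(shared_store)
c.add_item("alice", "book")
found_cart = d.get_cart("alice")  # contains "book"
```

With the stateful version, the only way to get consistent reads is to pin each user to one instance (sticky sessions), which undermines both even load spreading and failover.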

Interview angle

Vertical vs horizontal scaling is a framing question. When an interviewer asks how you'd handle 10x growth, the right answer starts with 'scale vertically first because it's zero code change, then move to horizontal when you hit the hardware ceiling or need redundancy.' The thing they're checking is whether you know horizontal scaling only works when the service is stateless or its state is partitioned. Candidates lose points by jumping straight to 'add more servers' without addressing session state or data partitioning.
