Operations·3 min read

Observability (Logs, Metrics, Traces)

The three pillars (logs, metrics, traces) that let you ask questions you didn't anticipate about a live system.

Plain English: observability is being able to figure out what's wrong with a running system from the outside. Metrics tell you something is broken (latency spiked), logs tell you what happened (this error), and traces tell you where in the chain of services it happened. You need all three.

Used in: Netflix, Payment Gateway, Uber

What it is

The property of being able to understand a system's internal state from its external outputs. It rests on three pillars: metrics (numeric time-series like request rate, error rate, latency percentiles), logs (timestamped event records, ideally structured), and traces (the path of a single request across many services). The goal is to answer questions you didn't pre-define, not just watch fixed dashboards.
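
To make the metrics pillar concrete, here is a minimal sketch that derives request rate, error rate, and latency percentiles from a handful of raw request records. The records, the `summarize` helper, and the 5-second window are all hypothetical. Production systems usually aggregate into histograms (Prometheus-style) instead of keeping every sample, so treat this as an illustration of what the numbers mean, not of how a metrics library computes them.

```python
from statistics import quantiles

# Hypothetical raw request records: (timestamp_seconds, latency_ms, http_status)
requests = [
    (0.0, 12.0, 200),
    (0.5, 15.0, 200),
    (1.2, 230.0, 500),
    (1.9, 18.0, 200),
    (2.4, 22.0, 200),
    (3.1, 950.0, 503),
    (3.8, 14.0, 200),
    (4.6, 17.0, 200),
]


def summarize(records, window_seconds):
    """Reduce raw requests to the metrics named above (illustrative only)."""
    latencies = [latency for _, latency, _ in records]
    # Count 5xx responses as errors (an assumption; definitions vary by team).
    errors = sum(1 for _, _, status in records if status >= 500)
    # 99 cut points; cuts[49] is the median, cuts[98] the 99th percentile.
    cuts = quantiles(latencies, n=100, method="inclusive")
    return {
        "request_rate_per_s": len(records) / window_seconds,
        "error_rate": errors / len(records),
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
    }


summary = summarize(requests, window_seconds=5.0)
print(summary)
```

Note how the median stays low while the 99th percentile is dominated by the two slow, failing requests; this is why alerts tend to target percentiles and not averages.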

The problem it solves

In a distributed system, a user-facing slowdown could come from any of dozens of services, a database, a cache, or the network. Without observability you're guessing. Metrics tell you something is wrong and alert you; logs tell you what specifically happened; traces tell you where in the request path the time or error originated. Monitoring answers known questions ('is CPU high?'); observability lets you explore unknown ones ('why are checkouts from this region slow only on Tuesdays?').

How it works

Metrics: services emit counters/gauges/histograms (Prometheus-style); you aggregate and alert on rates and percentiles.

Logs: emit structured (JSON) events with a correlation/trace ID, ship them to a central store (ELK, Loki) for search.

Traces: propagate a trace ID across every service hop (via headers); each service records spans with timing; a tracing backend (Jaeger, Tempo) reconstructs the full request waterfall.

The three are tied together by shared IDs so you can pivot from a spiking metric to the logs and the trace behind it.
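
The shared-ID flow described above can be sketched with the standard library alone. Everything named here is an assumption for illustration: the `X-Trace-Id` header (real deployments often use the W3C `traceparent` header), the two toy services, and the in-memory lists standing in for a log store and a tracing backend. Real tracers also record span IDs and parent links, which this sketch omits.

```python
import json
import time
import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name

spans = []      # stand-in for a tracing backend such as Jaeger or Tempo
log_lines = []  # stand-in for a central log store such as ELK or Loki


def log_event(service, trace_id, message, **fields):
    """Emit one structured (JSON) log line that carries the trace ID."""
    record = {"ts": time.time(), "service": service,
              "trace_id": trace_id, "msg": message, **fields}
    log_lines.append(json.dumps(record))


def record_span(service, trace_id, name, start, end):
    """Record how long one unit of work took within a trace."""
    spans.append({"trace_id": trace_id, "service": service, "name": name,
                  "duration_ms": (end - start) * 1000.0})


def payment_service(headers):
    # Downstream hop: reuse the caller's trace ID taken from the headers.
    trace_id = headers[TRACE_HEADER]
    start = time.perf_counter()
    log_event("payment", trace_id, "charge failed", error="card_declined")
    record_span("payment", trace_id, "charge", start, time.perf_counter())
    return {"status": 402}


def checkout_service(headers):
    # Entry point: accept an incoming trace ID or start a new trace.
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    start = time.perf_counter()
    log_event("checkout", trace_id, "checkout started")
    response = payment_service({TRACE_HEADER: trace_id})  # propagate the ID
    record_span("checkout", trace_id, "checkout", start, time.perf_counter())
    return trace_id, response


trace_id, response = checkout_service({})

# Pivot: starting from one trace ID, pull the logs and spans behind it.
parsed_logs = [json.loads(line) for line in log_lines]
related_logs = [rec for rec in parsed_logs if rec["trace_id"] == trace_id]
related_spans = [span for span in spans if span["trace_id"] == trace_id]
```

Because every log line and span carries the same `trace_id`, a single filter is enough to collect the evidence for one request across both services.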

