What Is a Load Balancer?
What It Is
A load balancer is a traffic director that routes incoming requests across multiple backend servers. It keeps any single server from drowning under demand.
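It is a gateway that sits at the edge of a fleet, making a single decision for every request: which machine handles this? The decision is simple, but the balancer is rarely fully stateless: it tracks backend health and, for some rules, active connection counts.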
Why It Matters
Systems break at the edges. Not in the code. In the traffic.
A load balancer prevents one server from becoming a bottleneck and a single point of failure. It replaces the fragility of a lone server with the resilience of a distributed group.
The deeper point: who decides where traffic goes is a question of power. In a closed system, the decision is hidden in vendor magic. In an open system, the decision is transparent, deterministic, and auditable.
Load balancing is one of the foundational primitives of distributed systems. Without it, horizontal scaling is a fiction.
How It Works
Step 1. A request arrives.
A user hits an IP or a DNS name. That request lands on the load balancer first.
Step 2. The balancer picks a backend.
Using a rule, it selects one server from a pool. The rules are simple and explicit:
- Round-robin: Take turns. Server A, then B, then C, then A again.
- Least connections: Route to the server with the fewest active connections.
- Weighted round-robin: Server A gets 70% of traffic, Server B gets 30%.
- IP hash: The same client IP hits the same server for as long as the pool does not change. Useful for session affinity.
- Health-based: Do not send traffic to a server that is down or unhealthy. This is a filter applied on top of the other rules, not an alternative to them.
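The rules above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular balancer implements them. The function names, the backend names, and the use of SHA-256 for the IP hash are all assumptions made for the example.

```python
import hashlib
import itertools


def round_robin(backends):
    """Return an iterator that takes turns: A, B, C, A, ..."""
    return itertools.cycle(backends)


def least_connections(active):
    """Pick the backend with the fewest active connections.

    `active` maps backend name -> current connection count.
    Ties go to the name that sorts first, so the choice is reproducible.
    """
    return min(sorted(active), key=lambda name: active[name])


def weighted_round_robin(weights):
    """Return an iterator that yields backends in proportion to integer weights.

    Naive expansion: {"a": 7, "b": 3} yields "a" seven times, then "b" three times.
    """
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def ip_hash(client_ip, backends):
    """Map a client IP to a backend with a hash that is stable across processes."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def healthy_only(backends, health):
    """Drop backends that are not marked healthy before any rule runs."""
    return [b for b in backends if health.get(b, False)]
```

Two design notes. The IP hash uses `hashlib` rather than Python's built-in `hash()`, because string hashing is randomized per process by default and would break affinity across restarts. The weighted version sends traffic in bursts. A production balancer would typically interleave the picks.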
Step 3. The request forwards.
The balancer opens a connection to the chosen backend and pipes the request through.
Step 4. The response returns.
The backend replies. The balancer passes the response back to the client. The client does not know the backend exists.
Step 5. Health checks run continuously.
Every backend is probed on a cadence. If a server fails its health check, it is removed from the pool. If it recovers, it is reinstated. This is automatic. This is the mechanism that makes the system self-healing.
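The remove-and-reinstate loop can be sketched as a small state tracker. One assumption goes beyond the text: instead of acting on a single probe, this sketch waits for a run of consecutive failures (`fall`) or successes (`rise`) before changing pool membership, so one lost probe does not cause flapping. The class name and parameter names are hypothetical.

```python
class HealthTracker:
    """Tracks probe results and decides pool membership."""

    def __init__(self, backends, fall=3, rise=2):
        self.fall = fall  # consecutive failures before removal (assumed default)
        self.rise = rise  # consecutive successes before reinstatement (assumed default)
        self.in_pool = {b: True for b in backends}
        # Count of consecutive probe results that contradict the current state.
        self._streak = {b: 0 for b in backends}

    def record(self, backend, ok):
        """Feed one probe result; return True if the backend is in the pool afterwards."""
        currently_up = self.in_pool[backend]
        if ok == currently_up:
            # Result agrees with the current state: reset the streak.
            self._streak[backend] = 0
            return currently_up
        self._streak[backend] += 1
        threshold = self.rise if ok else self.fall
        if self._streak[backend] >= threshold:
            self.in_pool[backend] = ok
            self._streak[backend] = 0
        return self.in_pool[backend]

    def pool(self):
        """Backends currently eligible for traffic."""
        return [b for b, up in self.in_pool.items() if up]
```

Setting `fall=1` and `rise=1` reproduces the simpler behavior described above, where a single failed or passed check changes membership immediately.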
The Contract
The interface of a load balancer is formal and unambiguous.
Input: A request from a client.
Output: That request routed to a healthy backend, and the backend response returned to the client.
Invariants:
- No request is dropped unless every backend is down.
- A backend is removed from the pool if it fails its health check.
- A backend is restored to the pool if it passes its health check.
- The routing rule is deterministic and reproducible for the same inputs.
- The client is never aware of the backend.
Failure modes:
- If all backends are down, the balancer returns a 503.
- If a backend fails mid-request, the balancer retries on a different backend (if configured). Retries are only safe for idempotent requests.
- If the balancer itself is a single point of failure, the architecture is broken.
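The contract can be written down as one function. This is a sketch under stated assumptions: `send` is a hypothetical callable standing in for the real network call, `BackendError` is a hypothetical exception, and returning 502 when every attempt fails is a choice made for the example, since the contract above only specifies the 503 case.

```python
class BackendError(Exception):
    """Raised by a backend call that failed mid-request (hypothetical)."""


def handle(request, healthy_backends, send, max_attempts=2):
    """Route `request` to a healthy backend, or report why that was impossible.

    `send(backend, request)` returns a (status, body) tuple or raises BackendError.
    """
    if not healthy_backends:
        # Failure mode: every backend is down.
        return 503, "no healthy backends"
    last_error = None
    for backend in healthy_backends[:max_attempts]:
        try:
            return send(backend, request)
        except BackendError as exc:
            last_error = exc  # retry on the next backend
    # Assumption: report an upstream failure once the attempts are used up.
    return 502, f"all attempts failed: {last_error}"
```

The retry loop resends the same request to a second backend, so it should only be enabled for requests that are safe to repeat.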
Real Examples
NGINX reverse proxy with upstream.
You define an upstream block with three backends. NGINX distributes requests across them round-robin by default. The configuration is a flat file. The behavior is auditable.
Cloudflare Load Balancer.
Global traffic routing across data centers. The balancer makes a geographic decision: a user in London hits a server in London, not a server in Los Angeles. The health check is a synthetic HTTP probe sent at a configurable interval, for example every 15 seconds.
AWS Application Load Balancer (ALB), part of Elastic Load Balancing.
Layer 7 routing. The balancer inspects the HTTP path: /api/ goes to the API fleet. /static/ goes to the static fleet. Different rules, different backends, one entry point.
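Path-based routing reduces to a prefix lookup. The sketch below is an illustration only: the pool names are invented, and the longest-prefix-wins rule is an assumption of the example, not a description of how ALB evaluates its listener rules.

```python
# Hypothetical routing table: path prefix -> backend pool.
ROUTES = [
    ("/api/", ["api-1", "api-2"]),
    ("/static/", ["static-1"]),
]
DEFAULT_POOL = ["web-1"]


def pool_for(path, routes=ROUTES, default=DEFAULT_POOL):
    """Return the pool whose prefix matches `path`; the longest prefix wins."""
    matches = [(prefix, pool) for prefix, pool in routes if path.startswith(prefix)]
    if not matches:
        return default
    return max(matches, key=lambda match: len(match[0]))[1]
```

Once a pool is chosen, any of the selection rules from Step 2 can pick the individual backend inside it.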
Kubernetes Service.
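A Kubernetes Service with type: LoadBalancer provisions an external IP on a supporting cloud provider and routes traffic to matching pods. If a pod dies or stops reporting ready, the Service stops sending it traffic. The check that controls this is the readiness probe. The liveness probe only decides when a container is restarted.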
HAProxy on bare metal.
In high-frequency trading or telecommunications, HAProxy runs on a pair of physical machines with a floating virtual IP. Keepalived moves the IP to the standby machine if the active one fails. Failover is automatic. No operator has to intervene.
Common Mistakes
Treating the load balancer as invisible.
It is a machine. It has a config. It can be misconfigured. A bad rule routes all traffic to one server. A missing health check lets a dead server keep eating requests. Audit the balancer.
Ignoring the balancer as a single point of failure.
If you have one load balancer, you have one point of failure. If it dies, everything dies. Run two. Use a floating IP. Use DNS failover. Redundancy at the edge matters.
Session affinity without a session store.
Sticky routing by IP breaks whenever the client's IP changes or its server leaves the pool. The next request lands on a server that has never seen the login, and the user is logged out. Session affinity is a hack. Use a shared session store or a stateless token.
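A stateless token lets any backend verify a login without shared memory. Below is a minimal sketch using an HMAC signature from the Python standard library. The secret, the token layout, and the function names are assumptions for illustration. A real deployment would also need expiry and key rotation, which this sketch omits.

```python
import base64
import hashlib
import hmac

# Hypothetical secret, shared by every backend in the pool.
SECRET = b"replace-with-a-real-secret"


def issue_token(user_id, secret=SECRET):
    """Return 'payload.signature', both parts URL-safe base64."""
    payload = base64.urlsafe_b64encode(user_id.encode("utf-8"))
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (payload + b"." + base64.urlsafe_b64encode(signature)).decode("ascii")


def verify_token(token, secret=SECRET):
    """Return the user id if the signature checks out, else None."""
    try:
        payload, signature = token.encode("ascii").split(b".")
        expected = hmac.new(secret, payload, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(base64.urlsafe_b64decode(signature), expected):
            return None
        return base64.urlsafe_b64decode(payload).decode("utf-8")
    except (ValueError, UnicodeError):
        # Malformed token: wrong number of parts, bad base64, or non-ASCII input.
        return None
```

Because every backend holds the same secret, it no longer matters which server the balancer picks for the next request.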
Health checks that are too optimistic.
If the health check pings /health and the server returns 200 but is actually on fire, the balancer thinks the server is fine. Health checks must test the actual capacity to serve, not just the capacity to return 200.
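One way to make a health endpoint less optimistic is to have it run real dependency checks before answering. The sketch below assumes each check is a zero-argument callable supplied by the application. The function name and the check names in the usage example are hypothetical.

```python
def health_status(checks):
    """Run named dependency checks; return (status_code, report).

    `checks` maps a name to a zero-argument callable that returns a truthy
    value when the dependency is usable. An exception counts as a failure.
    """
    report = {}
    for name, check in checks.items():
        try:
            report[name] = bool(check())
        except Exception:
            report[name] = False
    status = 200 if all(report.values()) else 503
    return status, report
```

For example, `health_status({"database": can_query_db, "disk": has_free_space})` returns 503 as soon as either hypothetical check fails, so the balancer stops routing to a server that can answer HTTP but cannot do its job.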
Skipping connection draining during deploys.
If you deploy a new backend and kill the old one instantly, active requests are dropped. Connection draining waits for in-flight requests to finish before removing a server from the pool. This is not optional.
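Draining can be sketched as bookkeeping on in-flight requests. This is an illustration, not any balancer's actual mechanism. The class and method names are hypothetical, and the sketch omits the drain timeout that real balancers usually enforce.

```python
class DrainingPool:
    """Tracks in-flight requests so a backend can be removed gracefully."""

    def __init__(self, backends):
        self.in_flight = {b: 0 for b in backends}
        self.draining = set()

    def eligible(self):
        """Backends that may receive new requests."""
        return [b for b in self.in_flight if b not in self.draining]

    def start_request(self, backend):
        if backend in self.draining or backend not in self.in_flight:
            raise ValueError(f"{backend} is not accepting new requests")
        self.in_flight[backend] += 1

    def finish_request(self, backend):
        self.in_flight[backend] -= 1
        self._remove_if_idle(backend)

    def drain(self, backend):
        """Stop new traffic to `backend`; remove it once it is idle."""
        self.draining.add(backend)
        self._remove_if_idle(backend)

    def _remove_if_idle(self, backend):
        # A draining backend leaves the pool only when nothing is in flight.
        if backend in self.draining and self.in_flight[backend] == 0:
            del self.in_flight[backend]
            self.draining.discard(backend)
```

A deploy script would call `drain("old")` first, wait until the backend disappears from `in_flight`, and only then stop the old process.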
Connection to OIP
The Open Internet Protocol is built on three principles: open, deterministic, auditable.
A load balancer is the embodiment of all three.
Open: The routing rule is not a secret. It is in a config file. Any operator can read it, modify it, and understand why traffic flows the way it does.
Deterministic: The same request, under the same conditions, routes to the same backend. The rule is not probabilistic. It is not magic. It is code.
Auditable: Every routing decision is a log line. Every health check is a timestamp. Every backend failure and recovery is recorded. You can trace the behavior of the system over time without asking a vendor.
In an OIP system, the load balancer is not a vendor appliance. It is a contract, a set of rules, and a transparent decision engine that anyone can inspect, verify, and replace.
Connection to the Grain Philosophy
This protocol is part of the Open Inventory Protocol — a living system of self-describing voxels that serves the Grain philosophy. The OIP is the interface. The philosophy is the core.