Load Balancing Flashcards
Load Balancing
the process of distributing tasks over a set of computing nodes to improve the performance and reliability of the system. A load balancer can be a hardware or software system, and it has implications for security, user sessions, and caching.
Advantages of load balancing
- Enable horizontal scaling: spinning up multiple instances of a service only helps if a load balancer directs traffic efficiently across the cluster.
- Dynamic scaling: it’s possible to seamlessly add and remove servers to respond to load.
- Abstraction: the end user only needs to know the address of the load balancer, not the address of every server in the cluster.
- Throughput: by spreading requests across servers, availability and response time are less affected by spikes in overall traffic.
- Redundancy: distributing load over a cluster means no one server is a single point of failure. Note that the load balancer itself must also not become a single point of failure.
- Continuous deployment: it’s possible to roll out software updates without taking the whole service down, by using the load balancer to take out one machine at a time.
hardware vs. software load balancers
Hardware load balancers are specialized appliances with circuitry designed to perform load balancing tasks. They are generally very performant and very expensive. Hardware LBs are generally L4 LBs, because L7 decisions are more complex and need to be updated more often.
With a software load balancer, requests are simply routed to the load balancer server instead of an application server.
DNS load balancer
DNS load balancers integrate with the Domain Name System (DNS) infrastructure so that a client’s name lookup for a service (e.g., for “www.google.com”) returns a different IP address to each requester, drawn from a pool of back-end servers near that requester’s geographic location.
L4 load balancer
An L4 load balancer acts on information found in the network and transport layers of the request. This includes the protocol (TCP, UDP, etc.), source and destination IP addresses, and source and destination port numbers. The L4 LB doesn’t have access to the contents of the request, like application-level headers or the requested URL. As a result, routing decisions are based entirely on header information at L4 and below. For example, an L4 LB can route requests from the same IP address to the same server every time.
L7 load balancer
L7 LBs have access to the full information carried by the request at the application layer, and as a result they can make more sophisticated routing decisions. For example, they can route requests for video content to a pool of servers optimized for video, requests for static content to a different set of servers, etc. They can also route requests based on the user, so that the same user always lands on the same server (for session stickiness) or the same pool of servers (for performance reasons).
In order to look inside the request an L7 LB needs to handle decryption. This means the LB does TLS termination so the TCP connection ends at the load balancer. The benefit of this is it frees backend servers from having to perform decryption (a performance-intensive task) and from managing certificates. On the flipside, the load balancing layer becomes a more concentrated attack surface.
Static load balancing algorithm
Static algorithms function the same regardless of the state of the backend serving the requests.
Dynamic load balancing algorithm
dynamic algorithms take into account the state of the backend and consider system load when routing requests. Dynamic algorithms are more complex and entail more overhead, as they require communication between the LB and the back-end servers, but can more efficiently distribute requests.
Round Robin algorithm
Static. One of the simplest and most used. The load balancer maintains a list of available servers, and routes the first request to the first server, the second request to the second server, and so on. This works well if every server in the pool has roughly the same performance characteristics.
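A minimal sketch of round robin in Python (server names are hypothetical):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Rotate through the server list in order, one request per server."""
    def __init__(self, servers):
        self._cycle = cycle(servers)

    def pick(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._cycle)

lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [lb.pick() for _ in range(4)]
# → ['app-1', 'app-2', 'app-3', 'app-1']
```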
Weighted Round Robin algorithm
takes server characteristics into account, so that servers with more resources get proportionally more of the requests. The advantages of this algorithm are that it is simple, very fast, and easy to understand. The disadvantages are that the current load of each server is not taken into account, and that the relative computation cost of each request is also not taken into account.
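One simple way to sketch weighted round robin is to expand the server list by weight and rotate through it (server names and weights here are hypothetical):

```python
class WeightedRoundRobin:
    """Expand the server list by integer weight, then rotate through it."""
    def __init__(self, weighted_servers):
        # weighted_servers: list of (server, integer_weight) pairs.
        self._schedule = [s for s, w in weighted_servers for _ in range(w)]
        self._i = 0

    def pick(self):
        server = self._schedule[self._i]
        self._i = (self._i + 1) % len(self._schedule)
        return server

lb = WeightedRoundRobin([("big-1", 3), ("small-1", 1)])
# big-1 serves 3 of every 4 requests.
```

Note the expanded-list form sends consecutive requests to the same heavy server in bursts; production balancers typically use a "smooth" variant that interleaves servers while preserving the same proportions.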
Random algorithm
Static. another simple and popular approach. Requests are sent to random servers, which can be weighted by server capacity. Random works well for systems with a large number of requests because the law of large numbers means randomization will tend towards a uniform distribution. It also works well with running several LBs at the same time, because they don’t need to coordinate.
User IP hashing algorithm
Static. the same user always goes to the same server, typically by hashing the client’s IP address to choose a server. If the assigned server ever goes down, the distribution is rebalanced. In this way we get session stickiness for free, and some caching wins.
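A minimal IP-hashing sketch in Python (server names and addresses are hypothetical):

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical pool

def pick(client_ip: str) -> str:
    # Hash the client IP and map the digest onto the server list.
    # The same IP always produces the same index, giving stickiness.
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note that with a plain modulo, removing a server remaps most clients; consistent hashing limits that churn.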
URL hashing algorithm
Static. maps requests for the same content to the same server, which helps with server specialization and caching. Since URL hashing relies on the content of requests to make load balancing decisions, it can only be implemented in L7.
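The same hashing idea keyed on the URL instead of the client IP (a sketch with hypothetical cache servers):

```python
import hashlib

servers = ["cache-1", "cache-2", "cache-3"]  # hypothetical pool

def pick(url: str) -> str:
    # All requests for the same URL land on the same server, so each
    # server's cache holds its own slice of the content.
    h = int(hashlib.sha256(url.encode()).hexdigest(), 16)
    return servers[h % len(servers)]
```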
Least load algorithm
Dynamic. requests are sent to the server with the lowest load at the time of the request. Load can be measured with a variety of metrics, including number of active connections, amount of traffic, and request latency.
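A least-connections sketch in Python, using active connection count as the load metric (server names are hypothetical):

```python
class LeastConnections:
    """Route each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        # Choose the least-loaded server and count the new connection.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a request finishes and its connection closes.
        self.active[server] -= 1

lb = LeastConnections(["app-1", "app-2"])
```

This is the overhead the dynamic-algorithm card mentions: the balancer must track connection counts (or poll the backends) rather than route statelessly.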