L5 - Autoscaling 2/2 Flashcards
What are cooldown periods for policies?
- Only after the cooldown period has elapsed are threshold breaches handled again.
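A minimal sketch of the cooldown idea, assuming time is measured in seconds and a hypothetical `CooldownPolicy` class (not any cloud provider's API): a breach only triggers scaling if the last scaling action is at least one cooldown period in the past.

```python
class CooldownPolicy:
    """Ignores threshold breaches until the cooldown has elapsed."""

    def __init__(self, cooldown_seconds):
        self.cooldown_seconds = cooldown_seconds
        self.last_action_time = None  # time of the last scaling action

    def should_scale(self, breached, now):
        """Return True if a breach at time `now` (seconds) triggers scaling."""
        if not breached:
            return False
        in_cooldown = (
            self.last_action_time is not None
            and now - self.last_action_time < self.cooldown_seconds
        )
        if in_cooldown:
            return False  # breach is ignored during the cooldown
        self.last_action_time = now
        return True


policy = CooldownPolicy(cooldown_seconds=300)
# Breaches at t = 0, 100, 299, 300, 400 seconds.
decisions = [policy.should_scale(True, t) for t in (0, 100, 299, 300, 400)]
# decisions == [True, False, False, True, False]
```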
What does predictive autoscaling have to do with an autoscaling group?
It proactively sets the minimum capacity of the autoscaling group and thus adds resources ahead of demand.
What is a load balancer (LB)?
load balancer = single point of entry for requests, which it then distributes across the service instances
When is LB important?
In the case of multiple service instances.
Why do you need a LB to scale out?
Scaling out only works if all replicas are equally busy. LB is required to distribute requests.
goals of LB?
- efficient utilization of a set of resources
- exploit aggregated capacity of replicas to reduce response time and failure rate
- increase availability (LB performs health checks, restarts faulty replicas/instances)
- enables non-disruptive management (in case of provisioning and de-provisioning of resources)
Are there multi-layer LB?
Yes
How is LB implemented?
- instances are allocated to certain VMs by the load balancer
- the VMs are load balanced to servers
- the LBs on different levels should interact with each other and exchange information
What is the difference between static and dynamic LB?
- in static LB there is no feedback (e.g. weighted round robin)
- in dynamic LB there is feedback on the status of the servers
Dynamic LB Diagram
What are the two sub-categories of dynamic LB?
Distributed and non-distributed
What are the two sub-categories of distributed LB?
Cooperative and non-cooperative
What are the two sub-categories of non-distributed LB?
Central and semi-distributed
What is distributed LB?
Nodes collaborate
What is cooperative LB?
- nodes have the same goal (e.g. optimize memory)
What is non-cooperative LB?
- nodes have different goals
What is centralized LB?
- one central LB
What is semi-distributed LB?
Nodes are partitioned and one LB is responsible for each partition.
4 approaches for web applications
- round-robin DNS
- DNS delegation
- Client-side random sampling
- Server-side load balancing
Round-robin DNS
- domain name is mapped to multiple IP addresses
- IP addresses are given to clients in RR fashion
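A toy sketch of round-robin DNS, assuming a hypothetical in-memory `RoundRobinDNS` class and documentation-range addresses (`192.0.2.x`, `example.org`); a real name server is far more involved.

```python
from collections import deque


class RoundRobinDNS:
    """Maps a domain name to several IPs and hands them out in turn."""

    def __init__(self, records):
        self.records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self.records[name]
        answer = ips[0]
        ips.rotate(-1)  # the next client gets the next address
        return answer


dns = RoundRobinDNS({"example.org": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
answers = [dns.resolve("example.org") for _ in range(4)]
# answers == ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.1"]
```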
name server
Name servers work as a directory that translates domain names into IP addresses.
DNS delegation
- structure domain (e.g. tum.de) in two zones
- each zone has its own name server
- DNS request is forwarded to both zones
- the one resolving the address first wins the request
What is DNS?
The Domain Name System (DNS) is the phonebook of the Internet. Humans access information online through domain names, like nytimes.com or espn.com. Web browsers interact through Internet Protocol (IP) addresses. DNS translates domain names to IP addresses so browsers can load Internet resources.
Client-side random sampling
- client receives a list of IP addresses
- it randomly selects one to connect to
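Client-side random sampling can be sketched in a few lines. The function name `pick_server` and the addresses are illustrative assumptions; the seeded generator is only there to make the example repeatable.

```python
import random


def pick_server(addresses, rng=random):
    """Pick one address uniformly at random from the list the client received."""
    return rng.choice(addresses)


addresses = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
rng = random.Random(42)  # seeded only for a repeatable example
picks = [pick_server(addresses, rng) for _ in range(1000)]
```

Over many clients, each address is chosen about equally often, so the load spreads without any central coordinator.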
Server-side load balancing
LB receives requests at a given port and distributes them
What is session persistence in LB?
Session persistence ensures that a client remains connected to the same server throughout a session or period of time. Because load balancing may, by default, send users to a different server each time they connect, complicated or repeated requests can otherwise be slowed down.
stickiness/Session persistence = results in a “sticky session” between a user and a particular server. In this process, a load balancer uses logic to find an affinity between a specific network server and a client for the length of an entire session, defined by the amount of time a unique IP address stays on the site.
How does a LB create sticky sessions?
A load balancer creates sticky sessions by either tracking a user’s IP details or using a cookie to assign that user an identifying attribute. This allows the load balancer to use the tracking ID to route all of that user’s requests to a specific server throughout the session.
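A rough sketch of the cookie variant, assuming a hypothetical `StickyLoadBalancer` that assigns new sessions round-robin and remembers the mapping in memory. The returned session id stands in for the cookie value.

```python
import itertools
import uuid


class StickyLoadBalancer:
    """Routes every request of a session to the server chosen for its first request."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self._sessions = {}  # session id -> server

    def route(self, session_id=None):
        """Return (server, session_id); the id would be sent back as a cookie."""
        if session_id is None or session_id not in self._sessions:
            session_id = uuid.uuid4().hex  # new tracking id for this client
            self._sessions[session_id] = next(self._next_server)
        return self._sessions[session_id], session_id


lb = StickyLoadBalancer(["server-a", "server-b"])
first_server, cookie = lb.route()        # first request, no cookie yet
later_server, _ = lb.route(cookie)       # follow-up request with the cookie
# later_server is the same server as first_server
```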
3 classes of LB algorithms
class-aware
content-aware
client-aware
What is class-aware LB?
based on classifying requests into the classes: sensitive, best-effort, undesired
e.g. by port number (for instance, port 1 is sensitive)
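A small sketch of port-based classification. The port-to-class mapping and the pool names are made-up assumptions for illustration only.

```python
# Hypothetical port-to-class mapping; a real deployment defines its own.
PORT_CLASSES = {443: "sensitive", 80: "best-effort"}

# Hypothetical server pools per class; "undesired" traffic gets no pool.
POOLS = {"sensitive": ["s1", "s2"], "best-effort": ["b1"]}


def classify(port):
    """Return the request class for a destination port."""
    return PORT_CLASSES.get(port, "undesired")


def pool_for(port):
    """Return the pool serving this port's class, or None to reject the request."""
    return POOLS.get(classify(port))
```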
What is content-aware LB?
based on request content e.g. URL
e.g. directing similar requests to the same server to exploit access to the same information
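One possible way to send the same URL to the same server is to hash the request path. This is a sketch under that assumption; the function name `pick_server_for_url` is hypothetical.

```python
import hashlib


def pick_server_for_url(path, servers):
    """Map a URL path to a server so that equal paths always hit the same server."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["cache-1", "cache-2", "cache-3"]
a = pick_server_for_url("/videos/intro.mp4", servers)
b = pick_server_for_url("/videos/intro.mp4", servers)
# a == b, so that server can keep the content cached
```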
What is client-aware LB?
based on packet source
can improve performance in the same way as content-aware LB (requests from the same source reach the same server)
4 LB algorithms
- Round Robin (RR) and Weighted Round Robin
- Least connection and weighted least connection
- resource based
- weighted response time
Round Robin (RR) and Weighted Round Robin
RR is very simple and just allocates clients to servers sequentially. It works poorly if some clients occupy a server for a long time. → Use smart load balancing.
Processes are assigned to processors in circular order without any priority. This gives fast responses when the workload is distributed evenly among the processes. However, processes differ in how long they take, so some nodes might be heavily loaded while others remain under-utilized.
In weighted Round Robin, the weight represents the capability of the server.
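A simple sketch of weighted round robin, assuming integer weights and made-up server names. Plain round robin is the special case where every weight is 1.

```python
import itertools


def weighted_round_robin(weights):
    """Cycle through server names in proportion to their integer weights."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


schedule = weighted_round_robin({"big": 3, "small": 1})
first_eight = list(itertools.islice(schedule, 8))
# first_eight == ["big", "big", "big", "small", "big", "big", "big", "small"]
```

This naive expansion sends bursts to the heavier server; smoother interleavings exist but are not shown here.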
Least connection and weighted least connection
distributes to server with the least number of active connections
Checks which servers have the fewest connections open at the time and sends traffic to those servers. This assumes all connections require roughly equal processing power.
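A sketch of (weighted) least connection, assuming a hypothetical `LeastConnectionBalancer` that tracks open connections itself. Dividing by the weight for the weighted variant is one common choice, taken here as an assumption; with all weights equal to 1 it reduces to plain least connection.

```python
class LeastConnectionBalancer:
    """Sends each new connection to the server with the fewest active ones."""

    def __init__(self, weights):
        self.weights = dict(weights)  # server -> capacity weight
        self.active = {server: 0 for server in self.weights}

    def acquire(self):
        """Open a connection on the least-loaded server and return its name."""
        server = min(self.active, key=lambda s: self.active[s] / self.weights[s])
        self.active[server] += 1
        return server

    def release(self, server):
        """Close one connection on the given server."""
        self.active[server] -= 1


lb = LeastConnectionBalancer({"a": 1, "b": 1})
first = lb.acquire()   # "a" (tie, first server wins)
second = lb.acquire()  # "b"
lb.release(first)      # "a" is idle again
third = lb.acquire()   # "a"
```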
resource based LB algorithm
CPU load of the servers is taken into account
Distributes load based on what resources each server has available at the time. Specialized software (called an “agent”) running on each server measures that server’s available CPU and memory, and the load balancer queries the agent before distributing traffic to that server.
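A sketch of the selection step, assuming each agent reports free CPU and memory as fractions between 0 and 1. Scoring a server by its scarcest resource is an assumption made for this example, not a prescribed rule.

```python
def pick_by_resources(reports):
    """Pick the server whose scarcest resource has the most headroom."""

    def headroom(server):
        report = reports[server]
        return min(report["cpu_free"], report["mem_free"])

    return max(reports, key=headroom)


# Hypothetical agent reports gathered by the load balancer.
reports = {
    "a": {"cpu_free": 0.10, "mem_free": 0.80},
    "b": {"cpu_free": 0.50, "mem_free": 0.40},
    "c": {"cpu_free": 0.90, "mem_free": 0.05},
}
chosen = pick_by_resources(reports)
# chosen == "b"
```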
weighted response time LB algorithm
the response time for a health check is used to compute the weights
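One way to turn health-check response times into weights is to make them inversely proportional and normalize them to sum to 1. The exact formula is an assumption for this sketch; the function and server names are made up.

```python
def weights_from_response_times(response_times_ms):
    """Give faster servers larger weights; the weights sum to 1."""
    inverse = {server: 1.0 / t for server, t in response_times_ms.items()}
    total = sum(inverse.values())
    return {server: value / total for server, value in inverse.items()}


weights = weights_from_response_times({"fast": 50, "slow": 200})
# "fast" answers 4x quicker, so it gets about 0.8 and "slow" about 0.2
```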
AWS Elastic Load Balancing (ELB)
- Distributes incoming traffic across the instances in the Auto Scaling Group
- Can use load balancer metrics (request counts per target) for auto scaling
- Can use it for health checks (the elastic load balancer sends health check messages to instances to find out whether they are active)
Classic Load Balancer (old; 2009)
Distributes requests evenly across availability zones or evenly across all registered instances in the target group.
Application Load Balancer (VPC only, Application Layer)
Routes http requests based on contents to specific target groups
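The idea of content-based routing to target groups can be sketched with path-prefix rules. This is plain Python, not the AWS API; the rule list and target group names are hypothetical.

```python
# Hypothetical routing rules: (path prefix, target group), checked in order.
RULES = [("/api/", "api-targets"), ("/images/", "static-targets")]
DEFAULT_TARGET_GROUP = "web-targets"


def target_group_for(path):
    """Return the target group of the first rule whose prefix matches the path."""
    for prefix, group in RULES:
        if path.startswith(prefix):
            return group
    return DEFAULT_TARGET_GROUP
```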
Network Load Balancer (VPC only, Transport Layer)
Forwards TCP packets for a certain port to a target group.
Gateway Load Balancer (VPC only, Transport and Network Layer)
Forwards ingress traffic to network appliances, like intrusion detection or monitoring
Forwards response traffic from network appliances to target after inspection
Hypertext Transfer Protocol (HTTP) (from internet)
HTTP is a protocol for fetching resources such as HTML documents. It is used for exchanging data on the Web and is a client-server protocol, which means requests are initiated by the recipient, usually the Web browser.
HTTP contains specific instructions on how to read and process this data once it arrives.
When you type a URL into your web browser, you are sending an HTTP request to a web server. That server will then respond, again using the formatting of HTTP.
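To make the request/response formatting concrete, here is a small sketch that builds the text of an HTTP/1.1 GET request and reads the status line of a response. The helper names are made up, and no network connection is opened.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def parse_status_line(response_text):
    """Split the first response line into (version, status code, reason)."""
    status_line = response_text.split("\r\n", 1)[0]
    version, status, reason = status_line.split(" ", 2)
    return version, int(status), reason


request = build_get_request("example.org", "/index.html")
parsed = parse_status_line("HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n")
# parsed == ("HTTP/1.1", 404, "Not Found")
```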