LV 5 Flashcards
Why elastic workload?
Reduce over/under provisioning
Reduce costs
Increase customer satisfaction
Scalability limit
Web apps: parallelisation overhead; the sequential part dominates execution (Amdahl's law)
Transaction-based apps: contention on shared resources (e.g. a database)
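The limit from the sequential part is captured by Amdahl's law: with parallelisable fraction p and n resources, the achievable speedup is

```latex
S(n) = \frac{1}{(1-p) + \frac{p}{n}}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1-p}
```

So even with unlimited resources, the sequential fraction (1-p) caps the scalability.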
Scalability
Characteristic of an application to increase its capacity with the amount of resources
Capacity of application depends on
Available resource capacities
Application design
Elasticity
Dynamic adaptation of capacity to change in workload
No shutdown/restart required
Capacity planning in cloud
Possible due to dynamic resource management and the pay-per-use cost model
High elasticity
Vertical Scaling - Scaling up
Increase capacity of a single service instance by increasing its resources
(increase CPU time percentage, clock frequency, add more cores)
Advantage: no change in service required
Horizontal Scaling - Scaling out
Increase capacity of a service by creating more instances (copies of the service, with a load balancer on top)
Vertical scaling
Advantages: easy to replace a resource with a more powerful one
No application redesign
Disadvantages:
More powerful resource might be too expensive
Resource capacity is limited
Replacement of resource causes service interruption
Horizontal scaling: pros and cons
Pros: scaling through adding more resources
No requirement for more powerful hardware
Cons: more resources come with more management overhead
Requires a distributed software architecture
Long-term solution!
SLO
service level objective
Latency of requests
Failed request rate
Service availability
Auto scaling policy
Analyzer -> Scheduler -> scaling actions -> Executor -> cloud commands
Policy -> scheduler
Auto scaling approach: reactive
Detect under/overloaded service
Scale in/out or down/up according to policy
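The reactive approach above can be sketched as a simple threshold rule; the metric, thresholds, and instance bounds below are illustrative assumptions, not values from the lecture.

```python
# Minimal sketch of a reactive auto-scaling policy (illustrative only).
# Thresholds and limits are assumed example values.

def reactive_scaling_decision(cpu_utilization: float,
                              instances: int,
                              scale_out_threshold: float = 0.8,
                              scale_in_threshold: float = 0.3,
                              min_instances: int = 1,
                              max_instances: int = 10) -> int:
    """Return the new instance count based on average CPU utilization."""
    if cpu_utilization > scale_out_threshold and instances < max_instances:
        return instances + 1  # overloaded: scale out
    if cpu_utilization < scale_in_threshold and instances > min_instances:
        return instances - 1  # underloaded: scale in
    return instances          # within bounds: no action
```

A real autoscaler would additionally use cooldown periods to avoid oscillating between scale-in and scale-out.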
Autoscaling approaches: scheduled
Policy specifies scaling events
Apply scaling actions at appropriate time
Autoscaling approaches: predictive
Continuously predict future workload
If the workload changes, schedule scaling actions ahead of time
Goals: circumvent scaling latency, enable more time-consuming scaling decisions
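A minimal sketch of the predictive idea, using a moving-average forecast; the window size and per-instance capacity are assumed example values, not part of the lecture.

```python
import math

def predict_next_load(history, window=3):
    """Moving-average forecast of the next workload value (requests/s)."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

def instances_needed(predicted_load, capacity_per_instance=100.0):
    """Instances required so the predicted load fits the total capacity."""
    return max(1, math.ceil(predicted_load / capacity_per_instance))
```

Scheduling `instances_needed(predict_next_load(history))` before the load arrives is what hides the scaling latency.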
Resource centric auto scaler
Scaling actions modify resources
Services are implicitly adapted
Service centric auto scalers
Scaling actions modify service instances
Resources are implicitly adapted
AWS Reactive Autoscaling
Resource centric: scaling of VMs
AWS Scaling Policies
Target tracking scaling (specify target value, automatically adjust resources to meet target)
Simple scaling
Step scaling
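Target tracking can be approximated by a proportional rule: scale capacity by the ratio of the current metric to the target. This is a simplification for illustration, not the exact AWS algorithm.

```python
import math

def target_tracking_capacity(current_capacity: int,
                             current_metric: float,
                             target_metric: float) -> int:
    """New capacity so the per-instance metric moves toward the target
    (proportional sketch of target tracking scaling)."""
    return max(1, math.ceil(current_capacity * current_metric / target_metric))
```

E.g. at 90% average CPU with a 60% target, 4 instances grow to 6, bringing the per-instance load back near the target.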
AWS Predictive Autoscaling
Proactively determines the minimum size of the Auto Scaling group
Load balancing
Distributes requests among services
Scaling out works only if all replicas are kept equally busy
Load balancing: goals
Efficient utilization of set of resources
Increase availability
Reduce response time and failure rates
Static vs dynamic load balancing
Static: no feedback from server
E.g. round robin
Dynamic: feedback on the status
Dynamic load balancing
Distributed: shifts work between different nodes
Cooperative: nodes share the same goal (e.g. optimize memory workload)
Non-cooperative: nodes have different goals (e.g. optimize CPU workload)
Non-distributed:
Centralized: one central LB
Semi-centralized: nodes are partitioned, with one LB responsible per partition
Approaches for web apps
Round robin dns
DNS Delegation
Client-side random sampling
Server-side load balancing
Classes of LB Algorithms
Class-aware: classification of requests
Content-aware: request content
Client-aware: packet source
LB Algorithms
Round robin and weighted round robin
Least connection and weighted least connection
Resource based
Weighted response time
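Two of the algorithms above sketched side by side: round robin needs no server feedback (static), while least connection uses the current connection counts (dynamic). The server names are placeholders.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Static LB: cycles through servers without any feedback."""
    def __init__(self, servers):
        self._cycle = cycle(servers)

    def pick(self):
        return next(self._cycle)

def least_connections(active_connections: dict) -> str:
    """Dynamic LB: pick the server with the fewest active connections."""
    return min(active_connections, key=active_connections.get)
```

The weighted variants simply repeat each server in proportion to its weight (round robin) or divide the connection count by the weight (least connection).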
AWS Elastic load balancing
Distributes incoming traffic across the instances in the Auto Scaling group