C5 Flashcards
distributed system
consists of multiple independent computers that communicate with each other
e.g. cluster computers, interconnected network switches and routers, peer-to-peer systems, cloud computing
goal: be resilient to failures
horizontal scaling
adding more instances
- not as fast as vertical scaling
- if a full VM is not required, consider containers which can be spawned much quicker
vertical scaling
increasing instance resources
- fast, but limited resources available in host computer
- several cloud providers don’t support online modification of a VM’s resources
problems with scaling classic applications
session data: application layer often has state => make this tier stateless by pushing data into shared storage or cache
database tier: often hard to scale or unscalable (would need database cluster or distributed database)
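The stateless-tier idea can be sketched as follows; the dict stands in for a shared cache such as Redis or memcached, and all names are illustrative:

```python
# Minimal sketch of a stateless application tier: session state lives in a
# shared store (here a plain dict standing in for an external cache like
# Redis), so any instance can handle any request for any session.

shared_store = {}  # stand-in for a shared cache/storage service

def handle_request(session_id, increment):
    # Load session state from shared storage instead of instance-local memory.
    session = shared_store.get(session_id, {"count": 0})
    session["count"] += increment
    # Write it back so a different instance can serve the next request.
    shared_store[session_id] = session
    return session["count"]

print(handle_request("s1", 1))  # -> 1
print(handle_request("s1", 2))  # -> 3, regardless of which instance ran it
```

Because no instance keeps session state locally, instances can be added or removed freely and a load balancer can route each request anywhere.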
service-based architecture
develop multiple stand-alone services and build applications on top of this
- services interact over the network using pre-defined APIs
- no clear tier model, architecture is typically a graph of services
- all of these different services can be scaled
elasticity
the ability to grow and shrink our pool of instances on demand
MAPE for auto-scaling
Monitoring: monitor performance indicators such as request rate and CPU utilization, and select the right monitoring interval
Analysis: decide when to scale, e.g. scale proactively based on workload prediction
Planning: plan next scaling action, by what factor to scale, horizontal or vertical, what kind of resources
Execution: execute the plan through cloud API
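The MAPE phases can be sketched as a simple control loop; the thresholds, step size, and the print standing in for a cloud API call are all illustrative assumptions:

```python
# Hedged sketch of the Analysis/Planning/Execution phases of a MAPE
# auto-scaling loop. Thresholds and step sizes are invented for illustration;
# a real system would call the cloud provider's API in execute().

def analyze(cpu_util, low=0.3, high=0.7):
    """Analysis: map the monitored metric to a scaling decision."""
    if cpu_util > high:
        return "scale-out"
    if cpu_util < low:
        return "scale-in"
    return "no-op"

def plan(decision, current_instances, step=2, min_instances=1):
    """Planning: decide the target size (horizontal scaling, fixed step)."""
    if decision == "scale-out":
        return current_instances + step
    if decision == "scale-in":
        return max(min_instances, current_instances - step)
    return current_instances

def execute(target):
    # Execution: in practice, a cloud API call to set the instance count.
    print(f"setting instance count to {target}")
    return target

instances = 4
instances = execute(plan(analyze(0.85), instances))  # high load -> scale out to 6
```

Monitoring would feed `cpu_util` in periodically; the loop then repeats at the chosen monitoring interval.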
low-level metrics
- CPU utilization
- memory statistics
- network bandwidth in/out
cloud provider customers won’t have access to the hypervisor’s low-level statistics
high-level metrics
- request rate
- average response time
- session creation rate
- service time / request latency
not available to the cloud provider, since they are application-dependent
resource estimation
how many resources do we need?
need to determine the minimum amount of computing resources necessary to process a workload, and when and how to perform scaling
- rule-based: if metric reaches X, add N instances
- application profiling: determine saturation point of a single instance through benchmarking
- machine learning
- analytical modeling
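The application-profiling approach above can be sketched as a sizing calculation; the saturation point and headroom factor here are invented example values, not measured data:

```python
import math

# Sketch of resource estimation via application profiling: benchmarking a
# single instance yields its saturation point (requests/s before response
# time degrades). The numbers below are illustrative assumptions.

def min_instances(expected_rps, saturation_rps, headroom=0.8):
    """Size the pool so each instance stays below its saturation point."""
    usable_rps = saturation_rps * headroom  # keep a safety margin per instance
    return math.ceil(expected_rps / usable_rps)

# e.g. one instance saturates at 150 req/s; expected workload is 1000 req/s
print(min_instances(expected_rps=1000, saturation_rps=150))  # -> 9
```

A rule-based policy would then add or remove instances whenever the observed rate crosses the capacity implied by this estimate.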
load balancers
assign clients to different servers to balance the load
- originally hardware products, later in software
- LBaaS (Load Balancing as a Service) offered by cloud providers
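The basic assignment strategy can be sketched as round-robin; server names are placeholders and real load balancers also track health and load:

```python
import itertools

# Minimal round-robin load balancer sketch: requests are assigned to
# servers in rotation. Server names are illustrative placeholders.

class RoundRobinBalancer:
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        # Assign the next incoming request to the next server in rotation.
        return next(self._cycle)

lb = RoundRobinBalancer(["srv-a", "srv-b", "srv-c"])
print([lb.pick() for _ in range(4)])  # -> ['srv-a', 'srv-b', 'srv-c', 'srv-a']
```

Round-robin works well when the application tier is stateless; with server-side session state, sticky sessions or a shared session store are needed.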