L5 - Autoscaling 1/2 Flashcards by Paolo Oppelt

Why an elastic application?

reduce over/under-provisioning
reduce cost + increase customer satisfaction

What are 4 typical resources applications use?

CPU
Memory
Disk
Network

Dynamism for desktop apps on a laptop

changes within seconds; handled by thread scheduling

Dynamism for HPC with a cluster as a shared resource

changes over hours to days; handled by job scheduling

Dynamism for banking with mainframe

changes periodically (every day); handled by processor allocation

Dynamism for web and server clusters

highly dynamic, limited predictability

What is increased throughput?

Ability to handle more workload (requests) in the same time

What is decreased latency?

Individual requests are handled faster

Can we normally decrease latency or increase throughput for web-applications?

Normally we can only increase throughput
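
The difference between the two measures can be sketched with an idealized model. It assumes identical servers that each handle one request at a time and no queueing; the function names and the 0.2 s service time are hypothetical.

```python
def throughput(servers: int, service_time_s: float) -> float:
    """Requests per second that `servers` identical servers can complete."""
    return servers / service_time_s


def latency(service_time_s: float) -> float:
    """Time a single request takes; it does not depend on the server count."""
    return service_time_s


# Adding servers raises throughput, but each request still takes 0.2 s.
for n in (1, 2, 4):
    print(n, throughput(n, 0.2), latency(0.2))
```

In this model, scaling out only helps the total number of requests handled per second, which matches the answer on the card.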

Is there a scalability limit for throughput?

Yes, the curve converges to a certain limit in the long-run

Why is there a scalability limit for throughput?

overhead with parallelization
bottleneck: initiation of parallelization is a sequential process -> at a certain point, the sequential part dominates the execution (Amdahl’s law)
shared databases limit the load that can be processed
programming influences whether applications can scale
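
The limit named by Amdahl's law can be made concrete with a small sketch. The 10% sequential fraction is an assumed value for illustration, not a figure from the lecture.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Amdahl's law: speedup = 1 / (s + (1 - s) / p)."""
    s = sequential_fraction
    return 1.0 / (s + (1.0 - s) / processors)


# With s = 0.1 the speedup approaches, but never reaches, 1 / s = 10.
for p in (1, 2, 8, 64, 1024):
    print(p, round(amdahl_speedup(0.1, p), 2))
```

However many processors are added, the speedup stays below 1 / s, which is why the throughput curve converges.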

What is scalability of applications?

The ability of an application to increase its capacity (throughput)

What does the capacity of an application depend on?

available resource capacities
application design (whether the app is programmed for scalability)

What are scalability limits?

maximum application capacity
throughput can be limited by max resource capacities or application design

What happens when applications with poor scalability are scaled?

significant drop in efficiency

What is speedup?

speedup (p processors) = performance (p processors) / performance (1 processor)

e.g. for CPU, performance is measured in transactions per second

What is efficiency?

efficiency (p processors) = speedup (p processors) / p
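
Both definitions can be computed directly. The sketch below uses made-up throughput measurements (transactions per second) purely for illustration.

```python
def speedup(perf_p: float, perf_1: float) -> float:
    """performance(p processors) / performance(1 processor)."""
    return perf_p / perf_1


def efficiency(perf_p: float, perf_1: float, p: int) -> float:
    """speedup(p processors) / p."""
    return speedup(perf_p, perf_1) / p


# Hypothetical measurements: processors -> transactions per second.
measurements = {1: 100.0, 2: 190.0, 4: 340.0, 8: 520.0}

for p, tps in measurements.items():
    s = speedup(tps, measurements[1])
    e = efficiency(tps, measurements[1], p)
    print(p, round(s, 2), round(e, 2))
```

With these numbers the speedup keeps growing while the efficiency falls below 1 as processors are added.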

Does speedup scale linearly?

No. With one processor the efficiency is 1, but it drops as more processors are added.

What is parallel computing?

A form of computing in which many processors work simultaneously to provide high computational power and significantly reduce the total computation time.

What is elasticity?

dynamic adaptation of the capacity to a change in the workload
no shutdown/restart required
shrink capacity, if workload decreases
increase capacity, if workload increases

What is autoscaling?

Cloud computing feature that enables organizations to scale cloud services such as server capacities or VMs up or down automatically, based on defined situations such as traffic or utilization levels.

What is a backend service?

A service that is needed to answer requests that arrive at the frontend

What is vertical scaling (scaling up)?

Scale the server on which the service is running.
You can increase the capacity of a single service instance by increasing its resources:

increase CPU time percentage
increase clock frequency
add more cores
replace existing resources with more powerful ones

Pros of vertical scaling

easy to replace a resource with a more powerful one
it does not require a re-design of the application

Cons of vertical scaling

more powerful resources might be too expensive
resource capacity is limited
replacement of resources causes service interruption

What is horizontal scaling (scaling out)?

capacity increase of a service by creating more instances (assumption: each service instance comes with its own resources)

What are the pros of horizontal scaling?

no requirement for more powerful hardware
provides a long-term solution for scaling

What are the cons of horizontal scaling?

increased amount of resources comes with more management overhead
horizontal scaling requires a distributed software architecture

What is an auto-scaler?

System that determines how many servers (resources) are provided to the application. A monitor (e.g. Amazon CloudWatch) collects metrics from the servers and passes them to the auto-scaler.

What is the autoscaling policy about?

The set of rules the autoscaling system uses to decide when and by how much to adapt the amount of resources

3 autoscaling approaches

1. Reactive
2. Scheduled
3. Predictive

What is reactive autoscaling?

detect under/overloaded service
scale in/out or down/up according to policy
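
The detect-then-act cycle can be sketched as one decision step that is called for every new measurement. The thresholds (30% and 70% utilization), the instance bounds, and the function name are assumptions chosen for illustration.

```python
def reactive_step(instances: int, utilization: float,
                  low: float = 0.3, high: float = 0.7,
                  min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the new instance count after observing one utilization value."""
    if utilization > high:   # overloaded: scale out by one instance
        return min(instances + 1, max_instances)
    if utilization < low:    # underloaded: scale in by one instance
        return max(instances - 1, min_instances)
    return instances         # within bounds: no action


# Feed a hypothetical series of utilization measurements through the policy.
instances = 2
for u in [0.5, 0.8, 0.9, 0.6, 0.2, 0.1]:
    instances = reactive_step(instances, u)
    print(u, instances)
```

The policy only reacts after the load has already changed, so new capacity arrives with some delay.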

What is scheduled autoscaling?

policy specifies scaling events (time-stamped scaling actions)
apply scaling actions at the appropriate time
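
A scheduled policy can be sketched as a list of time-stamped events that is looked up against the current time. The dates, instance counts, and names below are hypothetical.

```python
from datetime import datetime

# Hypothetical policy: (time of the scaling event, desired number of instances).
SCHEDULE = [
    (datetime(2024, 1, 1, 8, 0), 6),    # scale out before business hours
    (datetime(2024, 1, 1, 18, 0), 2),   # scale in for the evening
]


def desired_instances(now: datetime, schedule, default: int = 1) -> int:
    """Return the count set by the latest event at or before `now`."""
    desired = default
    for when, count in sorted(schedule):
        if when <= now:
            desired = count
    return desired


print(desired_instances(datetime(2024, 1, 1, 12, 0), SCHEDULE))
```

No metrics are needed here; the approach only works when the workload pattern is known in advance.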

What is predictive autoscaling?

continuously predict future workloads
if workloads will change, schedule scaling actions ahead of time
lets you circumvent scaling latency and enables more time-consuming scaling decisions
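
As a rough sketch of the idea, the next workload can be forecast from recent observations and translated into a capacity that is provisioned ahead of time. The naive linear extrapolation and the per-instance capacity of 100 requests per second are assumptions; real predictors are usually more sophisticated.

```python
import math


def forecast_next(history: list[float]) -> float:
    """Naive forecast: extend the trend of the last two observations."""
    if len(history) < 2:
        return history[-1]
    return max(0.0, history[-1] + (history[-1] - history[-2]))


def instances_needed(requests_per_s: float, capacity_per_instance: float) -> int:
    """Smallest instance count that covers the predicted workload."""
    return max(1, math.ceil(requests_per_s / capacity_per_instance))


history = [100.0, 150.0, 200.0]      # hypothetical requests per second
predicted = forecast_next(history)
print(predicted, instances_needed(predicted, 100.0))
```

Because the scaling action is scheduled before the load arrives, the startup time of new instances no longer shows up as a period of overload.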

Two types of auto-scalers

resource-centric
service-centric

What is a resource-centric auto-scaler?

scaling actions modify resources
services are implicitly adapted

What is a service-centric auto-scaler?

scaling actions modify the number of service instances
resources are implicitly adapted

What is AWS reactive autoscaling?

resource-centric, scaling the number of VMs

What is the AWS Auto Scaling Group?

set of VMs with the same launch template
contains a collection of EC2 instances (virtual servers) that are treated as a logical grouping for the purpose of automatic scaling and management
can optionally have a load balancer; scaling out creates more instances from the launch template

AWS Scaling Policies

target tracking scaling
simple scaling
step scaling

What is target tracking scaling?

automatically adjusts resources to keep a metric at its target value
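
One way to picture target tracking is a proportional rule that resizes the group so the metric moves back toward its target. This is an illustrative assumption, not the algorithm AWS documents; the function name and values are hypothetical.

```python
import math


def target_tracking(current_instances: int, metric_value: float,
                    target_value: float) -> int:
    """Resize in proportion to how far the metric is from its target."""
    desired = math.ceil(current_instances * metric_value / target_value)
    return max(1, desired)


# 4 VMs at 75% average CPU with a 50% target: grow the group.
print(target_tracking(4, 75.0, 50.0))
```

The user only states the target; the size of each scaling action follows from the current metric value.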

What is simple scaling?

trigger based on: metric, threshold, condition (e.g. metric > threshold)
action: change the #VMs by a fixed number or a fixed percentage once the threshold is passed
e.g. we want a CPU load of 50%: if it is higher we scale out, and if it goes below 50% we scale in
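
A simple scaling rule can be sketched as a single threshold check with a fixed adjustment. Only the scale-out rule is shown (scale-in would be a second rule with the opposite condition); the names and the rounding up of percentage adjustments are assumptions.

```python
import math


def simple_scaling(instances: int, metric: float, threshold: float,
                   adjustment: int, percent: bool = False) -> int:
    """Scale out by a fixed number or percentage once metric > threshold."""
    if metric <= threshold:
        return instances  # condition not met: no action
    if percent:
        # Assumed: percentage adjustments round up and add at least one VM.
        return instances + max(1, math.ceil(instances * adjustment / 100))
    return instances + adjustment


print(simple_scaling(4, metric=60, threshold=50, adjustment=2))
print(simple_scaling(10, metric=60, threshold=50, adjustment=25, percent=True))
```

The adjustment is the same no matter how far the metric is above the threshold, which is the main difference from step scaling.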

What is step scaling?

depends on the amount of the breach
specify metric, threshold, and steps based on the amount of the breach:
0 to 10%: 0%
10 to 20%: 10%
20 to infinity%: 30%
0 to minus infinity%: 10%
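
The upper steps from the card can be sketched as a lookup table keyed by how far the metric exceeds the threshold. The below-threshold step is left out here, the breach is assumed to be measured in percentage points, and rounding the capacity change up is also an assumption.

```python
import math

# (lower bound of breach, upper bound of breach, capacity change in %)
STEPS = [
    (0.0, 10.0, 0.0),
    (10.0, 20.0, 10.0),
    (20.0, math.inf, 30.0),
]


def step_adjustment(metric: float, threshold: float, steps=STEPS) -> float:
    """Return the capacity change (in %) for the current breach."""
    breach = metric - threshold
    for lower, upper, change in steps:
        if lower <= breach < upper:
            return change
    return 0.0  # metric below threshold: not covered by this sketch


def apply_step(instances: int, metric: float, threshold: float) -> int:
    """Apply the percentage change, rounding up to whole instances."""
    change = step_adjustment(metric, threshold)
    return instances + math.ceil(instances * change / 100.0)


print(apply_step(10, metric=80.0, threshold=50.0))
```

A larger breach selects a larger step, so the group reacts more strongly to a sharp load increase than to a mild one.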

(43 cards)