L5 - Autoscaling 1/2 Flashcards
Why make an application elastic?
- reduce over/under-provisioning
- reduce cost + increase customer satisfaction
What are 4 typical resources applications use?
- CPU
- Memory
- Disk
- Network
Dynamism for desktop apps on the laptop
seconds, thread scheduling
Dynamism for HPC with a cluster as a shared resource
hours, days for job scheduling
Dynamism for banking with mainframe
periodically every day for processor allocation
Dynamism for web and server clusters
highly dynamic, limited predictability
What is increased throughput?
Ability to handle more workload (requests) in the same time
What is decreased latency?
Individual requests are handled faster
Can we normally decrease latency or increase throughput for web applications?
Normally we can only increase throughput
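A minimal sketch of why adding servers raises throughput but not per-request speed. It assumes an idealized model that is not in the original cards: identical servers, a fixed service time per request, and no queuing or coordination overhead. The function names are hypothetical.

```python
def throughput(servers: int, service_time_s: float) -> float:
    """Requests per second the whole system can complete (idealized)."""
    return servers / service_time_s


def latency(servers: int, service_time_s: float) -> float:
    """Time a single request takes; independent of the server count here."""
    return service_time_s


# Assumed example: each request needs 0.2 s of work on one server.
for n in (1, 2, 4):
    print(n, "servers:", throughput(n, 0.2), "req/s,", latency(n, 0.2), "s per request")
```

In this model, doubling the servers doubles the requests handled per second, while each individual request still takes the same 0.2 s.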
Is there a scalability limit for throughput?
Yes, the throughput curve converges to a certain limit in the long run
Why is there a scalability limit for throughput?
- overhead with parallelization
- bottleneck: initiating parallel work is itself a sequential step, so beyond a certain point the sequential part dominates the execution time (Amdahl’s law)
- shared databases limit the load that can be processed
- programming influences whether applications can scale
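The Amdahl's law bullet can be illustrated with a short sketch. It uses the standard formula speedup = 1 / (s + (1 - s) / p), where s is the sequential fraction of the work and p the number of processors. The function names and the 10% sequential fraction are assumptions chosen for illustration.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Theoretical speedup under Amdahl's law."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(serial_fraction: float) -> float:
    """Upper bound on speedup as the processor count grows without bound."""
    return float("inf") if serial_fraction == 0 else 1.0 / serial_fraction


# Assumed example: 10% of the work is sequential.
for p in (1, 2, 8, 64, 1024):
    print(p, "processors ->", round(amdahl_speedup(0.1, p), 2))
print("limit:", amdahl_limit(0.1))
```

With a 10% sequential part, the speedup can never exceed 10, no matter how many processors are added.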
What is scalability of applications?
The ability of an application to increase its capacity (throughput)
What does the capacity of an application depend on?
- available resource capacities
- application design (whether the app is programmed for scalability)
What are scalability limits?
- maximum application capacity
- throughput can be limited by max resource capacities or application design
What happens when applications with poor scalability are scaled?
- significant drop in efficiency
What is speedup?
performance (p processors) / performance (1 processor)
e.g. for CPU, performance measured in transactions per second
What is efficiency?
efficiency (p processors) = speedup (p processors) / p
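A small sketch of the two definitions, with performance expressed in transactions per second. The measurements (100 tps on 1 processor, 320 tps on 4) are made-up values for illustration, and the function names are hypothetical.

```python
def speedup(perf_p: float, perf_1: float) -> float:
    """Performance on p processors relative to performance on 1 processor."""
    return perf_p / perf_1


def efficiency(perf_p: float, perf_1: float, p: int) -> float:
    """Speedup per processor; 1.0 would mean perfect linear scaling."""
    return speedup(perf_p, perf_1) / p


# Hypothetical measurements in transactions per second.
tps_1 = 100.0   # 1 processor
tps_4 = 320.0   # 4 processors

print("speedup:", speedup(tps_4, tps_1))
print("efficiency:", efficiency(tps_4, tps_1, 4))
```

An efficiency well below 1.0 means a large share of the added processor capacity is not turning into extra throughput.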