Horizontal vs. Vertical Scaling Flashcards
What is vertical scaling?
Increasing the size of a single virtual machine, for example adding more memory or CPU
Limited by the largest machine option available
What is horizontal scaling?
Adding more virtual machines and spreading the work across them
Benefits:
1. Horizontal scaling has no limit, except maybe the cost
2. Increased high availability (can make use of multiple AZs)
3. Increased redundancy
4. Autoscaling options are available
What are the 3Ws of scaling?
- What do we scale? What resources are we going to scale? How do you define the template?
- Where do we scale? When applying the model, where does it go? Should we scale out databases, or web servers?
- When do we scale? How do we know we need more? CloudWatch alarms can tell us when it’s time to add more resources.
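The "when" question can be illustrated with a minimal Python sketch of an alarm-style check. This is not the CloudWatch API; the function name should_scale_out, the 70% threshold, and the three-period window are assumptions chosen for illustration.

```python
def should_scale_out(cpu_samples, threshold=70.0, periods=3):
    """Return True when the last `periods` samples all exceed `threshold`.

    Models the idea of an alarm that fires only after a sustained breach,
    so a single spike does not trigger scaling. (Hypothetical helper.)
    """
    if len(cpu_samples) < periods:
        return False  # not enough data to evaluate yet
    recent = cpu_samples[-periods:]
    return all(sample > threshold for sample in recent)
```

Requiring several consecutive breaches is one simple way to avoid reacting to momentary noise.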
What is a launch template?
A launch template specifies all of the needed settings to go into building out an EC2 instance.
It is a collection of settings that you can configure so you don’t have to walk through the EC2 wizard over and over.
What is the difference between templates and configurations?
Templates:
1. More than just autoscaling
2. Supports versioning
3. More granularity
4. AWS supported
Configurations:
1. Only for autoscaling
2. Immutable
3. Limited configuration options
4. Don’t use them
What settings are available in an EC2 launch template?
- AMI
- EC2 instance size
- Security groups
- Networking information
- User data
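As a rough sketch, the settings above could be gathered into a single request dictionary in Python. The key names follow the EC2 CreateLaunchTemplate request shape as it is commonly used with boto3, but check them against the current documentation. The helper name and every value (template name, AMI ID, security group ID, script) are placeholders, not real resources.

```python
import base64


def build_launch_template_request(name, ami_id, instance_type,
                                  security_group_ids, user_data_script):
    """Collect launch template settings into one request dictionary.

    Hypothetical helper: it only builds the dictionary and makes no AWS call.
    """
    # Launch template user data is supplied as base64-encoded text.
    encoded_user_data = base64.b64encode(
        user_data_script.encode("utf-8")
    ).decode("ascii")
    return {
        "LaunchTemplateName": name,
        "LaunchTemplateData": {
            "ImageId": ami_id,                             # AMI
            "InstanceType": instance_type,                 # instance size
            "SecurityGroupIds": list(security_group_ids),  # security groups
            "UserData": encoded_user_data,                 # bootstrap script
        },
    }


# Every value below is a placeholder chosen for illustration.
request = build_launch_template_request(
    name="web-server-template",
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    security_group_ids=["sg-0123456789abcdef0"],
    user_data_script="#!/bin/bash\necho hello > /tmp/hello.txt\n",
)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the EC2 client's create_launch_template call.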
If you define networking information in an EC2 launch template, what options won’t be available?
You can’t use the launch template in an auto-scaling group
What are auto-scaling groups in EC2?
A collection of EC2 instances that are treated as a single group for the purpose of scaling and management
What are the steps to setting up an auto-scaling group in EC2?
- Define your template: Define your EC2 launch template (launch template or launch configuration)
- Networking and purchasing: Pick your networking and purchasing options (using multiple AZs allows for high availability)
- ELB configuration: EC2 instances can be registered behind a load balancer; the auto-scaling group can be set to respect the load balancer health checks
- Set scaling policies: Minimum, maximum, and desired capacity need to be set to ensure you don’t have too many or too few resources
- Notifications: SNS can act as a notification tool, letting someone know when a scaling event occurs
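The first four steps can be sketched as one request dictionary in Python. The key names mirror the Auto Scaling CreateAutoScalingGroup request as used with boto3, to the best of my understanding. The helper name, the group and template names, the subnet IDs, and the target group ARN are all hypothetical. SNS notifications are configured through a separate call and are not shown.

```python
def build_asg_request(name, template_name, subnet_ids,
                      min_size, max_size, desired, target_group_arns=()):
    """Build an auto-scaling group request (no AWS call is made)."""
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min and max")
    request = {
        "AutoScalingGroupName": name,
        # Step 1: the template that describes each instance.
        "LaunchTemplate": {
            "LaunchTemplateName": template_name,
            "Version": "$Latest",
        },
        # Step 2: subnets in several AZs, given as a comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Step 4: capacity bounds.
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }
    if target_group_arns:
        # Step 3: register behind a load balancer and respect its health checks.
        request["TargetGroupARNs"] = list(target_group_arns)
        request["HealthCheckType"] = "ELB"
    return request


# Placeholder values for illustration only.
asg_request = build_asg_request(
    name="web-asg",
    template_name="web-server-template",
    subnet_ids=["subnet-aaa111", "subnet-bbb222"],
    min_size=2,
    max_size=6,
    desired=2,
    target_group_arns=["arn:aws:elasticloadbalancing:example"],
)
```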
What are the three auto-scaling restrictions?
- Minimum - the lowest number of EC2 instances you’ll ever have online (in practice, usually no lower than 2)
- Maximum - the highest number of EC2 instances you’ll ever provision
- Desired - how many instances do you want right now? It will never be lower than the minimum and never be higher than the maximum
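The relationship between the three values can be shown with a tiny Python sketch. It only illustrates the invariant (minimum <= desired <= maximum). The function name is made up, and how the service itself handles an out-of-range request is not modeled here.

```python
def clamp_desired(requested, minimum, maximum):
    """Keep a requested capacity inside the [minimum, maximum] range."""
    if minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    return max(minimum, min(requested, maximum))
```

For example, with a minimum of 2 and a maximum of 6, a request for 10 instances is held to 6 and a request for 0 is held to 2.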
How can you save money when using auto-scaling in EC2?
You can choose to use Spot Instances
What option in EC2 is integral to making an application highly available?
Auto-scaling
Remember to select answers that spread resources out over multiple AZs and utilize load balancers
What is EC2 instance warm-up in auto-scaling?
Gives new instances time to finish starting up before they are placed behind the load balancer, so they don’t fail the health check and get terminated prematurely
What is EC2 instance cool-down in auto-scaling?
Pauses auto-scaling for a set amount of time; helps to avoid runaway scaling events (default window is 5 minutes)
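A cool-down can be pictured as a simple gate that rejects new scaling activity until enough time has passed. The Python sketch below is a toy model, not the service's implementation. The class name is an assumption, and the 300-second default simply matches the 5-minute window mentioned above.

```python
class CooldownGate:
    """Allow a scaling activity only if the cool-down window has elapsed."""

    def __init__(self, cooldown_seconds=300):
        self.cooldown_seconds = cooldown_seconds
        self._last_activity = None  # time of the last allowed activity

    def try_scale(self, now):
        """Return True (and record the activity) if scaling is allowed.

        `now` is a timestamp in seconds from any consistent clock.
        """
        if (self._last_activity is not None
                and now - self._last_activity < self.cooldown_seconds):
            return False  # still cooling down, so skip this request
        self._last_activity = now
        return True
```

A request at time 0 is allowed, one at 120 seconds is rejected, and one at 300 seconds is allowed again.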
What does it mean to “avoid thrashing” in auto-scaling?
Create instances quickly and spin them down slowly
What are the three types of auto-scaling?
- Reactive scaling - you’re playing catch-up; once the load is there, you measure it and then determine if you need to create more resources
- Scheduled scaling - if you have a predictable workload, create a scaling event to get your resources ready to go before they are actually needed
- Predictive scaling - AWS uses machine learning to determine when you’ll need to scale; the models are reevaluated every 24 hours to create a forecast for the next 48 hours
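Scheduled scaling is the easiest of the three to sketch: capacity is looked up from a timetable rather than measured. The Python example below is a toy model under assumed values. The schedule (6 instances from 08:00 to 18:00, 2 otherwise) and all names are hypothetical.

```python
from datetime import datetime

# Hypothetical schedule entries: (start_hour, end_hour, desired_capacity)
SCHEDULE = [(8, 18, 6)]
DEFAULT_CAPACITY = 2


def scheduled_capacity(moment, schedule=SCHEDULE, default=DEFAULT_CAPACITY):
    """Return the planned capacity for the given datetime."""
    for start_hour, end_hour, capacity in schedule:
        if start_hour <= moment.hour < end_hour:
            return capacity
    return default  # outside every scheduled window
```

In practice the window would start a little before the expected load so that instances are ready when traffic arrives.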
If you have a legacy application that can’t have multiple instances running at the same time, but you want high availability, how would you set it up?
Create an auto-scaling group with a minimum, maximum and desired count of 1, and select multiple AZs that it can run in. If the instance is terminated, it will likely spin up in another AZ.
This is called a “steady-state group”.
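The recovery behavior can be pictured with a small Python sketch. It is a simplified model, not how the service actually selects a zone. The function name and AZ names are illustrative, and "impaired" is an assumed input.

```python
def pick_replacement_az(group_azs, impaired_azs=()):
    """Return the first AZ in the group that is not marked impaired.

    Models a 1/1/1 group: when the single instance is lost, a replacement
    is launched in any AZ of the group that is still usable.
    """
    for az in group_azs:
        if az not in impaired_azs:
            return az
    raise RuntimeError("no healthy Availability Zone available")
```

Listing more than one AZ is what gives the single instance somewhere else to go.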
If you are given a scenario where you are asked to pick a scaling solution to handle an increase in CPU for your application, what option would you recommend?
- Create an auto-scaling group with 5,000 instances, all of them r5.24xlarge
- Create a very gradual increase and decrease of the instance count using reasonably sized EC2 instances
The second option, because it is the more cost-reasonable one
What are the thoughts to consider when building auto-scaling policy in EC2?
- Scale out aggressively
- Scale in conservatively
- Provisioning - keep an eye on provisioning times; bake AMIs to minimize them
- Costs - Use EC2 Reserved Instances for the minimum count of EC2 instances; potentially use Spot Instances to scale with and then fall back to On-Demand
- CloudWatch is your #1 tool for alerting auto-scaling that you need more or less of something
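"Scale out aggressively, scale in conservatively" can be sketched as an asymmetric step rule in Python. The thresholds (70% and 30% CPU) and step sizes (add 4, remove 1) are assumptions for illustration, not recommended values.

```python
def next_capacity(current, cpu_percent, minimum, maximum,
                  high=70.0, low=30.0, scale_out_step=4, scale_in_step=1):
    """Return the next instance count using an asymmetric step rule."""
    if cpu_percent > high:
        target = current + scale_out_step   # add capacity quickly
    elif cpu_percent < low:
        target = current - scale_in_step    # remove capacity slowly
    else:
        target = current                    # inside the comfortable band
    # Never leave the configured bounds.
    return max(minimum, min(target, maximum))
```

The gap between the two thresholds, together with the smaller scale-in step, keeps the group from bouncing between sizes.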
What are the four types of scaling to adjust relational database performance?
- Vertical scaling - resizing the database from one instance size to another can improve performance
- Scaling storage - storage can be resized, but it can only go up, not down (Aurora automatically scales in 10 GB increments; the rest are manual)
- Read replicas - creating read-only copies of our data can help spread out the workload
- Aurora serverless - offload scaling to AWS; excels with unpredictable workloads
If you are given a scenario where you have a read-heavy workload, what option should you consider?
Read replicas
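How replicas spread a read-heavy workload can be sketched with a small router in Python. This is an illustrative pattern, not an AWS feature or library. The class name and endpoint names are hypothetical, and treating any statement that starts with SELECT as a read is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and writes to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas endlessly; fall back to primary if none exist.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        """Pick the endpoint that should run the given SQL statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

Because replicas are read-only copies, every write still goes to the primary; only the read traffic is spread out.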
If you are given a scenario with the option to refactor and switch database types to solve a scaling problem, would you select that option?
Yes, it is possible this could be the best option for the exam (even if in the real world it would be less likely)
Don’t be shy about picking an option that switches from a relational to a non-relational database, especially in favor of DynamoDB, because it is easier to scale and is more managed
If non-relational isn’t an option, consider Aurora heavily
What are the DynamoDB scaling options?
- Provisioned
Use Case: Generally predictable workload
Effort to Use: Need to review past usage to set upper and lower scaling bounds
Cost: Most cost-effective model
- On-Demand
Use Case: Sporadic workload
Effort to Use: Simply select “on-demand”
Cost: Pay a small amount of money per read and write; less cost-effective
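The cost trade-off for a steady workload can be worked through with a back-of-the-envelope Python sketch. The prices are made up for illustration and are not AWS prices. The model also assumes, as a simplification, that one provisioned capacity unit serves one request per second.

```python
def provisioned_monthly_cost(capacity_units, hours, price_per_unit_hour):
    """Cost of keeping a fixed number of capacity units for `hours`."""
    return capacity_units * hours * price_per_unit_hour


def on_demand_monthly_cost(total_requests, price_per_million):
    """Cost of paying per request, priced per million requests."""
    return (total_requests / 1_000_000) * price_per_million


HOURS_IN_MONTH = 720
UNIT_HOUR_PRICE = 0.001    # made-up price per capacity unit per hour
PER_MILLION_PRICE = 1.25   # made-up price per million requests

# A steady 100 requests per second for the whole month.
steady_requests = 100 * 3600 * HOURS_IN_MONTH
provisioned = provisioned_monthly_cost(100, HOURS_IN_MONTH, UNIT_HOUR_PRICE)
on_demand = on_demand_monthly_cost(steady_requests, PER_MILLION_PRICE)
```

With these illustrative numbers the provisioned figure comes out well below the on-demand figure, which matches the cards above: provisioned capacity pays off when the workload is predictable.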
When weighing cost vs. performance between two DynamoDB scaling options, which should you consider most?
Cost