Resilient Architecture Flashcards
Design HA and Fault Tolerant Systems
How do you autoscale in AWS?
- Set up an Auto Scaling group.
- Set up a load balancer.
- Configure Auto Scaling policies to respond to CloudWatch alarms.
What is the difference between HA and Fault Tolerance and DR?
HA aims to maximize uptime
- there can be minimal disruption to service but it is restored quickly
- an off-road vehicle carrying a spare tire encounters a flat
FT is to work through malfunctioning components in the system
- there typically cannot be any loss of functionality during component outage
- a plane in the air with engine failure uses redundant second engine
- a patient on an operating table connected to critical monitoring equipment that cannot stop functioning
- FT costs a lot to implement and is more complex in design than HA
DR deals with failures on a larger scale than HA or FT can handle
- human induced or natural
- entire system is compromised or lost
- typically solved by having a second physical location to take over, far away from disaster site
- backups should be stored off site for on-prem solutions
- determine what your RTO (Recovery Time Objective) and RPO (Recovery Point Objective) need to be for the use case
What is Route53?
A DNS service from AWS
- Registers domains; a global service with a single database
- Hosts Zone Files
- Managed Nameservers (NS), 4 per domain
- Liaises with the TLD registrar and provides the NS records that indicate where a particular domain resides (eg: )
- Zone files store record sets
How does DNS work? High level part 1
- Root Hints file on the DNS resolver (ISP provided) points to the 13 DNS Root servers where the Root Zone lies
- Root Zone is authoritative
- Root Zone is a DB of the top level domains (.com etc)
What are the different types of DNS records?
- “A” record points to the IPv4 of the server
- “CNAME” (canonical name) record is an alternate name that points to another name, such as one holding an “A” record, so several names resolve to the same IP (eg, ftp.google.com, mail.google.com)
- A CNAME can only point to another name, never directly to an IP address (exam question!)
- MX records: Point to the mail server for a specific domain
- TXT records: Arbitrary text, often used to prove domain ownership
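As a rough illustration of how a CNAME relates to an A record, the sketch below follows CNAME entries until it reaches an A record. The zone data, names and addresses are made up for the example, and `resolve` is a hypothetical helper, not a real DNS client.

```python
# Hypothetical zone data for illustration only.
RECORDS = {
    "www.example.com": ("CNAME", "web.example.com"),
    "ftp.example.com": ("CNAME", "web.example.com"),
    "web.example.com": ("A", "192.0.2.10"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until an A record is found; return its IPv4."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        # A CNAME's value is always another name, never an IP address.
        name = value
    raise RuntimeError("too many CNAME hops")
```

Both alternate names end up at the same address because they share one A record.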
What does TTL on a DNS record indicate?
TTL values indicate how long (in seconds) the resolver can cache the answer returned by the lookup, such as an IPv4 address, before it has to query again
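A minimal sketch of TTL-based caching, assuming a resolver that stores each answer with an expiry time. `TtlCache` and its injectable `clock` parameter are hypothetical names chosen for this example so the expiry can be exercised without waiting.

```python
import time


class TtlCache:
    """Caches DNS answers until their TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expires_at)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # TTL expired: drop the entry so the caller resolves again.
            del self._entries[name]
            return None
        return value
```

A lower TTL means changes propagate faster but the resolver queries more often.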
What is an ALB?
Application Load Balancer
- “Target” is a single compute resource
- “Target groups” are groups of targets
- Rules are evaluated to determine which target group to send requests to
- Rules are “path” based or “host” based
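The rule evaluation above can be sketched roughly as follows. This assumes rules are checked in order with the first match winning and a default target group as the fallback; the rule list, host names and target group names are invented for illustration and this is not the AWS API.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    {"host": "api.example.com", "path": "*", "target_group": "api-tg"},
    {"host": "*", "path": "/images/*", "target_group": "images-tg"},
]
DEFAULT_TARGET_GROUP = "web-tg"


def pick_target_group(host, path, rules=RULES):
    """Return the target group for a request's host header and URL path."""
    for rule in rules:
        host_matches = fnmatchcase(host, rule["host"])
        path_matches = fnmatchcase(path, rule["path"])
        if host_matches and path_matches:
            return rule["target_group"]
    return DEFAULT_TARGET_GROUP
```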
What are Launch configurations and Launch Templates?
- Templates came after Configurations
- Allow you to define the configuration of an EC2 instance in advance (AMI, instance type, networking, user data, attached IAM role etc)
- LTs have versions and are recommended over LCs
What is an ASG?
Auto Scaling Group
- automatic scaling for EC2
- uses the EC2 configuration within LTs or LCs
- 3 important values: Minimum, Desired and Maximum size (eg: 1:2:4)
- Provisions or terminates instances to keep the group at the Desired level
- Scaling policies based on Metrics
- Runs in a VPC across one or more Subnets
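A small sketch of how the three values interact, assuming the group simply compares running instances with the Desired value. `reconcile` is a hypothetical helper; clamping Desired into the Min/Max range is a simplification made for this example.

```python
def reconcile(running, minimum, desired, maximum):
    """Return instances to launch (positive) or terminate (negative)."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Desired capacity is kept within the Min/Max bounds.
    target = max(minimum, min(desired, maximum))
    return target - running
```

With the 1:2:4 example, one running instance leads to one launch, and three running instances lead to one termination.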
Types of Auto scaling?
- Manual
- Scheduled scaling based on time
- Dynamic scaling, in three forms:
  - simple scaling based on a metric, example: if CPU > 50% increase desired capacity, otherwise remove 1 from desired capacity
  - step scaling lets you define more detail: add 1 instance if CPU > 50%, add 3 instances if CPU > 80% (bigger or smaller steps), so it reacts more strongly to bigger changes; preferable to simple
  - target tracking: eg: keep aggregate CPU across all instances in the group at a desired 40%
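The step and target-based policies can be sketched as below, using the thresholds from the examples above. Both functions are hypothetical, and the proportional formula for the 40% target is only a rough approximation of the idea, not the algorithm AWS uses.

```python
import math


def step_adjustment(cpu_percent):
    """Instances to add for a given CPU level (thresholds from the notes)."""
    if cpu_percent > 80:
        return 3
    if cpu_percent > 50:
        return 1
    return 0


def target_tracking_capacity(current_capacity, current_cpu, target_cpu=40.0):
    """Estimate the capacity that would bring aggregate CPU near the target.

    Assumes load spreads evenly, so CPU scales inversely with capacity.
    """
    needed = math.ceil(current_capacity * current_cpu / target_cpu)
    return max(1, needed)
```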
What is a cooldown period?
EC2 has a minimum billing period, so launching and terminating instances too frequently can be costly
The cooldown period is how long the group waits after the last scaling action before it applies another one
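A minimal sketch of a cooldown check, assuming times are plain seconds. `can_scale` is a hypothetical helper and the 300 second default is an assumed value for the example.

```python
def can_scale(now, last_scaling_time, cooldown_seconds=300):
    """Return True if enough time has passed since the last scaling action."""
    if last_scaling_time is None:
        # No scaling action has happened yet, so nothing to wait for.
        return True
    return now - last_scaling_time >= cooldown_seconds
```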
What are NLBs?
Network Load balancer
Understands only TCP and UDP (layer 4), not HTTP(S)
Lower latency: ~100ms vs ~400ms for ALBs
Rapid scaling - millions of requests per second
1 interface with a static IP per AZ, can use EIPs
What is SSL Offload?
ELBs can handle SSL in 3 ways:
- Bridging - SSL is terminated on the LB, which needs an SSL cert matching the domain name; the LB then opens a new encrypted connection to the EC2 instances (the ALB decrypts and then re-encrypts when talking to the instances, so they also have to decrypt, which can be an overhead)
- Pass through - NLB usually uses this; it does not and cannot decrypt the data and passes it straight through to EC2, so AWS does not know what cert you use on the EC2 instance; still has admin and compute overhead on EC2
- SSL Offload - ELB has the cert, but no cert is needed on the EC2 instance since the backend connection is not HTTPS. Only the ELB decrypts, so there is no overhead on the EC2 instances
What is Session Stickiness?
If enabled, the LB generates a cookie called “AWSALB”
Duration is defined by you (1s to 7 days)
The LB sends requests to the same backend EC2 instance while the cookie is present and has not expired
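A rough sketch of cookie-based stickiness, assuming round-robin selection when no valid cookie is presented. `StickyBalancer` is a hypothetical class, and the cookie is modelled as a plain (target, expiry) tuple for readability, whereas a real load balancer cookie is opaque to the client.

```python
import itertools


class StickyBalancer:
    """Routes a client to the same target while its cookie is valid."""

    def __init__(self, targets, duration_seconds):
        self._round_robin = itertools.cycle(targets)
        self._duration = duration_seconds

    def route(self, cookie, now):
        """Return (target, cookie); cookie is None or (target, expires_at)."""
        if cookie is not None and now < cookie[1]:
            # Valid cookie: keep sending the client to the same target.
            return cookie[0], cookie
        # No cookie or expired: pick the next target and issue a new cookie.
        target = next(self._round_robin)
        return target, (target, now + self._duration)
```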
What is Boot time to service time?
Time required for AWS to provision the EC2 instance, plus software updates and installation within the OS; for AWS-provided AMIs this takes minutes.