Old 2017 AWS Well-Architected Framework Whitepaper - Reliability Flashcards
Reliability Pillar description
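The ability of a system to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues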
Reliability Design Principles
Test recovery procedures
Automatically recover from failure
Scale horizontally to increase aggregate system availability
Stop guessing capacity
Manage change in automation
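Why scaling horizontally raises availability can be shown with a little arithmetic. The sketch below is illustrative only: it assumes instance failures are independent (real deployments only approximate this) and that the group is available if at least one instance is up. The function name `aggregate_availability` is made up for this card.

```python
def aggregate_availability(instance_availability: float, instances: int) -> float:
    """Availability of a group that works while at least one instance is up.

    Assumption: instances fail independently of each other.
    """
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("instance_availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # The group is down only when every instance is down at the same time.
    return 1.0 - (1.0 - instance_availability) ** instances


# Compare one large instance with two or three smaller redundant ones.
for n in (1, 2, 3):
    print(n, aggregate_availability(0.99, n))
```

Under these assumptions, each extra redundant instance multiplies the chance of a total outage by the single-instance failure probability.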
Definitions / 3 Areas
Foundations
Change Management
Failure Management
Foundations
One of the first things to consider is the capacity of the communications link between HQ and the data center
Changing that capacity later (re-provisioning) can take months
AWS provides much of the foundations for you
AWS sets service limits to stop customers from accidentally over-provisioning resources
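One way to picture managing service limits is to compare current usage against each limit and flag anything close to the ceiling. This is a minimal sketch, not an AWS API: the function `limits_near_exhaustion`, the resource names, and the numbers are all hypothetical and do not reflect actual AWS default limits.

```python
def limits_near_exhaustion(usage, limits, threshold=0.8):
    """Return the resources whose usage is at or above threshold * limit."""
    flagged = []
    for name, limit in limits.items():
        used = usage.get(name, 0)  # treat missing usage data as zero
        if limit > 0 and used / limit >= threshold:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical figures for illustration only.
limits = {"ec2_instances": 20, "vpcs": 5, "elastic_ips": 5}
usage = {"ec2_instances": 17, "vpcs": 2, "elastic_ips": 5}
print(limits_near_exhaustion(usage, limits))
```

Flagged resources are the ones to raise a limit-increase request for before a deployment or scaling event needs them.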
Foundations Questions
How are you managing AWS service limits?
How are you planning your network topology?
Do you have an escalation path for technical issues?
Change Management best practices
Monitoring lets you detect changes
Amazon CloudWatch makes it easier to monitor your environment and trigger Auto Scaling
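The idea of monitoring driving scaling can be sketched as a simple threshold rule. This is an illustration of the concept only, not the CloudWatch or Auto Scaling API: the function `desired_capacity` and its default thresholds and bounds are assumptions chosen for the example.

```python
def desired_capacity(current, cpu_percent, scale_out_at=70.0, scale_in_at=30.0,
                     minimum=2, maximum=10):
    """Pick a new instance count from one monitored CPU reading."""
    if cpu_percent >= scale_out_at:
        target = current + 1   # load is high: add an instance
    elif cpu_percent <= scale_in_at:
        target = current - 1   # load is low: remove an instance
    else:
        target = current       # inside the comfortable band: no change
    # Never go below the minimum or above the maximum group size.
    return max(minimum, min(maximum, target))


print(desired_capacity(4, 85.0))
print(desired_capacity(4, 20.0))
```

Keeping a minimum above one instance ties back to the design principle of scaling horizontally, so that a single failure does not take the whole system down.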
Change management questions
How do you adapt to changes in demand?
How are you monitoring your AWS resources?
How are you executing change management?
Failure Management
Always architect on the assumption that failures will occur
Plan how to respond to failures and how to prevent them from recurring
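Designing for failure often means a caller retries a failed operation instead of giving up at once. The sketch below shows one common pattern, retry with exponential backoff. It is an assumption-laden illustration rather than anything prescribed by the whitepaper: `call_with_retries` and its parameters are hypothetical names.

```python
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on any exception with exponential backoff.

    The delay doubles after each failure: base_delay, 2x, 4x, ...
    The last failure is re-raised so the caller can escalate it.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function testable, because a test can record the delays instead of actually waiting.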
Failure management questions
How are you backing up your data?
How does your system withstand failures?
Are you planning for recovery?
Key AWS Services
Foundations: IAM, VPC
Change Management: AWS CloudTrail
Failure Management: AWS CloudFormation
Exam tips
Remember the 3 areas: foundations, change management, and failure management
Know the questions for each of the 3 areas