AZ-400 Cloud Guru Notes Flashcards
What is reliability?
Access app when & how expected
- Availability
- Performance
- Latency
- Security
How to measure reliability?
- Determine acceptable uptimes
Define SLA, SLO and SLI
Service Level Agreement - agreement around availability
Service Level Objective - goal service wants to reach to meet agreement
Service Level Indicator - metric measured to check whether the objective is being met
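A minimal sketch of how an SLI can be checked against an SLO, assuming availability is measured as the share of successful requests. The function names and the 30-day period are illustrative assumptions, not part of any Azure API.

```python
def availability_sli(successful_requests: int, total_requests: int) -> float:
    """SLI: percentage of requests that were served successfully."""
    if total_requests == 0:
        return 100.0  # no traffic in the window, so nothing failed
    return 100.0 * successful_requests / total_requests


def meets_slo(sli_percent: float, slo_percent: float) -> bool:
    """True when the measured indicator reaches the objective."""
    return sli_percent >= slo_percent


def allowed_downtime_minutes(slo_percent: float, period_days: int = 30) -> float:
    """Downtime budget implied by an uptime objective over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - slo_percent) / 100.0
```

For example, a 99.9% objective over 30 days leaves roughly 43 minutes of acceptable downtime, which is one way to "determine acceptable uptimes".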
What is Site Reliability Engineering?
Engineers working the full stack so same devs who wrote code fix issues and automate performance checks
Key concepts of SRE
Feedback - loops for continuous improvement
Measure everything - collect data on how everything is working
Alerts - use data to create alerts
Automation - automate as much as possible to reduce alerts
Small changes - fixes little and often
Risk - embrace risk to learn from problems
Azure Monitor
Central location to gather info from app, infrastructure and Azure resources
Azure Monitor collects?
Metrics - Data that measures how system is performing at a certain time
Logs - Messages about certain events in a system
Metrics usage
Visualise data in metrics explorer, save to dashboard/workbook
Analyse performance
Alert if resource reaches threshold
Automate action based on metrics e.g. autoscaling
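A rough sketch of a metric-driven autoscale decision. The thresholds, instance limits and function name are hypothetical values chosen for illustration; real autoscale rules are configured on the Azure resource rather than written like this.

```python
def autoscale_decision(
    cpu_samples,
    current_instances,
    scale_out_above=70.0,  # assumed threshold, percent CPU
    scale_in_below=30.0,   # assumed threshold, percent CPU
    min_instances=1,
    max_instances=10,
):
    """Return the new instance count based on average CPU over the window."""
    if not cpu_samples:
        return current_instances  # no data, so take no action
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_out_above and current_instances < max_instances:
        return current_instances + 1
    if average < scale_in_below and current_instances > min_instances:
        return current_instances - 1
    return current_instances
```

The gap between the two thresholds is deliberate: it stops the instance count flapping when the metric hovers around a single value.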
What can app insights provide?
- App exceptions
- App dependencies
- Performance information
URL ping test
Send HTTP req to specific URL to test availability
Lets you know duration and success rate
Enable retries recommended
Can have multiple test locations
Can have alerts
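The idea behind a ping test with retries can be sketched as below. This is an illustration using only the Python standard library, not how Application Insights implements it; `ping_test`, `http_get_status` and the returned dictionary keys are made-up names.

```python
import time
import urllib.error
import urllib.request


def http_get_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status


def ping_test(url, fetch=http_get_status, retries=2):
    """Try the URL up to 1 + retries times; report success and duration."""
    start = time.monotonic()
    success = False
    attempts = 0
    for _ in range(1 + retries):
        attempts += 1
        try:
            if fetch(url) == 200:
                success = True
                break
        except OSError:
            pass  # connection failure or timeout: try again
    return {
        "success": success,
        "attempts": attempts,
        "duration_s": time.monotonic() - start,
    }
```

Retries are recommended because a single dropped request would otherwise be reported as an outage; the `fetch` parameter makes the check easy to exercise without a network.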
What is an action group?
Tells you who to notify and how
What is failure mode analysis?
Find single points of failure
Rate risk and severity
Determine response - respond and recover
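One way to make the "rate risk and severity" step concrete is a simple scoring table. The 1 to 5 scales, the likelihood-times-severity score and the example components are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class FaultPoint:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    severity: int    # assumed scale: 1 (minor) to 5 (full outage)
    redundant: bool  # False means this is a single point of failure

    @property
    def risk_score(self) -> int:
        return self.likelihood * self.severity


def single_points_of_failure(points):
    """Names of components with no redundancy."""
    return [p.name for p in points if not p.redundant]


def rank_by_risk(points):
    """Highest-risk fault points first, to decide where to respond first."""
    return sorted(points, key=lambda p: p.risk_score, reverse=True)


# Hypothetical architecture used only as sample input.
points = [
    FaultPoint("web frontend", likelihood=2, severity=3, redundant=True),
    FaultPoint("database", likelihood=2, severity=5, redundant=False),
    FaultPoint("payment API", likelihood=4, severity=4, redundant=False),
]
```

Calling `rank_by_risk(points)` puts the payment API first (score 16), then the database (10), then the web frontend (6).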
What are fault points and fault modes?
Fault points - any place where architecture can fail
Fault modes - all the ways a fault point can fail
What is a fault domain?
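Group of hardware sharing a power source and network switch (e.g. a rack) - host resources on separate racks in data centres so one failure doesn't take them all down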
What is load testing vs stress testing?
Load - normal to heavy load tests
Stress - upper limit tests
What is a baseline?
Determine a healthy state - gives a starting point of health of application checks
Makes it easier to determine effects of adding or changing things
How are baselines created?
- Log analytics and metrics explorer - analyze data
- Azure monitor insights - provides recommended metrics & dashboards for Azure services
- App insights - provides same as monitor but for app itself
Dynamic threshold alert advantages
- Machine learning picks up expected behavior against metrics
- Don’t have to manually figure out thresholds
- Can set it and forget it for rules
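To show the idea of a threshold learned from history rather than set by hand, here is a deliberately simple sketch using mean and standard deviation. This is not the algorithm Azure Monitor uses; the `sensitivity` parameter and function names are assumptions.

```python
import statistics


def dynamic_threshold(history, sensitivity=3.0):
    """Return (low, high) bounds derived from past metric values."""
    mean = statistics.mean(history)
    spread = statistics.pstdev(history)  # population standard deviation
    return mean - sensitivity * spread, mean + sensitivity * spread


def is_anomalous(value, history, sensitivity=3.0):
    """True when value falls outside the learned bounds."""
    low, high = dynamic_threshold(history, sensitivity)
    return value < low or value > high
```

With a history such as `[10, 12, 11, 9, 10, 11, 10, 12]`, a reading of 11 stays inside the bounds while a reading of 50 is flagged, without anyone choosing a fixed threshold.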
App insights smart detection
Uses machine learning to detect anomalies with telemetry data
Built in - configured automatically with enough time and data
Creates alert independently using smart detection findings
What can smart detection help with?
Failures/failure rates - figures out expected number of failures. Gives continuous monitoring and provides context. Min data & 24 hours of telemetry for baseline
Performance - page response testing, does daily analysis in case of false positives. Min data & 8 days telemetry for baseline
What are dependencies?
Components the app calls - HTTP services, databases or the file system
Internal vs external - use of libraries (internal), use of APIs (external)
Strong vs weak - strong can cause breaking failure, weak will still run but with limited features
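The strong vs weak distinction can be sketched in code: a strong dependency failure is allowed to break the request, a weak one is caught and the feature degrades. The page, the two fetch functions and the result keys are hypothetical.

```python
def load_page(fetch_orders, fetch_recommendations):
    """Build a page from one strong and one weak dependency."""
    # Strong dependency: without orders the page is useless, so any
    # exception is left to propagate to the caller.
    orders = fetch_orders()

    # Weak dependency: recommendations are optional, so a failure
    # only removes that feature.
    try:
        recommendations = fetch_recommendations()
    except Exception:
        recommendations = []

    return {"orders": orders, "recommendations": recommendations}
```

If `fetch_recommendations` raises, the caller still gets the orders with an empty recommendations list; if `fetch_orders` raises, the whole call fails.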
App insights dependency tracking on .NET
Tracking configured by default on .NET and .NET Core
Some need manual configuration
What are app insights dependency data?
App map - of dependencies
Transaction diagnostics - track as they pass through different systems
Browsers - see AJAX calls made from users' browsers
Log analytics - create custom queries against dep data
App dependencies on VMs
Enable dependency agent on VM
Processes - Shows dependencies looking for servers, inbound/outbound latency, TCP ports
Views - different scopes from VMs, scale sets or from Azure Monitor
How often are performance alerts analysed?
Once a day in case of false positives (i.e. expected spikes)
What is TFVC?
Team Foundation Version Control
Centralised - developers typically have only one version of each file on their machine and history is kept on the server, unlike Git
What level does Azure Repos work at?
Project level
Supports Git and TFVC
Optional component for Azure Pipelines - other repos/version control can be used
What is a submodule?
Add third-party libraries as an external resource that can still be integrated
What is Scalar?
Git projects struggle to scale at larger sizes
Scalar helps reduce data transfer, cmd runtime, helps index relevant files & organise