L7 - Cloud Monitoring Flashcards
Why monitor?
To make the best use of your rented resources, reducing your costs and increasing the satisfaction of your service's users
observable system
one that exposes enough data about itself that generating information (finding answers to questions yet to be formulated) and accessing that information both become simple
monitoring
process of collecting status information about applications and resources; the data can be used to observe the application and the infrastructure
monitoring system
consists of all components for gathering monitoring data at runtime
2 ways to create information
- proactively: through continuous analysis, to trigger alarms or to give an overview of the status of the system
- reactively: triggered by events such as incidents (e.g. root cause analysis and autoscaling)
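A minimal sketch of the proactive case, assuming a series of utilization samples between 0 and 1. The function name `check_alarm`, the 0.9 threshold, and the window of 3 samples are illustrative assumptions, not part of the lecture material.

```python
from collections import deque

def check_alarm(samples, threshold=0.9, window=3):
    """Return True if the mean of the last `window` samples exceeds `threshold`.

    `threshold` and `window` are hypothetical defaults chosen for illustration.
    """
    recent = deque(samples, maxlen=window)  # keeps only the newest samples
    if len(recent) < window:
        return False  # not enough data yet to decide
    return sum(recent) / window > threshold
```

Averaging over a short window instead of reacting to a single sample is one way to avoid alarms on brief spikes.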
What is the purpose of monitoring at the infrastructure level?
- resource management
- incident detection
- root cause analysis
- metering for payment
- auditing
What is the purpose of monitoring at application level?
- performance analysis
- resource management
- failure detection and resolution
- SLA verification
- auditing
How does monitoring take place in a parallel system?
- batch system
- data are collected during an application run
- analysis happens post mortem
- execution is reproducible
How does monitoring take place in the cloud?
- interactive system
- data are continuously produced - realtime data
- realtime analysis
- data used for immediate action or to study past system behavior
3 pillars of monitoring
- metrics
- logs
- traces
4 important metrics in monitoring
- latency
- throughput or traffic
- error rate
- utilization or saturation
What is latency?
- time it takes to service a request
- can be measured selectively for successful or failed requests
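A minimal sketch of measuring latency and reporting it separately for successful and failed requests. The names `timed_call` and `latency_by_outcome` are hypothetical, and treating any raised exception as a failed request is an assumption made for illustration.

```python
import time
from statistics import mean

def timed_call(handler, request):
    """Run handler(request) and return (elapsed_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        handler(request)
        ok = True
    except Exception:  # assumption: any exception counts as a failed request
        ok = False
    return time.perf_counter() - start, ok

def latency_by_outcome(measurements):
    """Average latency per outcome; None if no request had that outcome."""
    result = {}
    for label, flag in (("success", True), ("error", False)):
        values = [t for t, ok in measurements if ok is flag]
        result[label] = mean(values) if values else None
    return result
```

Keeping the two groups apart matters because failed requests often return very quickly or very slowly, which would distort a combined average.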
What is throughput or traffic?
- web services: requests/second
- streaming system: network I/O rate or concurrent sessions
- database: transactions/second or retrievals/second
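A minimal sketch of computing requests per second for the web-service case, assuming each request is recorded as a timestamp in seconds. The function name `throughput` and the sliding-window approach are illustrative assumptions.

```python
def throughput(timestamps, window_seconds):
    """Requests per second over the most recent `window_seconds`."""
    if not timestamps:
        return 0.0
    end = max(timestamps)  # treat the newest request as "now"
    recent = [t for t in timestamps if end - t < window_seconds]
    return len(recent) / window_seconds
```

For example, with requests at 0.0, 0.5, 1.0, 1.5, 9.0, 9.5 and 10.0 seconds and a 2-second window, only the last three fall inside the window, giving 1.5 requests/second.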
What is the error rate?
- rate of requests that fail
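A minimal sketch of the error rate as the fraction of failed requests. It assumes requests are represented by HTTP status codes and that a code of 500 or above counts as a failure; both the representation and the cutoff are assumptions for illustration.

```python
def error_rate(status_codes):
    """Fraction of requests that failed (0.0 when there were no requests)."""
    if not status_codes:
        return 0.0
    # assumption: server errors (5xx) are the failures we care about
    failed = sum(1 for code in status_codes if code >= 500)
    return failed / len(status_codes)
```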
What is utilization or saturation?
- percentage of capacity
- CPU, memory, I/O
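A minimal sketch of utilization as a percentage of capacity. The function name `utilization_percent` is hypothetical; `used` and `capacity` can be in any unit (CPU cores, bytes of memory, I/O operations per second) as long as both use the same one.

```python
def utilization_percent(used, capacity):
    """Return how much of `capacity` is in use, as a percentage."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity
```

For example, 6 GiB used out of 8 GiB of memory gives a utilization of 75%.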