Cloud Monitoring Flashcards
Amazon CloudWatch
A monitoring and observability service
• Metrics: monitor the performance of AWS services and billing metrics
• Alarms: automate notifications, perform EC2 actions, or notify an SNS topic based on a metric
• Logs: collect log files from EC2 instances, servers, Lambda functions…
CloudWatch Metrics
• CloudWatch provides metrics for every service in AWS
• A metric is a variable to monitor
• Metrics have timestamps
• Can create CloudWatch dashboards of metrics
Important CloudWatch Metrics
• EC2 instances: CPU Utilization, Status Checks, Network (RAM is not a built-in metric)
• EBS volumes: Disk Reads/Writes
• S3 buckets: BucketSizeBytes, NumberOfObjects, AllRequests
• Billing: Total Estimated Charge (only in us-east-1)
• Service Limits: how much you’ve been using a service API
• Custom metrics: push your own metrics
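Because RAM is not among the built-in EC2 metrics, memory usage is a common candidate for a custom metric. The sketch below only builds a request payload in the shape that boto3's `put_metric_data` accepts; it does not call AWS. The namespace, metric name, and instance ID are hypothetical placeholders.

```python
from datetime import datetime, timezone


def build_custom_metric(namespace, name, value, unit="Count",
                        dimensions=None, timestamp=None):
    """Build keyword arguments shaped like a CloudWatch PutMetricData request."""
    datum = {
        "MetricName": name,
        # Metrics carry a timestamp; default to "now" in UTC.
        "Timestamp": timestamp or datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        "Dimensions": [{"Name": k, "Value": v}
                       for k, v in (dimensions or {}).items()],
    }
    return {"Namespace": namespace, "MetricData": [datum]}


payload = build_custom_metric(
    "MyApp/Memory",                # hypothetical namespace
    "MemoryUsedPercent",           # hypothetical metric name
    63.5,
    unit="Percent",
    dimensions={"InstanceId": "i-0123456789abcdef0"},  # placeholder ID
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
)

# With boto3 installed and credentials configured, the payload could be sent as:
#   boto3.client("cloudwatch").put_metric_data(**payload)
```

Keeping the payload construction separate from the API call makes it possible to check the data locally before anything is pushed.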
CloudWatch Alarms
Alarms are used to trigger notifications for any metric
• Can choose the period on which to evaluate an alarm
• Alarm States: OK, INSUFFICIENT_DATA, ALARM
Alarm actions:
• Auto Scaling: increase or decrease EC2 instances “desired” count
• EC2 Actions: stop, terminate, reboot or recover an EC2 instance
• SNS notifications: send a notification into an SNS topic
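The three alarm states can be illustrated with a deliberately simplified model. This is a sketch, not CloudWatch's actual evaluation logic: it assumes a "greater than threshold" comparison, one datapoint per period, at least one evaluation period, and it treats any missing period as INSUFFICIENT_DATA. The function name and parameters are hypothetical.

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods):
    """Return 'OK', 'ALARM', or 'INSUFFICIENT_DATA'.

    datapoints: one value per period, oldest first; None marks a period
    with no data. Assumes evaluation_periods >= 1.
    """
    recent = datapoints[-evaluation_periods:]
    # Not enough periods yet, or a gap in the data: no decision possible.
    if len(recent) < evaluation_periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    # Every evaluated period must breach the threshold to raise the alarm.
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


cpu = [40, 85, 90, 95]          # made-up CPU utilization per period
state = evaluate_alarm(cpu, threshold=80, evaluation_periods=3)
```

In this model `state` ends up as ALARM because the last three periods all exceed 80; an alarm action (Auto Scaling, EC2 action, SNS) would be what runs on that transition.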
CloudWatch Logs (Hybrid)
Enables real-time monitoring of logs
• Adjustable CloudWatch Logs retention
• CloudWatch Logs can collect logs from:
• Elastic Beanstalk: collection of logs from application
• ECS: collection from containers
• AWS Lambda: collection from function logs
• CloudTrail based on filter
• CloudWatch log agents: on EC2 machines or on-premises servers
• Route 53: log DNS queries
Amazon EventBridge (formerly CloudWatch Events)
Serverless event bus that makes it easier to build event-driven applications at scale using events generated from your applications
• Schedule: Cron jobs (scheduled scripts)
• Event Pattern: Event rules to react to a service doing something
• Trigger Lambda functions, send SQS/SNS messages
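An event pattern is a JSON document in which each field lists the values it accepts. The sketch below shows the idea with a simplified matcher that supports only exact-value matching and nested objects; EventBridge itself supports more operators than this. The `matches` helper is hypothetical, and the sample event follows the shape of an EC2 instance state-change event with a placeholder instance ID.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of allowed values.
            return False
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

should_trigger = matches(pattern, event)
```

A rule with this pattern would forward the event to its targets (a Lambda function, an SQS queue, an SNS topic); an instance moving to "running" would not match.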
Amazon EventBridge
• Schema Registry: model event schema
• You can archive events (all/filter) sent to an event bus (indefinitely or set period)
• Ability to replay archived events
AWS CloudTrail
Provides governance, compliance and audit for your AWS Account
• Get a history of events / API calls made within your AWS Account
• Can put logs from CloudTrail into CloudWatch Logs or S3
• A trail can be applied to All Regions (default) or a single Region.
CloudTrail Events
• Management Events = Provide information about management operations that are performed on resources, like registering devices and configuring security.
• Data Events = Provide information about the resource operations performed on or in a resource, like Amazon S3 object-level API activity and AWS Lambda function execution activity.
• CloudTrail Insights Events
CloudTrail Insights
Enable CloudTrail Insights to detect unusual activity in your account
• Inaccurate resource provisioning
• Hitting service limits
• Bursts of AWS IAM actions
• Gaps in periodic maintenance activity
• CloudTrail Insights analyzes normal management events to create a baseline
• And then continuously analyzes write events to detect unusual patterns
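The baseline-then-compare idea can be sketched as follows. This is not the algorithm CloudTrail Insights uses; it is a minimal illustration that assumes hourly counts of write API calls and flags any hour above the baseline mean plus three standard deviations. The function name, the threshold rule, and the sample counts are all made up.

```python
from statistics import mean, pstdev


def unusual_hours(baseline_counts, recent_counts, sigmas=3.0):
    """Return indexes of recent hours whose call count exceeds the baseline limit."""
    limit = mean(baseline_counts) + sigmas * pstdev(baseline_counts)
    return [i for i, count in enumerate(recent_counts) if count > limit]


# Hourly counts of write calls during normal activity (made-up data).
baseline = [10, 12, 11, 9, 10, 12, 11, 9]
# Recent hours; the third one contains a burst of calls.
recent = [11, 10, 60, 12]

flagged = unusual_hours(baseline, recent)
```

Here only the hour with 60 calls is flagged, which corresponds to the "burst of AWS IAM actions" kind of finding.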
CloudTrail Events Retention (Days)
• Events are stored for 90 days in CloudTrail
• To keep events beyond this period, log them to S3 and use Athena
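As a small worked example of the 90-day window, the sketch below splits event timestamps into those still inside the retention period and those that would only be available if they had been logged to S3. The helper name and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EVENT_HISTORY_DAYS = 90  # retention stated above


def split_by_retention(event_times, now):
    """Split timestamps into (still retained, older than the retention window)."""
    cutoff = now - timedelta(days=EVENT_HISTORY_DAYS)
    retained = [t for t in event_times if t >= cutoff]
    expired = [t for t in event_times if t < cutoff]
    return retained, expired


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    datetime(2024, 5, 1, tzinfo=timezone.utc),   # 31 days old
    datetime(2024, 1, 1, tzinfo=timezone.utc),   # 152 days old
]
retained, expired = split_by_retention(events, now)
```

The May event is still within the window, while the January event could only be investigated from an S3 archive, for example by querying it with Athena.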
AWS X-Ray
Helps developers analyze and debug production, distributed applications, such as those built using a microservices architecture.
• Troubleshooting performance (bottlenecks)
• Understand dependencies in a microservice architecture
• Pinpoint service issues
• Review request behavior
• Find errors and exceptions
• Are we meeting the time SLA?
• Where am I throttled?
• Identify users that are impacted
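The kind of question a trace answers (where is the bottleneck, which services fail or get throttled) can be sketched over a handful of segments. The data is invented, and the field names (`name`, `start_time`, `end_time`, `error`, `fault`, `throttle`) are modeled on X-Ray segment documents as an assumption; the helper itself is hypothetical and expects a non-empty list.

```python
def summarize_segments(segments):
    """Return (slowest service name, sorted names of services with problems)."""
    durations = {}
    flagged = set()
    for seg in segments:
        elapsed = seg["end_time"] - seg["start_time"]
        # Accumulate time per service in case it appears more than once.
        durations[seg["name"]] = durations.get(seg["name"], 0.0) + elapsed
        if seg.get("error") or seg.get("fault") or seg.get("throttle"):
            flagged.add(seg["name"])
    slowest = max(durations, key=durations.get)
    return slowest, sorted(flagged)


# One request passing through three services (times in seconds, made up).
trace = [
    {"name": "api", "start_time": 100.00, "end_time": 100.05},
    {"name": "orders", "start_time": 100.05, "end_time": 100.95, "throttle": True},
    {"name": "payments", "start_time": 100.95, "end_time": 101.00, "fault": True},
]

slowest, problems = summarize_segments(trace)
```

In this made-up trace, "orders" dominates the request time and is throttled, and "payments" reports a fault.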
Amazon CodeGuru
A Machine Learning powered service for automated code reviews and application performance recommendations
• CodeGuru Reviewer: automated code reviews for static code analysis (development)
• CodeGuru Profiler: visibility/recommendations about application performance during runtime (production)
Amazon CodeGuru Reviewer
Identify critical issues, security vulnerabilities, and hard-to-find bugs
• Integrates with GitHub, Bitbucket, and AWS CodeCommit
Amazon CodeGuru Profiler
Helps understand the runtime behavior of your application
• Supports applications running on AWS or on-premises
Features:
• Identify and remove code inefficiencies
• Improve application performance (e.g., reduce CPU utilization)
• Decrease compute costs
• Provides a heap summary (identify which objects are using up memory)
• Anomaly Detection