Cloud Monitoring Section Flashcards
Amazon CloudWatch Metrics
- Provides metrics with timestamps for every AWS service
- Can create CloudWatch dashboards of metrics
Metrics in AWS
• EC2 instances: CPU Utilization, Status Checks, Network (not RAM)
  - Default metrics every 5 minutes
  - Option for Detailed Monitoring ($$$): metrics every 1 minute
• EBS volumes: Disk Read/Writes
• S3 buckets: BucketSizeBytes, NumberOfObjects, AllRequests
• Billing: Total Estimated Charge (only in us-east-1)
• Service Limits: how much of a service API you have been using
• Custom metrics: push your own metrics
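Since RAM is not among the default EC2 metrics, it is a typical candidate for a custom metric. A minimal sketch of what such a request could look like, assuming it is later passed to boto3 as `boto3.client("cloudwatch").put_metric_data(**request)`. The namespace `MyApp`, the metric name `MemoryUsedPercent`, and the instance ID are hypothetical placeholders:

```python
from datetime import datetime, timezone


def build_custom_metric(namespace, name, value, unit="Count", dimensions=None):
    """Build keyword arguments shaped like a PutMetricData request."""
    datum = {
        "MetricName": name,
        # Every metric datapoint carries a timestamp
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }
    if dimensions:
        # Dimensions are name/value pairs that identify the metric source
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in dimensions.items()
        ]
    return {"Namespace": namespace, "MetricData": [datum]}


# Hypothetical example: report memory usage for one instance
request = build_custom_metric(
    "MyApp",
    "MemoryUsedPercent",
    71.5,
    unit="Percent",
    dimensions={"InstanceId": "i-0123456789abcdef0"},
)
```

The sketch only builds the payload; sending it requires AWS credentials with permission to publish metrics.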
Amazon CloudWatch Alarms
• Alarms are used to trigger notifications for any metric
Alarm actions:
• Auto Scaling: increase or decrease EC2 instances “desired” count
• EC2 Actions: stop, terminate, reboot or recover an EC2 instance
• SNS notifications: send a notification into an SNS topic
• Alarm States: OK, INSUFFICIENT_DATA, ALARM
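The three alarm states can be illustrated with a deliberately simplified sketch. This is not how CloudWatch evaluates alarms internally (the real service also supports "M out of N" datapoints and configurable missing-data handling); the function name `alarm_state` and its parameters are hypothetical:

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return a simplified alarm state for a 'greater than threshold' alarm.

    datapoints: metric values ordered oldest to newest, one per period.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        # Not enough data yet to decide either way
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        # Every evaluated period breached the threshold
        return "ALARM"
    return "OK"


# CPU above 80% for the last two periods puts the alarm in ALARM
state = alarm_state([40.0, 85.0, 92.0], threshold=80.0, evaluation_periods=2)
```

An alarm in the ALARM state is what would then trigger one of the actions listed above (Auto Scaling, EC2 action, or SNS notification).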
Amazon CloudWatch Logs
Can be collected from:
• Elastic Beanstalk: collection of logs from application
• ECS: collection from containers
• AWS Lambda: collection from function logs
• CloudTrail: based on a filter
• CloudWatch log agents: on EC2 machines or on-premises servers
• Route53: Log DNS queries
CloudWatch Logs on EC2
- By default, no logs from your EC2 instance will go to CloudWatch
- You need to run a CloudWatch agent on EC2 to push the log files you want
- Make sure IAM permissions are correct
- The CloudWatch log agent can be set up on-premises too
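A minimal sketch of the two pieces involved, assuming the unified CloudWatch agent and its JSON configuration file: which log file to push, and an IAM policy that allows it. The file path `/var/log/myapp/app.log` and the log group name `my-app-logs` are hypothetical, and the action list is an assumed minimal set rather than an official policy:

```python
import json

# Agent configuration: which file to collect and where to send it
agent_config = {
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/myapp/app.log",  # hypothetical path
                        "log_group_name": "my-app-logs",  # hypothetical group
                        "log_stream_name": "{instance_id}",  # filled in by the agent on EC2
                    }
                ]
            }
        }
    }
}

# IAM policy for the instance role (assumed minimal set of actions)
agent_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
            ],
            "Resource": "*",
        }
    ],
}

config_text = json.dumps(agent_config, indent=2)
```

Without the permissions in the second document (or an equivalent policy), the agent runs but no logs reach CloudWatch.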
Amazon CloudWatch Events
- Schedule: Cron jobs (scheduled scripts)
- Event Pattern: Event rules to react to a service doing something
- Trigger Lambda functions, send SQS/SNS messages…
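Both rule types can be sketched in a few lines. The schedule expressions and the EC2 state-change pattern follow the documented rule syntax; the `matches` function is a hypothetical, heavily simplified stand-in for the service's matching logic (exact values only, no content filters or array-valued event fields):

```python
# Schedule rules: cron or rate expressions
DAILY_AT_NOON_UTC = "cron(0 12 * * ? *)"
EVERY_FIVE_MINUTES = "rate(5 minutes)"

# Event pattern rule: react when an EC2 instance is stopped or terminated
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}


def matches(pattern, event):
    """Simplified matching: each pattern field lists the allowed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical event, trimmed to the fields used by the pattern
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
is_match = matches(pattern, event)
```

A matching event is then delivered to the rule's targets, such as a Lambda function or an SNS topic.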
Amazon EventBridge
- EventBridge is the next evolution of CloudWatch Events
- Default event bus: receives events generated by AWS services (same as CloudWatch Events)
- Partner event bus: receive events from SaaS services or applications (Zendesk, DataDog, Segment, Auth0…)
- Custom Event buses: for your own applications
- Schema Registry: model event schema
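For a custom event bus, your own application publishes the events. A minimal sketch of one entry, assuming it would be sent with boto3 as `boto3.client("events").put_events(Entries=[entry])`. The bus name `orders-bus`, the source `com.example.orders`, and the event fields are hypothetical:

```python
import json


def build_event_entry(bus_name, source, detail_type, detail):
    """Build one entry shaped like a PutEvents request entry."""
    return {
        "EventBusName": bus_name,
        "Source": source,
        "DetailType": detail_type,
        # The detail payload is passed as a JSON string, not a dict
        "Detail": json.dumps(detail),
    }


entry = build_event_entry(
    "orders-bus",
    "com.example.orders",
    "OrderPlaced",
    {"orderId": "1234", "total": 42.5},
)
```

Rules on the custom bus can then match on `source` and `detail-type` in the same way as rules on the default bus.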
AWS CloudTrail
- Provides governance, compliance and audit for your AWS Account
- CloudTrail is enabled by default!
- Get a history of events / API calls made within your AWS Account by:
- Console
- SDK
- CLI
- AWS Services
- Can put logs from CloudTrail into CloudWatch Logs or S3
- A trail can be applied to All Regions (default) or a single Region
CloudTrail Events - Management Events
• Operations that are performed on resources in your AWS account
Examples:
• Configuring security (IAM AttachRolePolicy)
• Configuring rules for routing data (Amazon EC2 CreateSubnet)
• Setting up logging (AWS CloudTrail CreateTrail)
- By default, trails are configured to log management events.
- Can separate Read Events (that don’t modify resources) from Write Events (that may modify resources)
CloudTrail Events - Data Events
- By default, data events are not logged (because they are high-volume operations)
- Amazon S3 object-level activity (ex: GetObject, DeleteObject, PutObject): can separate Read and Write Events
- AWS Lambda function execution activity (the Invoke API)
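Data events have to be switched on per trail. A minimal sketch of an event selector, assuming it would be passed to boto3 as `boto3.client("cloudtrail").put_event_selectors(TrailName=..., EventSelectors=[selector])`. The bucket name `my-bucket` is a hypothetical placeholder:

```python
def build_data_event_selector(bucket_name, read_write_type="All", include_lambda=False):
    """Build an event selector that logs S3 object-level data events."""
    if read_write_type not in ("ReadOnly", "WriteOnly", "All"):
        raise ValueError("read_write_type must be ReadOnly, WriteOnly or All")
    resources = [
        # Trailing slash: all objects in the bucket
        {"Type": "AWS::S3::Object", "Values": [f"arn:aws:s3:::{bucket_name}/"]}
    ]
    if include_lambda:
        # Invoke activity for all Lambda functions in the account
        resources.append({"Type": "AWS::Lambda::Function", "Values": ["arn:aws:lambda"]})
    return {
        "ReadWriteType": read_write_type,
        "IncludeManagementEvents": True,
        "DataResources": resources,
    }


# Log only write events (PutObject, DeleteObject...) for one bucket
selector = build_data_event_selector("my-bucket", read_write_type="WriteOnly")
```

Choosing `WriteOnly` is one way to keep the volume (and cost) of data events down.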
CloudTrail Events - CloudTrail Insights Events
Enable CloudTrail Insights to detect unusual activity in your account:
• Inaccurate resource provisioning
• Hitting service limits
• Bursts of AWS IAM actions
• Gaps in periodic maintenance activity
• CloudTrail Insights analyzes normal management events to create a baseline
- And then continuously analyzes write events to detect unusual patterns:
- Anomalies appear in the CloudTrail console
- Event is sent to Amazon S3
- An EventBridge event is generated (for automation needs)
CloudTrail Events Retention
- Events are stored for 90 days in CloudTrail
- To keep events beyond this period, log them to S3 and use Athena
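Once events are in S3, they can be analyzed offline. A minimal sketch in plain Python rather than Athena, assuming the log content has already been downloaded and decompressed (log files are delivered as gzip-compressed JSON with a top-level `Records` list). The sample records are made up and trimmed to a few fields:

```python
import json
from collections import Counter

# Made-up sample, trimmed to the fields used below
SAMPLE_LOG = json.dumps(
    {
        "Records": [
            {"eventSource": "ec2.amazonaws.com", "eventName": "CreateSubnet", "readOnly": False},
            {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances", "readOnly": True},
            {"eventSource": "iam.amazonaws.com", "eventName": "AttachRolePolicy", "readOnly": False},
        ]
    }
)


def count_events(log_text, write_only=False):
    """Count events by name, optionally keeping only write events."""
    records = json.loads(log_text).get("Records", [])
    if write_only:
        # readOnly is False for events that may modify resources
        records = [r for r in records if not r.get("readOnly", False)]
    return Counter(r["eventName"] for r in records)


all_counts = count_events(SAMPLE_LOG)
write_counts = count_events(SAMPLE_LOG, write_only=True)
```

For large volumes, Athena runs the same kind of filtering and aggregation as SQL directly over the files in S3.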
AWS X-Ray
- Visual analysis of application
- Troubleshooting performance (bottlenecks)
- Understand dependencies in a microservice architecture
- Pinpoint service issues
- Review request behavior
- Find errors and exceptions
- Are we meeting our time SLA?
- Where am I throttled?
- Identify users that are impacted
Amazon CodeGuru
• An ML-powered service for automated code reviews and application performance recommendations
Provides two functionalities:
• CodeGuru Reviewer: automated code reviews for static code analysis (development)
• CodeGuru Profiler: visibility/recommendations about application performance during runtime (production)
Amazon CodeGuru Reviewer
• Identify critical issues, security vulnerabilities, and hard-to-find bugs
• Example: common coding best practices, resource leaks, security detection, input validation
• Uses Machine Learning and automated reasoning
• Hard-learned lessons across millions of code reviews on 1000s of open-source and Amazon repositories
• Supports Java and Python
• Integrates with GitHub, Bitbucket, and AWS CodeCommit