05 - Database Flashcards
RDS Backups
• Backups are automatically enabled in RDS
1) Automated backups:
• Daily full backup of the database (during the backup window)
• Transaction logs are backed-up by RDS every 5 minutes
• => ability to restore to any point in time (from oldest backup to 5 minutes ago)
• 7 days retention (can be increased to 35 days)
2) DB Snapshots:
• Manually triggered by the user
• Retention of backup for as long as you want
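The point-in-time restore window described above can be sketched in a few lines. This is a minimal illustration, not RDS code: `restore_window` and `can_restore_to` are hypothetical helper names, and the 1 to 35 day range check is an assumption based on the retention limits listed above.

```python
from datetime import datetime, timedelta, timezone

# Transaction logs are backed up every 5 minutes, so the newest
# restorable point trails "now" by roughly that interval.
LOG_BACKUP_INTERVAL = timedelta(minutes=5)


def restore_window(now, oldest_backup, retention_days=7):
    """Return (earliest, latest) timestamps that a restore could target."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    retention_start = now - timedelta(days=retention_days)
    # Backups older than the retention period are no longer available.
    earliest = max(oldest_backup, retention_start)
    latest = now - LOG_BACKUP_INTERVAL
    return earliest, latest


def can_restore_to(target, now, oldest_backup, retention_days=7):
    """True if `target` falls inside the restorable window."""
    earliest, latest = restore_window(now, oldest_backup, retention_days)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
    oldest = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    print(restore_window(now, oldest))
```

Manual DB snapshots are outside this window logic: they are kept until you delete them.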
RDS – Storage Auto Scaling
• Helps you increase storage on your RDS DB instance dynamically
• When RDS detects you are running out of free database storage, it scales automatically
• Avoid manually scaling your database storage
• You have to set Maximum Storage Threshold (maximum limit for DB storage)
• Automatically modify storage if:
- Free storage is less than 10% of allocated storage
- Low-storage lasts at least 5 minutes
- 6 hours have passed since last modification
• Useful for applications with unpredictable workloads
• Supports all RDS database engines
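The three auto scaling conditions above can be expressed as one predicate. This is a sketch of the decision rule only: the `StorageState` fields and the `should_autoscale` name are hypothetical, and the check against the Maximum Storage Threshold is an assumption about how that limit applies.

```python
from dataclasses import dataclass


@dataclass
class StorageState:
    allocated_gib: float                  # currently allocated storage
    free_gib: float                       # free storage remaining
    low_storage_minutes: float            # how long free storage has been low
    hours_since_last_modification: float  # time since the last storage change
    max_storage_threshold_gib: float      # user-set upper limit


def should_autoscale(state: StorageState) -> bool:
    """Apply the three conditions from the notes, plus the maximum limit."""
    below_10_percent = state.free_gib < 0.10 * state.allocated_gib
    sustained = state.low_storage_minutes >= 5
    cooled_down = state.hours_since_last_modification >= 6
    # Assumption: no scaling once the allocation has reached the maximum.
    has_headroom = state.allocated_gib < state.max_storage_threshold_gib
    return below_10_percent and sustained and cooled_down and has_headroom


if __name__ == "__main__":
    print(should_autoscale(StorageState(100, 8, 10, 7, 500)))
```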
RDS Read Replicas for read scalability
• In AWS there’s a network cost when data goes from one AZ to another
• For RDS Read Replicas within the same region, you don’t pay that fee
- Up to 5 Read Replicas
- Within AZ, Cross AZ or Cross Region
- Replication is ASYNC, so reads are eventually consistent
- Replicas can be promoted to their own DB
- Applications must update the connection string to leverage read replicas
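Because the application has to choose which connection string to use, a small router is one way to picture the change. The sketch below is illustrative only: `EndpointRouter` and the hostnames are hypothetical, and the "starts with SELECT" test is a deliberately naive way to spot reads.

```python
import itertools


class EndpointRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, writer, readers):
        self.writer = writer
        # Round-robin over replicas; fall back to the writer if there are none.
        self._readers = itertools.cycle(readers) if readers else None

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._readers is not None:
            return next(self._readers)
        return self.writer


if __name__ == "__main__":
    router = EndpointRouter(
        writer="primary.example.internal",    # hypothetical hostname
        readers=["replica-1.example.internal", "replica-2.example.internal"],
    )
    print(router.endpoint_for("SELECT * FROM orders"))
    print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Since replication is ASYNC, a read that must see a write made a moment earlier should still go to the writer endpoint.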
RDS Multi AZ (Disaster Recovery)
• SYNC replication
• One DNS name – automatic app failover to standby
• Increase availability
- Failover in case of loss of AZ, loss of network, instance or storage failure
- No manual intervention in apps
- Not used for scaling
- Multi-AZ replication is free
- Note: Read Replicas can be set up as Multi AZ for Disaster Recovery (DR)
RDS - IAM Authentication
• IAM database authentication works with MySQL and PostgreSQL
1) You don’t need a password, just an authentication token obtained through IAM & RDS API calls
2) Auth token has a lifetime of 15 minutes
3) Benefits:
• Network in/out must be encrypted using SSL
• IAM to centrally manage users instead of DB
• Can leverage IAM Roles and EC2 Instance profiles for easy integration
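Since the auth token only lives for 15 minutes, an application that opens new connections over time needs to refresh it. The following is a minimal sketch of that idea: `AuthTokenCache` and its `safety_margin` are hypothetical, and `fetch_token` stands for whatever call obtains the token (with boto3 this could wrap the RDS client's `generate_db_auth_token`, which is not invoked here).

```python
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # auth tokens are valid for 15 minutes


class AuthTokenCache:
    """Reuse a token until it is close to expiry, then fetch a new one."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        self._fetch_token = fetch_token      # callable returning a token string
        self._clock = clock                  # injectable clock, eases testing
        self._safety_margin = safety_margin  # refresh this many seconds early
        self._token = None
        self._issued_at = 0.0

    def get(self):
        now = self._clock()
        max_age = TOKEN_LIFETIME_SECONDS - self._safety_margin
        if self._token is None or now - self._issued_at >= max_age:
            self._token = self._fetch_token()
            self._issued_at = now
        return self._token


if __name__ == "__main__":
    cache = AuthTokenCache(lambda: "example-token")  # placeholder fetcher
    print(cache.get())
```

The token is then passed as the password when opening the (SSL-encrypted) database connection.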
RDS Security – Summary
1) Encryption at rest:
• Can only be enabled when you first create the DB instance
• or: unencrypted DB => snapshot => copy snapshot as encrypted => create DB from snapshot
2) Your responsibility:
• Check the ports / IP / security group inbound rules in DB’s SG
• In-database user creation and permissions or manage through IAM
• Creating a database with or without public access
• Ensure parameter groups or DB is configured to only allow SSL connections
3) AWS responsibility:
• No SSH access
• No manual DB patching
• No manual OS patching
• No way to audit the underlying instance
Amazon Aurora
• Aurora is a proprietary technology from AWS (not open sourced)
• Postgres and MySQL are both supported as Aurora DB
- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
- Aurora storage automatically grows in increments of 10GB, up to 128 TB.
- Aurora can have 15 replicas while MySQL has 5, and the replication process is faster (sub 10 ms replica lag)
- Failover in Aurora is fast (typically under 30 seconds). It’s HA (High Availability) native.
- Aurora costs more than RDS (20% more) – but is more efficient
Aurora High Availability and Read Scaling
1) 6 copies of your data across 3 AZ:
• 4 copies out of 6 needed for writes
• 3 copies out of 6 needed for reads
• Self healing with peer-to-peer replication
• Storage is striped across 100s of volumes
2) One Aurora Instance takes writes (master)
3) Automated failover for master in less than 30 seconds
4) Master + up to 15 Aurora Read Replicas serve reads
5) Support for Cross Region Replication
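The 4-of-6 write and 3-of-6 read quorums above determine which failures the storage layer can absorb. This sketch works through the arithmetic, assuming the 6 copies are spread evenly as 2 per AZ; the `availability` function is a hypothetical helper, not an Aurora API.

```python
TOTAL_COPIES = 6
COPIES_PER_AZ = 2   # assumption: 6 copies spread evenly over 3 AZs
WRITE_QUORUM = 4    # copies needed for writes
READ_QUORUM = 3     # copies needed for reads


def availability(healthy_copies):
    """Report whether writes and reads are possible with this many copies."""
    if not 0 <= healthy_copies <= TOTAL_COPIES:
        raise ValueError("healthy_copies must be between 0 and 6")
    return {
        "writes": healthy_copies >= WRITE_QUORUM,
        "reads": healthy_copies >= READ_QUORUM,
    }


if __name__ == "__main__":
    # Losing a whole AZ removes 2 copies: writes and reads both survive.
    print(availability(TOTAL_COPIES - COPIES_PER_AZ))
    # Losing an AZ plus one more copy: reads survive, writes do not.
    print(availability(TOTAL_COPIES - COPIES_PER_AZ - 1))
```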
Aurora Serverless
- Automated database instantiation and autoscaling based on actual usage
- Good for infrequent, intermittent or unpredictable workloads
- No capacity planning needed
- Pay per second, can be more cost-effective
Aurora Multi-Master
- Useful when you want immediate failover for the write node (HA)
* Every node does reads and writes, instead of promoting a Read Replica as the new master
Global Aurora
1) Aurora Cross Region Read Replicas:
• Useful for disaster recovery
• Simple to put in place
2) Aurora Global Database (recommended):
• 1 Primary Region (read / write)
• Up to 5 secondary (read-only) regions, replication lag is less than 1 second
• Up to 16 Read Replicas per secondary region
• Helps for decreasing latency
• Promoting another region (for disaster recovery) has an RTO of < 1 minute
Aurora Machine Learning
• Enables you to add ML-based predictions to your applications via SQL
1) Simple, optimised, and secure integration between Aurora and AWS ML services
2) Supported services
• Amazon SageMaker (use with any ML model)
• Amazon Comprehend (for sentiment analysis)
3) You don’t need to have ML experience
4) Use cases: fraud detection, ads targeting, sentiment analysis, product recommendations
ElastiCache – Redis vs Memcached
REDIS
• Multi AZ with Auto-Failover
• Read Replicas to scale reads and have high availability
• Data Durability using AOF persistence
• Backup and restore features
MEMCACHED
• Multi-node for partitioning of data (sharding)
• No high availability (replication)
• Non persistent
• No backup and restore
• Multi-threaded architecture
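Partitioning data across Memcached nodes means each key is mapped to exactly one node. The sketch below shows the simplest form of that mapping, hash modulo node count. It is an illustration with a hypothetical `node_for_key` helper and made-up node names; real client libraries often use consistent hashing instead, so that adding a node moves fewer keys.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node responsible for `key` with a stable hash."""
    if not nodes:
        raise ValueError("at least one node is required")
    # md5 is used only for a stable, evenly spread hash, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    nodes = ["cache-node-a", "cache-node-b", "cache-node-c"]  # hypothetical
    for key in ("user:1", "user:2", "session:abc"):
        print(key, "->", node_for_key(key, nodes))
```

Because there is no replication, losing a node loses the keys mapped to it.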
ElastiCache – Redis Use Case
- Gaming Leaderboards are computationally complex
- Redis Sorted Sets guarantee both uniqueness and element ordering
- Each time a new element is added, it’s ranked in real time, then added in correct order
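To make the sorted-set behaviour concrete, here is a small in-memory emulation in plain Python. It is not a Redis client: the `Leaderboard` class is hypothetical, and its `add` and `top` methods only play a role similar to the Redis `ZADD` and `ZREVRANGE` commands.

```python
import bisect


class Leaderboard:
    """Keep members unique and ordered by score at insertion time."""

    def __init__(self):
        self._scores = {}   # member -> score, enforces uniqueness
        self._ranked = []   # sorted list of (score, member) tuples

    def add(self, member, score):
        # If the member already exists, drop its old entry first.
        if member in self._scores:
            old_entry = (self._scores[member], member)
            index = bisect.bisect_left(self._ranked, old_entry)
            del self._ranked[index]
        self._scores[member] = score
        # insort places the entry at its correct position immediately.
        bisect.insort(self._ranked, (score, member))

    def top(self, n):
        """Return the n highest-scoring (member, score) pairs."""
        if n <= 0:
            return []
        return [(m, s) for s, m in reversed(self._ranked[-n:])]


if __name__ == "__main__":
    board = Leaderboard()
    board.add("alice", 50)
    board.add("bob", 70)
    board.add("alice", 90)  # updates alice rather than duplicating her
    print(board.top(2))
```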
Amazon DynamoDB
• Fully managed, highly available with replication across multiple AZs
• NoSQL database - not a relational database
- Scales to massive workloads, distributed database
- Millions of requests per second, trillions of rows, 100s of TB of storage
- Fast and consistent in performance (low latency on retrieval)
- Integrated with IAM for security, authorization and administration
- Enables event driven programming with DynamoDB Streams
- Low cost and auto-scaling capabilities