05 - Database Flashcards
RDS Backups
• Backups are automatically enabled in RDS
1) Automated backups:
• Daily full backup of the database (during the backup window)
• Transaction logs are backed-up by RDS every 5 minutes
• => ability to restore to any point in time (from oldest backup to 5 minutes ago)
• 7 days retention (can be increased to 35 days)
2) DB Snapshots:
• Manually triggered by the user
• Retention of backup for as long as you want
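The point-in-time restore window described above can be sketched as a small calculation. This is an illustration only: the function names are hypothetical, and "oldest backup" is approximated as "now minus the retention period", which assumes automated backups have been running for the full retention period.

```python
from datetime import datetime, timedelta, timezone

LOG_BACKUP_INTERVAL = timedelta(minutes=5)  # transaction logs are backed up every 5 minutes
DEFAULT_RETENTION = timedelta(days=7)       # default retention, can be raised to 35 days


def restorable_window(now, retention=DEFAULT_RETENTION):
    """Return (earliest, latest) times a restore could target (hypothetical helper)."""
    earliest = now - retention           # assumption: the oldest backup is `retention` old
    latest = now - LOG_BACKUP_INTERVAL   # the most recent 5 minutes may not be backed up yet
    return earliest, latest


def can_restore_to(target, now, retention=DEFAULT_RETENTION):
    """True if `target` falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention)
    return earliest <= target <= latest


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(hours=1), now))    # inside the window
print(can_restore_to(now - timedelta(minutes=2), now))  # too recent
print(can_restore_to(now - timedelta(days=8), now))     # older than 7-day retention
```

Manual DB snapshots are not covered by this window: they are kept until you delete them.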
RDS – Storage Auto Scaling
• Helps you increase storage on your RDS DB instance dynamically
• When RDS detects you are running out of free database storage, it scales automatically
• Avoid manually scaling your database storage
• You have to set Maximum Storage Threshold (maximum limit for DB storage)
• Automatically modify storage if:
- Free storage is less than 10% of allocated storage
- Low-storage lasts at least 5 minutes
- 6 hours have passed since last modification
• Useful for applications with unpredictable workloads
• Supports all RDS database engines
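The three trigger conditions, plus the Maximum Storage Threshold, can be combined into a single check. This is a minimal sketch of the rule as listed above, not how RDS implements it; the function and parameter names are hypothetical.

```python
def should_autoscale(allocated_gib, free_gib, low_storage_minutes,
                     hours_since_last_change, max_storage_gib):
    """Return True when every auto scaling condition from the list above holds."""
    below_10_percent = free_gib * 10 < allocated_gib   # free storage < 10% of allocated
    sustained = low_storage_minutes >= 5               # low storage lasted at least 5 minutes
    cooled_down = hours_since_last_change >= 6         # 6 hours since the last modification
    has_headroom = allocated_gib < max_storage_gib     # still under the Maximum Storage Threshold
    return below_10_percent and sustained and cooled_down and has_headroom


print(should_autoscale(100, 8, 10, 7, 500))   # all conditions met
print(should_autoscale(100, 8, 10, 2, 500))   # modified only 2 hours ago
```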
RDS Read Replicas for read scalability
• In AWS there’s a network cost when data goes from one AZ to another
• For RDS Read Replicas within the same region, you don’t pay that fee
- Up to 5 Read Replicas
- Within AZ, Cross AZ or Cross Region
- Replication is ASYNC, so reads are eventually consistent
- Replicas can be promoted to their own DB
- Applications must update the connection string to leverage read replicas
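Because each replica has its own endpoint, the application decides where a query goes. A minimal sketch of that routing, assuming a naive "starts with SELECT" test and made-up hostnames (the class name and endpoints are hypothetical):

```python
import itertools


class EndpointRouter:
    """Send reads to replicas (round-robin) and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() loops over the replica endpoints forever; None means "no replicas"
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")  # naive read detection
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = EndpointRouter(
    "primary.example.internal",                                     # hypothetical endpoint
    ["replica-1.example.internal", "replica-2.example.internal"],   # hypothetical endpoints
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Since replication is ASYNC, a read that must see a write made just before it should still go to the primary.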
RDS Multi AZ (Disaster Recovery)
• SYNC replication
• One DNS name – automatic app failover to standby
• Increase availability
- Failover in case of loss of AZ, loss of network, instance or storage failure
- No manual intervention in apps
- Not used for scaling
- Multi-AZ replication traffic is free (no cross-AZ data transfer fee; the standby instance itself is still billed)
- Note: Read Replicas can be set up as Multi AZ for Disaster Recovery (DR)
RDS - IAM Authentication
• IAM database authentication works with MySQL and PostgreSQL
1) You don’t need a password, just an authentication token obtained through IAM & RDS API calls
2) Auth token has a lifetime of 15 minutes
3) Benefits:
• Network in/out must be encrypted using SSL
• IAM to centrally manage users instead of DB
• Can leverage IAM Roles and EC2 Instance profiles for easy integration
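Since a token is only valid for 15 minutes, an application that opens new connections over time needs to refresh it. A minimal sketch of a token cache follows; the class and the `fetch_token` callable are hypothetical. In a real application `fetch_token` would presumably wrap an IAM/RDS API call (for example boto3's `generate_db_auth_token`), which is an assumption not shown here.

```python
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # auth token lifetime from the card above


class AuthTokenCache:
    """Reuse a token until it is close to expiry, then fetch a new one."""

    def __init__(self, fetch_token, clock=time.monotonic, refresh_margin=60):
        self._fetch_token = fetch_token        # hypothetical callable returning a token string
        self._clock = clock                    # injectable clock, useful for testing
        self._refresh_margin = refresh_margin  # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            self._token = self._fetch_token()
            self._expires_at = now + TOKEN_LIFETIME_SECONDS
        return self._token


cache = AuthTokenCache(fetch_token=lambda: "example-token")  # stand-in for the real API call
print(cache.get())
```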
RDS Security – Summary
1) Encryption at rest:
• Can only be enabled when you first create the DB instance
• or: unencrypted DB => snapshot => copy snapshot as encrypted => create DB from snapshot
2) Your responsibility:
• Check the ports / IP / security group inbound rules in DB’s SG
• In-database user creation and permissions or manage through IAM
• Creating a database with or without public access
• Ensure parameter groups or DB is configured to only allow SSL connections
3) AWS responsibility:
• No SSH access
• No manual DB patching
• No manual OS patching
• No way to audit the underlying instance
Amazon Aurora
• Aurora is a proprietary technology from AWS (not open sourced)
• Postgres and MySQL are both supported as Aurora DB
- Aurora is “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
- Aurora storage automatically grows in increments of 10GB, up to 128 TB.
- Aurora can have 15 replicas while MySQL has 5, and the replication process is faster (sub 10 ms replica lag)
- Failover in Aurora is near-instantaneous (automated, in less than 30 seconds). It’s HA (High Availability) native.
- Aurora costs more than RDS (20% more) – but is more efficient
Aurora High Availability and Read Scaling
1) 6 copies of your data across 3 AZ:
• 4 copies out of 6 needed for writes
• 3 copies out of 6 needed for reads
• Self healing with peer-to-peer replication
• Storage is striped across 100s of volumes
2) One Aurora Instance takes writes (master)
3) Automated failover for master in less than 30 seconds
4) Master + up to 15 Aurora Read Replicas serve reads
5) Support for Cross Region Replication
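The 4-of-6 write and 3-of-6 read rule explains why Aurora survives the loss of a whole AZ. A minimal sketch of the arithmetic, assuming the 6 copies are spread evenly at 2 per AZ (the function name is hypothetical):

```python
TOTAL_COPIES = 6
WRITE_QUORUM = 4   # copies needed for writes
READ_QUORUM = 3    # copies needed for reads
COPIES_PER_AZ = 2  # assumption: 6 copies spread evenly over 3 AZs


def storage_status(healthy_copies):
    """Report whether the volume can still serve writes and reads."""
    if not 0 <= healthy_copies <= TOTAL_COPIES:
        raise ValueError("healthy_copies must be between 0 and 6")
    return {
        "writable": healthy_copies >= WRITE_QUORUM,
        "readable": healthy_copies >= READ_QUORUM,
    }


print(storage_status(TOTAL_COPIES - COPIES_PER_AZ))      # one AZ lost: 4 copies left
print(storage_status(TOTAL_COPIES - COPIES_PER_AZ - 1))  # one AZ plus one more copy lost
```

Losing one AZ leaves 4 copies, so writes continue; losing one further copy leaves 3, which is enough for reads only.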
Aurora Serverless
- Automated database instantiation and autoscaling based on actual usage
- Good for infrequent, intermittent or unpredictable workloads
- No capacity planning needed
- Pay per second, can be more cost-effective
Aurora Multi-Master
- Use when you want immediate failover for the write node (HA)
* Every node does R/W, versus promoting a Read Replica as the new master
Global Aurora
1) Aurora Cross Region Read Replicas:
• Useful for disaster recovery
• Simple to put in place
2) Aurora Global Database (recommended):
• 1 Primary Region (read / write)
• Up to 5 secondary (read-only) regions, replication lag is less than 1 second
• Up to 16 Read Replicas per secondary region
• Helps for decreasing latency
• Promoting another region (for disaster recovery) has an RTO of < 1 minute
Aurora Machine Learning
• Enables you to add ML-based predictions to your applications via SQL
1) Simple, optimised, and secure integration between Aurora and AWS ML services
2) Supported services
• Amazon SageMaker (use with any ML model)
• Amazon Comprehend (for sentiment analysis)
3) You don’t need to have ML experience
4) Use cases: fraud detection, ads targeting, sentiment analysis, product recommendations
ElastiCache – Redis vs Memcached
REDIS
• Multi AZ with Auto-Failover
• Read Replicas to scale reads and have high availability
• Data Durability using AOF persistence
• Backup and restore features
MEMCACHED
• Multi-node for partitioning of data (sharding)
• No high availability (replication)
• Non persistent
• No backup and restore
• Multi-threaded architecture
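Memcached's "partitioning of data (sharding)" means each key lives on exactly one node, chosen by the client. A minimal sketch of that idea using a simple hash-modulo scheme; the node names are made up and the function is hypothetical. Real clients often use consistent hashing instead, so that adding a node moves fewer keys.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node that owns `key` by hashing the key (hash-modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["cache-node-0", "cache-node-1", "cache-node-2"]  # hypothetical node names
for key in ["session:42", "session:43", "user:7"]:
    print(key, "->", node_for_key(key, nodes))
```

With no replication, losing a node loses that node's share of the keys, which is the "no high availability" trade-off noted above.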
ElastiCache – Redis Use Case
- Gaming Leaderboards are computationally complex
- Redis Sorted Sets guarantee both uniqueness and element ordering
- Each time a new element is added, it’s ranked in real time, then added in the correct order
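The leaderboard behaviour can be illustrated with a small in-memory stand-in for a sorted set. This sketch does not talk to Redis: the class is hypothetical, it sorts on read rather than keeping order incrementally as Redis does, and the alphabetical tie-break is a choice made for this example. In Redis the corresponding commands would be along the lines of `ZADD` and `ZREVRANGE`.

```python
class Leaderboard:
    """In-memory imitation of a sorted set: unique members, ordered by score."""

    def __init__(self):
        self._scores = {}  # member -> score; dict keys guarantee uniqueness

    def add(self, member, score):
        """Insert a member or overwrite its score."""
        self._scores[member] = score

    def top(self, n):
        """Return the n highest-scoring (member, score) pairs, best first."""
        ranked = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:n]

    def rank(self, member):
        """Zero-based position of a member, 0 being the leader."""
        ordered = [m for m, _ in self.top(len(self._scores))]
        return ordered.index(member)


board = Leaderboard()
board.add("alice", 120)
board.add("bob", 300)
board.add("carol", 200)
board.add("alice", 350)   # same member again: score is updated, no duplicate entry
print(board.top(2))
print(board.rank("carol"))
```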
Amazon DynamoDB
• Fully managed, highly available with replication across multiple AZs
• NoSQL database - not a relational database
- Scales to massive workloads, distributed database
- Millions of requests per second, trillions of rows, 100s of TB of storage
- Fast and consistent in performance (low latency on retrieval)
- Integrated with IAM for security, authorization and administration
- Enables event driven programming with DynamoDB Streams
- Low cost and auto-scaling capabilities
DynamoDB – Read/Write Capacity Modes
• Control how you manage your table’s capacity (read/write throughput)
Provisioned Mode (default)
• You specify the number of reads/writes per second
• You need to plan capacity beforehand
• Pay for provisioned Read Capacity Units (RCU) & Write Capacity Units (WCU)
• Possibility to add auto-scaling mode for RCU & WCU
On-Demand Mode
• Read/writes automatically scale up/down with your workloads
• No capacity planning needed
• Pay for what you use, more expensive ($$$)
• Great for unpredictable workloads
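For Provisioned Mode, planning capacity means converting expected traffic into RCU and WCU. The sizing rules used below are not stated in these cards; they are taken from the DynamoDB documentation as an assumption: 1 RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and 1 WCU covers one write per second of an item up to 1 KB. The function names are hypothetical.

```python
import math


def required_rcu(reads_per_second, item_size_kb, strongly_consistent=True):
    """Estimate Read Capacity Units for a steady read rate."""
    units_per_read = math.ceil(item_size_kb / 4)   # item size rounds up to 4 KB blocks
    rcu = reads_per_second * units_per_read
    if not strongly_consistent:
        rcu = math.ceil(rcu / 2)                   # eventually consistent reads cost half
    return rcu


def required_wcu(writes_per_second, item_size_kb):
    """Estimate Write Capacity Units for a steady write rate."""
    return writes_per_second * math.ceil(item_size_kb)  # item size rounds up to 1 KB blocks


print(required_rcu(10, 6))                              # strongly consistent reads of 6 KB items
print(required_rcu(10, 6, strongly_consistent=False))   # same traffic, eventually consistent
print(required_wcu(10, 2))                              # writes of 2 KB items
```

In On-Demand Mode this calculation is not needed up front, which is the "no capacity planning" trade-off paid for at a higher price per request.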
DynamoDB Accelerator (DAX)
- Fully-managed, highly available, seamless in-memory cache for DynamoDB
- Helps solve read congestion by caching
- Microseconds latency for cached data
- Doesn’t require application logic modification (compatible with existing DynamoDB APIs)
- 5 minutes TTL for cache (default)
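The caching behaviour can be pictured as a read-through cache with a 5-minute TTL sitting in front of the table. This is a conceptual sketch only, not the DAX client: the class and the `load_item` callable (standing in for a DynamoDB read) are hypothetical.

```python
import time

DEFAULT_TTL_SECONDS = 5 * 60  # default cache TTL from the card above


class ReadThroughCache:
    """Serve repeated reads from memory; go to the backing store on a miss or expiry."""

    def __init__(self, load_item, ttl=DEFAULT_TTL_SECONDS, clock=time.monotonic):
        self._load_item = load_item  # hypothetical callable: key -> item from the table
        self._ttl = ttl
        self._clock = clock          # injectable clock, useful for testing
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                        # cache hit, still fresh
        value = self._load_item(key)               # miss or expired: read through
        self._entries[key] = (value, now + self._ttl)
        return value


cache = ReadThroughCache(load_item=lambda key: {"id": key})  # stand-in for a table read
print(cache.get("user-1"))   # first call loads from the backing store
print(cache.get("user-1"))   # second call is served from memory
```

The caller uses the same `get` either way, which mirrors the point that the application logic does not need to change.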
DynamoDB Global Tables
- Make a DynamoDB table accessible with low latency in multiple regions
- Active-Active replication
- Applications can READ and WRITE to the table in any region
- Must enable DynamoDB Streams as a pre-requisite
RDS Overview
• Managed PostgreSQL / MySQL / Oracle / SQL Server
• Must provision an EC2 instance & EBS Volume type and size
• Support for Read Replicas and Multi AZ
• Security through IAM, Security Groups, KMS, SSL in transit
• Backup / Snapshot / Point in time restore feature
• Managed and Scheduled maintenance
• Monitoring through CloudWatch
• Use case: Store relational datasets (RDBMS / OLTP), perform SQL queries, transactional inserts / updates / deletes
RDS for Solutions Architect
- Operations: small downtime when failover or maintenance happens; scaling read replicas / EC2 instances or restoring EBS implies manual intervention and application changes
- Security: AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, authorizing users in DB, using SSL
- Reliability: Multi AZ feature, failover in case of failures
- Performance: depends on EC2 instance type, EBS volume type, ability to add Read Replicas. Storage auto-scaling & manual scaling of instances
- Cost: Pay per hour based on provisioned EC2 and EBS
Aurora Overview
- Compatible API for PostgreSQL / MySQL
- Data is held in 6 replicas, across 3 AZ
- Auto healing capability
- Multi AZ, Auto Scaling Read Replicas
- Read Replicas can be Global
- Aurora database can be Global for DR or latency purposes
- Auto scaling of storage from 10GB to 128 TB
- Define EC2 instance type for Aurora instances
- Same security / monitoring / maintenance features as RDS
- Aurora Serverless – for unpredictable / intermittent workloads
- Aurora Multi-Master – for continuous writes failover
- Use case: same as RDS, but with less maintenance / more flexibility / more performance
Aurora for Solutions Architect
- Operations: less operations, auto scaling storage
- Security: AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, authorizing users in DB, using SSL
- Reliability: Multi AZ, highly available, possibly more than RDS, Aurora Serverless option, Aurora Multi-Master option
- Performance: 5x performance (according to AWS) due to architectural optimizations. Up to 15 Read Replicas (only 5 for RDS)
- Cost: Pay per hour based on EC2 and storage usage. Possibly lower costs compared to Enterprise grade databases such as Oracle
ElastiCache Overview
- Managed Redis / Memcached (similar offering as RDS, but for caches)
- In-memory data store, sub-millisecond latency
- Must provision an EC2 instance type
- Support for Clustering (Redis) and Multi AZ, Read Replicas (sharding)
- Security through IAM, Security Groups, KMS, Redis Auth
- Backup / Snapshot / Point in time restore feature
- Managed and Scheduled maintenance
- Monitoring through CloudWatch
- Use Case: Key/Value store, frequent reads with fewer writes, caching results of DB queries, storing session data for websites. Cannot use SQL.
ElastiCache for Solutions Architect
- Operations: same as RDS
- Security: AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, users (Redis Auth), using SSL
- Reliability: Clustering, Multi AZ
- Performance: Sub-millisecond performance, in memory, read replicas for sharding, very popular cache option
- Cost: Pay per hour based on EC2 and storage usage