Storage and Databases Flashcards
AWS Snowball Edge data migration
Enables petabyte-scale offline data migration from on-premises storage for databases into Amazon S3
gp2 and gp3 EBS volume types
General purpose SSD that balances price and performance.
Can be used as boot volumes
io1 and io2 Block Express EBS volume types
Highest performance SSD volumes for mission-critical, low-latency or high-throughput workloads.
Can be used as boot volumes
st1 EBS volume types
Low cost HDD volume for frequently accessed, throughput intensive workloads
sc1 EBS volume types
Lowest cost HDD volume designed for less frequently accessed workloads
EBS Snapshots
- Are incremental and consume I/O, so they should not be taken while the app is handling a lot of traffic.
- Not necessary to detach volume.
- Can copy across regions.
Amazon Data Lifecycle Manager
Automate creation, retention and deletion of EBS snapshots and EBS backed AMIs.
Uses tags to identify resources.
EBS Multi Attach
Only available for the io1/io2 family. Attach the same volume to multiple instances in the same AZ.
Amazon EFS
Managed NFS that can be mounted on Linux EC2 instances across multiple AZs and on premises.
Can only attach to one VPC.
EFS Performance Mode
- General Purpose (default): latency sensitive use cases
- Max I/O: higher latency, higher throughput, for highly parallel workloads
EFS Throughput Mode
- Bursting: scales with the amount of storage in your file system
- Provisioned: if you know your workload’s performance requirements, regardless of storage size
- Elastic: Automatically scales based on your workload
EFS storage classes
- Standard: frequently accessed files, high performance
- Infrequent access: cost to retrieve files, lower price to store
- Archive: rarely accessed data (50% cheaper)
EFS Access Points
Application-specific entry points that make it easier to manage access to shared datasets.
Enforce user identity, clients can only access data in the specified directory or its subdirectories.
EFS Cross Region Replication
Can be set up for new or existing EFS file systems.
1. Provides RPO and RTO of minutes
2. Doesn’t affect the provisioned throughput
Storage Classes
S3 Standard
- General purpose storage for frequently accessed data
- Low latency and high throughput performance
Storage Classes
S3 Intelligent-Tiering
- Automatic cost savings for data with unknown or changing access patterns
- Opt-in asynchronous archive capabilities for objects that become rarely accessed
- Small monthly monitoring and automation charge
- No operational overhead, no lifecycle charges, no retrieval charges, and no minimum storage duration
Storage Classes
S3 Express One Zone
- High performance storage for your most frequently accessed data
- Consistent single-digit millisecond request latency
- Improve access speeds by 10x and reduce request costs by 50% compared to S3 Standard
Storage Classes
S3 Standard Infrequent Access
- Infrequently accessed data that needs millisecond access
- Same low latency and high throughput performance of S3 Standard
Storage Classes
S3 One Zone-Infrequent Access
- Re-creatable infrequently accessed data
- Same low latency and high throughput performance of S3 Standard
Storage Classes
S3 Glacier Instant Retrieval
- Long-lived data that is accessed a few times per year with instant retrievals
- Data retrieval in milliseconds with the same performance as S3 Standard
Storage Classes
S3 Glacier Flexible Retrieval
- Backup and archive data that is rarely accessed and low cost
- Ideal for backup and disaster recovery use cases when large sets of data occasionally need to be retrieved in minutes, without concern for costs
- Configurable retrieval times, from minutes to hours, with free bulk retrievals
Storage Classes
S3 Glacier Deep Archive
- Archive data that is very rarely accessed and very low cost
- Ideal alternative to magnetic tape libraries
- Retrieval time within 12 hours
S3 Replication
Versioning must be enabled
* Cross region replication
* Same region replication
* S3 replication time control: Replicates most objects in seconds
S3 Baseline Performance
3500 PUT and 5500 GET requests per second per prefix in a bucket
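As a rough illustration of what a per-prefix limit implies, the sketch below multiplies the baseline rates from this card by the number of prefixes the requests are spread over. The function name and the assumption that load is spread evenly across prefixes are hypothetical.

```python
# Baseline request rates per prefix, taken from the card above.
PUT_PER_PREFIX = 3500
GET_PER_PREFIX = 5500


def aggregate_baseline(prefix_count: int) -> dict:
    """Return the combined baseline request rates for a number of prefixes.

    Assumption (hypothetical): requests are spread evenly across prefixes.
    """
    if prefix_count < 1:
        raise ValueError("prefix_count must be at least 1")
    return {
        "put_per_second": PUT_PER_PREFIX * prefix_count,
        "get_per_second": GET_PER_PREFIX * prefix_count,
    }


print(aggregate_baseline(10))
```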
S3 multi-part upload
Upload object parts independently, and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts.
Required for objects larger than 5 GB.
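A minimal sketch of how an object could be split into independently uploadable parts. It only plans the byte ranges and makes no AWS calls. The function name and the 100 MiB default part size are assumptions; the 5 MiB minimum part size and the 10,000-part maximum are S3 multipart limits.

```python
import math

MIB = 1024 * 1024
MAX_PARTS = 10_000       # S3 allows at most 10,000 parts per upload
MIN_PART_SIZE = 5 * MIB  # every part except the last must be at least 5 MiB


def plan_parts(object_size: int, part_size: int = 100 * MIB) -> list:
    """Return (part_number, first_byte, last_byte) tuples covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    parts = []
    start = 0
    number = 1
    while start < object_size:
        end = min(start + part_size, object_size) - 1
        parts.append((number, start, end))
        start = end + 1
        number += 1
    return parts


for part in plan_parts(250 * MIB):
    print(part)
```

Because each tuple is self-describing, a failed part can be retried on its own without touching the others.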
S3 Transfer Acceleration
Increases transfer speed by sending data to an AWS edge location, which forwards it to the bucket
S3 Byte Range Fetches
Parallelize GETs by requesting specific byte ranges. Better resilience in case of failures.
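A small sketch of how the byte ranges for parallel GETs could be computed. It produces standard HTTP `Range` header values and makes no requests; the function name and chunk size are hypothetical.

```python
def byte_ranges(object_size: int, chunk_size: int) -> list:
    """Return HTTP Range header values that together cover the whole object."""
    if object_size <= 0 or chunk_size <= 0:
        raise ValueError("object_size and chunk_size must be positive")
    ranges = []
    for start in range(0, object_size, chunk_size):
        # Range ends are inclusive, so the last byte is start + size - 1.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))
```

Each value can be sent as the `Range` header of a separate GET, so a failed chunk is retried alone.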
S3 Analytics
Storage Class Analysis
Helps you decide when to transition objects from Standard to Standard-IA. The report is updated daily.
S3 Storage Lens
Discover anomalies, identify and apply cost efficiencies, and apply data protection best practices across entire AWS Organization.
S3 Storage Lens - Metrics
- Summary Metrics
- Cost Optimization Metrics
- Data Protection Metrics
- Access Management Metrics
- Event Metrics
- Performance Metrics
- Activity Metrics
- Detailed Status Code Metrics
S3 Storage Lens - Free vs Paid
Free metrics:
1. Available for all customers
2. Data is available for 14 days
Advanced metrics:
1. CloudWatch publishing
2. Prefix aggregation
3. Data is available for 15 months
Amazon FSx
Launch 3rd party high performance file systems
* You cannot decrease storage capacity. Use DataSync to migrate to a smaller FSx file system.
Amazon FSx for Windows
- Supports SMB protocol & Windows NTFS
- Integration with AD, ACLs and user quotas
- Can be mounted on Linux EC2 instances
- Can group files across multiple file systems with Microsoft DFS Namespaces.
- Can be accessed from on-premises
- Can be configured MultiAZ
- Data is backed up daily to S3
Amazon FSx for Lustre
- Lustre is derived from Linux and cluster
- Used for High Performance Computing
- Seamless integration with S3
- Can be used from on-premises
- Data Lazy Loading: Only the data that is processed is loaded
FSx File System Deployment Options
Scratch File System: Temporary storage with high burst
Persistent File System: Data is replicated within the same AZ
Amazon FSx for NetApp ONTAP
- Compatible with NFS, SMB and iSCSI protocol
- Move workloads to AWS
- Storage shrinks or grows automatically
- Point-in-time instantaneous cloning
Amazon FSx for OpenZFS
- Compatible with NFS
- Move workloads to AWS
- Point-in-time instantaneous cloning
AWS DataSync
Replicate large amounts of data hourly, daily or weekly to or from:
* S3
* EFS
* FSx
File permissions and metadata are preserved
AWS Data Exchange
Find third-party data in the cloud, subscribe to it, load it into S3 or Redshift, and analyze it
AWS Transfer Family
File transfers into and out of S3 or EFS using FTP, FTPS or SFTP protocols.
AWS Transfer Family - Endpoint Types
- Public endpoints: IPs managed by AWS subject to change
- VPC Endpoint with Internal access: Static private IPs, set allow lists (SGs & NACL)
- VPC Endpoint with internet facing access: Static private and public IPs (EIPs)
DynamoDB Indexes
- Local secondary index: The same partition key as the table and an alternative sort key. Must be defined at table creation time.
- Global secondary index: A different partition key and an optional sort key. Can be defined after the table is created.
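To make the difference concrete, here is a sketch of a table definition laid out in the shape of the DynamoDB CreateTable request parameters. The table, attribute and index names are hypothetical, and nothing is sent to AWS; the helper only inspects the dictionary.

```python
# Hypothetical "Orders" table: partition key customer_id, sort key order_id.
table_definition = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
        {"AttributeName": "order_status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},
        {"AttributeName": "order_id", "KeyType": "RANGE"},
    ],
    # LSI: same partition key as the table, different sort key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "by_order_date",
            "KeySchema": [
                {"AttributeName": "customer_id", "KeyType": "HASH"},
                {"AttributeName": "order_date", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # GSI: its own partition key, with an optional sort key.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "by_status",
            "KeySchema": [
                {"AttributeName": "order_status", "KeyType": "HASH"},
                {"AttributeName": "order_date", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def partition_key(key_schema: list) -> str:
    """Return the attribute name used as the partition (HASH) key."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


table_pk = partition_key(table_definition["KeySchema"])
lsi_pk = partition_key(table_definition["LocalSecondaryIndexes"][0]["KeySchema"])
gsi_pk = partition_key(table_definition["GlobalSecondaryIndexes"][0]["KeySchema"])
print(table_pk, lsi_pk, gsi_pk)
```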
DynamoDB Streams
React to changes to DynamoDB tables in real time with AWS Lambda or EC2.
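A minimal sketch of a Lambda-style handler that counts the change types in one batch of stream records. The handler name, the summary format and the sample event are illustrative; the sample follows the general shape of DynamoDB stream records (`Records`, `eventName`, `dynamodb.Keys`) but is trimmed down.

```python
def handler(event, context):
    """Count INSERT, MODIFY and REMOVE records in a DynamoDB stream batch."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        event_name = record["eventName"]  # INSERT, MODIFY or REMOVE
        summary[event_name] = summary.get(event_name, 0) + 1
        # Keys identify the item that changed, in DynamoDB's typed JSON form.
        print(f"{event_name}: {record['dynamodb']['Keys']}")
    return summary


# Trimmed, hypothetical sample batch for local experimentation.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"customer_id": {"S": "c-1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"customer_id": {"S": "c-2"}}}},
    ]
}

print(handler(sample_event, None))
```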
DynamoDB Accelerator (DAX)
Seamless cache, microsecond latency for reads & queries.
Amazon OpenSearch
Provides search and indexing capabilities
RDS Engines
- PostgreSQL
- MySQL
- MariaDB
- IBM DB2
- Oracle
- SQL Server
RDS Multi AZ
- Synchronous replication
- Highly Durable
- Spans 2 AZ within a region
- Standby instance for failover in case of an outage. One DNS name for writes and reads that automatically fails over.
- Used for Disaster Recovery
RDS Read Replicas
- Increase read throughput and enhance the performance.
- Asynchronous replication, eventual consistency
- Cross-AZ or Cross-region
- Manually promote to stand-alone DB
- Distribute reads across replicas with Route53 weighted record set
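With weighted records, each record receives a share of traffic equal to its weight divided by the sum of all weights. The sketch below works out those shares; the replica names and weights are hypothetical.

```python
def traffic_shares(weights: dict) -> dict:
    """Return the fraction of queries each weighted record should receive."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be greater than zero")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical replicas: send three times as many reads to the larger one.
print(traffic_shares({"replica-large": 3, "replica-small": 1}))
```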
Which engines does RDS IAM authentication work with?
- Works with MariaDB, MySQL and PostgreSQL
RDS Oracle backups
- RDS backups to restore to Amazon RDS for Oracle
- Oracle RMAN (Recovery Manager) to restore to non RDS.
RDS proxy
Allows applications to pool and share connections established with the database. Instead of opening a connection per client, it will reuse them.
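The pooling idea can be sketched with a toy pool. This is not how RDS Proxy is implemented; `FakeConnection` and `ConnectionPool` are hypothetical stand-ins that only show ten clients being served by two reused connections.

```python
import queue


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1
        self.id = FakeConnection.opened


class ConnectionPool:
    """Open a fixed number of connections up front and hand them out for reuse."""

    def __init__(self, size: int):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    def acquire(self) -> FakeConnection:
        # Blocks until a connection is idle instead of opening a new one.
        return self._idle.get()

    def release(self, connection: FakeConnection) -> None:
        self._idle.put(connection)


pool = ConnectionPool(size=2)
for client in range(10):
    connection = pool.acquire()
    # ... the client's query would run here ...
    pool.release(connection)

print(FakeConnection.opened)
```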
Real Application Clusters (RAC)
- Not supported by RDS for Oracle
RDS
mysqldump
Migrate an RDS for MySQL database to a non-RDS MySQL server.
Aurora
- Compatible with PostgreSQL and MySQL
- Automatically grows up to 128 TB
- Reader endpoint to access up to 15 read replicas
- Cross-region read replicas copy the entire DB
Aurora High Availability
It has 6 copies across 3 AZ:
1. 4 copies needed for writes
2. 3 copies needed for reads
3. Self healing
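The quorum numbers above can be checked with a little arithmetic. The sketch assumes the six copies are spread evenly, two per AZ; the function names are hypothetical.

```python
COPIES = 6         # total copies of the data
COPIES_PER_AZ = 2  # assumption: 6 copies spread evenly over 3 AZs
WRITE_QUORUM = 4   # copies needed for writes
READ_QUORUM = 3    # copies needed for reads


def can_write(lost_copies: int) -> bool:
    """Writes succeed while at least WRITE_QUORUM copies remain."""
    return COPIES - lost_copies >= WRITE_QUORUM


def can_read(lost_copies: int) -> bool:
    """Reads succeed while at least READ_QUORUM copies remain."""
    return COPIES - lost_copies >= READ_QUORUM


# Losing a whole AZ (2 copies) still allows writes.
print(can_write(COPIES_PER_AZ))
# Losing an AZ plus one more copy (3 copies) still allows reads, but not writes.
print(can_read(COPIES_PER_AZ + 1), can_write(COPIES_PER_AZ + 1))
```

Because the read and write quorums add up to more than six, any read quorum overlaps any write quorum in at least one copy.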
Amazon Aurora global databases
Set up multiple read-only Aurora clusters that span up to 5 regions (changes made in the primary cluster are synchronized automatically)
Aurora global write forwarding
Enables secondary cluster to forward SQL statements that perform write operations to the primary cluster
Aurora Endpoints
- Cluster endpoint: writer endpoint that connects to the current primary DB instance
- Reader endpoint: provides load balancing for read connections to all replicas
- Custom endpoints: instances that you choose in the cluster
- Instance endpoint
Aurora Serverless
Automated database instantiation and auto scaling. Good for infrequent, intermittent or unpredictable workloads.
Convert RDS to Aurora
- Create an RDS snapshot and restore it to an Aurora instance
- Create an Aurora read replica from an RDS instance and promote it to an Aurora instance
S3 Requester Pays
The requester pays the cost of the request and the data download from the bucket.
S3 Durability
11 9s
Same for all storage classes
Which Cache supports complex data types?
Redis supports complex data types such as strings, hashes, lists, sets and bitmaps
When to use memcached?
- Simplest model possible
- Run large nodes with multiple cores or threads
- Cache objects, such as database query results
ElastiCache
Redis Cluster Mode
Scales your Redis cluster horizontally (adding or removing shards) across multiple AZs, with almost zero impact on performance. It partitions data across up to 90 shards.
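A sketch of how keys can be partitioned across shards. Redis Cluster maps each key to one of 16384 hash slots using CRC16 of the key modulo 16384. The even slot-to-shard split below is an assumption for illustration, the function names are hypothetical, and hash tags (`{...}`) are ignored to keep the example short.

```python
HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot for a key (hash tags are not handled here)."""
    return crc16_xmodem(key.encode("utf-8")) % HASH_SLOTS


def shard_for_slot(slot: int, shard_count: int) -> int:
    """Assumption: slots are split into equal contiguous ranges per shard."""
    return slot * shard_count // HASH_SLOTS


for key in ("user:1", "user:2", "cart:42"):
    slot = key_slot(key)
    print(key, slot, shard_for_slot(slot, shard_count=3))
```

Adding or removing shards then amounts to moving ranges of slots between shards rather than rehashing every key.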