Module 5 - Storage and Databases Flashcards
Instance stores
Provides temporary block-level storage for Amazon EC2 instance.
disk storage physically attached to host computer for EC2 instance - has same lifespan as instance
instance terminated - you lose any data in instance store
Amazon Elastic Block Store (Amazon EBS)
service that provides block-level storage volumes that you can use with Amazon EC2 instances.
o Stop/terminate Amazon EC2 instance, all data on attached EBS volume remains available.
o Create EBS volume – define configuration e.g. volume size and type and provision it.
o After you create EBS volume, can attach to Amazon EC2 instance.
o EBS volumes for data that needs to persist – important to back up data.
o Can take incremental backups of EBS volumes by creating Amazon EBS snapshots.
Amazon EBS Snapshots
o First backup taken of a volume copies all data.
o Subsequent backups – only blocks of data that have changed since most recent snapshot saved.
Incremental backups different from full backups
o Full backup – all data in a storage volume is copied each time a backup occurs.
o Full backup includes data that has not changed since most recent backup.
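The difference can be sketched in a few lines of Python. This is a toy model only: the volume is a dict of block number to contents, and the function names are hypothetical, not how Amazon EBS implements snapshots.

```python
def full_backup(volume):
    """Copy every block, whether or not it changed."""
    return dict(volume)


def incremental_backup(volume, last_state):
    """Copy only blocks that are new or differ from the last backed-up state."""
    return {n: data for n, data in volume.items() if last_state.get(n) != data}


volume = {0: "boot", 1: "logs-v1", 2: "photos"}

# First backup: nothing has been saved yet, so every block is copied.
first = incremental_backup(volume, {})

# One block changes; the next incremental backup copies only that block.
volume[1] = "logs-v2"
second = incremental_backup(volume, first)

# A full backup copies all three blocks again, including unchanged ones.
full = full_backup(volume)

print(len(first), len(second), len(full))
```

The sketch ignores deleted blocks and compares only against the first backup; a longer chain would need the merged state of all earlier backups.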
Object storage
- Each object consists of data, metadata, and a key.
- Data might be image, video, text document, or any other type of file.
- Metadata contains information about what data is, how it is used, object size, and so on.
- Object’s key is its unique identifier.
- When you modify a file in block storage, only the pieces that are changed are updated.
o When a file in object storage is modified, the entire object is updated.
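A minimal sketch of the object model described above, assuming a plain dict as a stand-in for an object store. `StoredObject`, `put_object`, and `bucket` are hypothetical names for illustration, not an Amazon S3 API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredObject:
    key: str                                      # unique identifier
    data: bytes                                   # the file contents
    metadata: dict = field(default_factory=dict)  # information about the data


bucket = {}  # hypothetical object store: key -> object


def put_object(key, data, **metadata):
    """Store an object; an existing object with the same key is replaced whole."""
    metadata["size"] = len(data)
    bucket[key] = StoredObject(key, data, metadata)
    return bucket[key]


put_object("notes/module5.txt", b"Instance stores", content_type="text/plain")
old = bucket["notes/module5.txt"]

# Modifying the file means writing a complete new object under the same key.
put_object("notes/module5.txt", b"Instance stores and EBS", content_type="text/plain")
new = bucket["notes/module5.txt"]

# Block storage, by contrast, can rewrite just the block that changed.
blocks = [b"AAAA", b"BBBB", b"CCCC"]
blocks[1] = b"bbbb"
```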
Amazon Simple Storage Service (Amazon S3)
- Service that provides object-level storage.
- Amazon S3 stores data as objects in buckets.
- Can upload any type of file to Amazon S3 e.g. images, videos, text files etc.
- Offers unlimited storage space.
- Maximum file size for object in Amazon S3 is 5 TB.
- Upload file to Amazon S3 – can set permissions to control visibility and access to it.
- Amazon S3 versioning feature to track changes to objects over time.
Amazon S3 Storage Classes
S3 Standard
S3 Standard-Infrequent Access (S3 Standard-IA)
S3 One Zone-Infrequent Access (S3 One Zone-IA)
S3 Intelligent-Tiering
S3 Glacier
S3 Glacier Deep Archive
S3 Standard
- Designed for frequently accessed data.
- Stores data in a minimum of three Availability Zones.
- Provides high availability for objects.
- Good choice for wide range of use cases – websites, content distribution, data analytics.
- Higher cost than other storage classes intended for infrequently accessed data and archival storage.
S3 Standard-Infrequent Access (S3 Standard-IA)
- Ideal for data that is infrequently accessed but requires high availability when needed.
- Similar to S3 Standard but has lower storage price and higher retrieval price.
- S3 Standard and S3 Standard-IA store data in minimum of three Availability Zones.
- S3 Standard-IA provides same level of availability as S3 Standard but with lower storage price and higher retrieval price.
S3 One Zone-Infrequent Access (S3 One Zone-IA)
- Stores data in single Availability Zone.
- Has lower storage price than S3 Standard-IA.
• Good storage class to consider if following conditions apply:
o You want to save costs on storage.
o You can easily reproduce your data in the event of an Availability Zone failure.
S3 Intelligent-Tiering
- Ideal for data with unknown or changing access patterns.
- Requires small monthly monitoring and automation fee per object.
- Amazon S3 monitors objects’ access patterns.
- Haven’t accessed object for 30 consecutive days – Amazon S3 automatically moves it to infrequent access tier: S3 Standard-IA.
- If you access object in infrequent access tier – Amazon S3 automatically moves it to frequent access tier: S3 Standard.
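The tiering rule above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function `tier_for` and the tier labels are hypothetical, and the only rule modelled is the 30-consecutive-day threshold from these notes.

```python
from datetime import date, timedelta

INFREQUENT_AFTER_DAYS = 30  # threshold stated in the notes


def tier_for(last_accessed, today):
    """Return the tier an object would sit in, given its last access date."""
    idle_days = (today - last_accessed).days
    return "infrequent" if idle_days >= INFREQUENT_AFTER_DAYS else "frequent"


today = date(2024, 3, 31)
print(tier_for(today - timedelta(days=45), today))  # idle for 45 days
print(tier_for(today, today))                       # accessed today
```

Accessing an object resets its last-access date, which is what moves it back to the frequent access tier in this model.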
S3 Glacier
- Low-cost storage designed for data archiving.
- Able to retrieve objects within few minutes to hours.
- E.g. store archived customer records, older photos and video files.
S3 Glacier Deep Archive
• Lowest-cost object storage class ideal for archiving.
• Able to retrieve objects within 12 hours.
o Deciding between Amazon S3 Glacier and Amazon S3 Glacier Deep Archive – consider how quickly you need to retrieve archived objects.
File Storage
- Multiple clients e.g. users, applications, servers etc. can access data stored in shared file folders.
- Storage server uses block storage with local file system to organise files.
- Clients access data through file paths.
- File storage ideal for use cases in which large number of services and resources need to access same data at same time.
Amazon Relational Database Service (Amazon RDS)
• Service that enables you to run relational databases in AWS Cloud.
• Managed service that automates tasks e.g. hardware provisioning, database setup, patching, backups.
o Spend less time completing administrative tasks and more time using data to innovate your applications.
• Can integrate Amazon RDS with other services to fulfil business and operational needs.
o E.g. AWS Lambda to query database from serverless application.
Amazon Elastic File System (Amazon EFS)
scalable file system used with AWS Cloud services and on-premises resources.
• Add/remove files – Amazon EFS grows and shrinks automatically.
o Can scale on demand to petabytes without disrupting applications.
Relational Databases
- Data stored in way that relates it to other pieces of data.
- Use structured query language (SQL) to store and query data.
- Approach allows data to be stored in easily understandable, consistent, scalable way.
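A short illustration of related data queried with SQL. SQLite is used here only because it ships with Python; it is not one of the Amazon RDS engines, and the `customers` and `orders` tables are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        item TEXT
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'coffee'), (11, 1, 'tea'), (12, 2, 'mug');
""")

# Each order relates to a customer through customer_id; a JOIN follows that link.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)
```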
Amazon RDS Database Engines (optimised for memory, performance, input/output (I/O))
Supported database engines:
- Amazon Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle Database
- Microsoft SQL Server
Amazon Aurora
• Enterprise-class relational database.
• Compatible with MySQL and PostgreSQL relational databases.
• Up to five times faster than standard MySQL databases.
• Up to three times faster than standard PostgreSQL databases.
• Helps reduce database costs by reducing unnecessary input/output (I/O) operations.
o Does so while ensuring database resources remain reliable and available.
• Consider if workloads require high availability.
o Replicates six copies of data across three Availability Zones.
o Continuously backs up data to Amazon S3.
Amazon DynamoDB
- Key-value database service.
- Delivers single-digit millisecond performance at any scale.
• Serverless – do not have to provision, patch or manage servers.
o Do not have to install, maintain or operate software.
- Automatic scaling:
• Automatically adjust for changes in capacity while maintaining consistent performance.
o Suitable choice for use cases that require high performance while scaling.
Nonrelational Databases
• Create tables.
o Table is place where you can store and query data.
• Sometimes referred to as “NoSQL databases” because they use structures other than rows and columns to organise data.
• One type of structural approach: key-value pairs.
o Data organised into items (keys) and items have attributes (values).
o Think of attributes as different features of your data.
o Can add/remove attributes from items in table at any time.
o Not every item in table has to have same attributes.
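A minimal sketch of the key-value idea, using a Python dict as a stand-in for a table. The `table` and `put_item` names and the sample items are hypothetical and are not the Amazon DynamoDB API.

```python
table = {}  # hypothetical table: item key -> attributes


def put_item(key, **attributes):
    """Store an item under its key with whatever attributes are supplied."""
    table[key] = dict(attributes)


# Two items in the same table with different sets of attributes.
put_item("user-1", name="John Doe", address="123 Any Street", favourite_drink="latte")
put_item("user-2", name="Mary Major", birthday="1994-07-01")

# Attributes can be added or removed at any time.
table["user-2"]["address"] = "100 Main Street"
del table["user-1"]["favourite_drink"]

print(sorted(table["user-1"]))
print(sorted(table["user-2"]))
```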
Amazon Redshift
- Data warehousing service that you can use for big data analytics.
- Offers ability to collect data from many sources.
- Helps you to understand relationships and trends across your data.
AWS Database Migration Service (AWS DMS)
• Enables you to migrate relational databases, nonrelational databases and other types of data stores.
• Move data between source database and target database.
o Source and target databases can be of same type or different types.
o During migration – source database remains operational, reducing downtime for any applications that rely on database.
Other Use Cases for AWS DMS
• Development and test database migrations.
o Enabling developers to test applications against production data without affecting production users.
• Database consolidation.
o Combining several databases into a single database.
• Continuous replication.
o Sending ongoing copies of your data to other target sources instead of doing a one-time migration.
Additional Database Services
Amazon DocumentDB
Amazon Neptune
Amazon Quantum Ledger Database (Amazon QLDB)
Amazon Managed Blockchain
Amazon ElastiCache
Amazon DynamoDB Accelerator
Amazon DocumentDB
o Document database service that supports MongoDB workloads.
o MongoDB – document database program.
Amazon Neptune
o Graph database service.
o Use to build and run applications that work with highly connected datasets.
o E.g. recommendation engines, fraud detection, knowledge graphs.
Amazon Quantum Ledger Database (Amazon QLDB)
o Ledger database service.
o Can use to review complete history of all the changes that have been made to your application data.
Amazon Managed Blockchain
o Service used to create and manage blockchain networks with open-source frameworks.
o Blockchain – distributed ledger system that lets multiple parties run transactions and share data without central authority.
Amazon ElastiCache
o Service that adds caching layers on top of your databases to help improve read times of common requests.
o Supports two types of data stores: Redis and Memcached.
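The idea of a caching layer in front of a database can be sketched as a cache-aside read. This is an illustration under stated assumptions: `database`, `cache`, `db_read`, and `get` are hypothetical in-memory stand-ins, not Redis, Memcached, or an Amazon ElastiCache client.

```python
database = {"product-1": "Coffee mug"}  # hypothetical backing database
cache = {}                              # hypothetical caching layer
stats = {"db_reads": 0}                 # counts real database queries


def db_read(key):
    """Pretend database query; records that the database was hit."""
    stats["db_reads"] += 1
    return database[key]


def get(key):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    if key not in cache:
        cache[key] = db_read(key)
    return cache[key]


first = get("product-1")   # miss: goes to the database and fills the cache
second = get("product-1")  # hit: served from the cache
print(first, second, stats["db_reads"])
```

The sketch leaves out expiry and invalidation, which a real cache needs so that it does not serve stale data after the database changes.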
Amazon DynamoDB Accelerator
o In-memory cache for DynamoDB.
o Helps improve response times from single-digit milliseconds to microseconds.