Storage Flashcards
EFS Performance Modes
Max I/O
General Purpose
Which AWS services let you share a file system across multiple EC2 instances?
Amazon EFS
Amazon FSx for Windows File Server
Amazon FSx for Lustre
How long does it take to access data in
a. S3 Glacier
b. S3 Glacier Deep Archive?
a. A few minutes to hours
b. Within 12 hours
Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
Name typical use cases
S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA
good choice for storing secondary backup copies of on-premises data or easily re-creatable data
Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering)
use when data access patterns change, without performance impact or operational overhead
S3 Intelligent-Tiering monitors access patterns and then moves objects that have not been accessed in 30 consecutive days to the Infrequent Access tier.
Once you have activated one or both of the Archive Access tiers, S3 Intelligent-Tiering will move objects that haven’t been accessed for 90 consecutive days to the Archive Access tier and then after 180 consecutive days of no access to the Deep Archive Access tier.
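The tiering thresholds above can be sketched as a small lookup. This is only an illustration: the function name `intelligent_tiering_tier` and the single `archive_tiers_enabled` flag are hypothetical, and the sketch assumes both Archive Access tiers are activated together (S3 lets you activate one or both).

```python
def intelligent_tiering_tier(days_since_last_access: int,
                             archive_tiers_enabled: bool = False) -> str:
    """Return the access tier an object would sit in, per the 30/90/180-day rules."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    # The archive tiers only apply once they have been activated (assumption: both at once).
    if archive_tiers_enabled and days_since_last_access >= 180:
        return "Deep Archive Access"
    if archive_tiers_enabled and days_since_last_access >= 90:
        return "Archive Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"
```

Without the archive tiers activated, an object that has not been read for 200 days stays in the Infrequent Access tier.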
List the S3 storage classes in order from hottest (most frequently accessed) to coldest (rarely accessed), which is also roughly highest to lowest storage cost.
- S3 Standard
- S3 Intelligent-Tiering
- S3 Standard-Infrequent Access (S3 Standard-IA)
- S3 One Zone-Infrequent Access (S3 One Zone-IA)
- S3 Glacier
- S3 Glacier Deep Archive
AWS Storage Gateway - Tape Gateway
AWS Storage Gateway - Tape Gateway allows moving tape backups to the cloud.
AWS Storage Gateway - Volume Gateway
You can configure the AWS Storage Gateway service as a Volume Gateway to present cloud-based iSCSI block storage volumes to your on-premises applications.
Volume Gateway stores and manages on-premises data in Amazon S3 on your behalf and operates in either cache mode or stored mode.
AWS Storage Gateway - Cached Volume Gateway mode
primary data is stored in Amazon S3, while retaining your frequently accessed data locally in the cache for low latency access
AWS Storage Gateway - File Gateway
AWS Storage Gateway’s file interface, or file gateway, offers you a seamless way to connect to the cloud in order to store application data files and backup images as durable objects on Amazon S3 cloud storage. File gateway offers SMB or NFS-based access to data in Amazon S3 with local caching
AWS Storage Gateway
The service provides three different types of gateways – Tape Gateway, File Gateway, and Volume Gateway – that seamlessly connect on-premises applications to cloud storage, caching data locally for low-latency access.
Supported S3 lifecycle transitions
Waterfall approach with the following sequence: S3 Standard, S3 Standard-IA, S3 Intelligent-Tiering, S3 One Zone-IA, S3 Glacier, and S3 Glacier Deep Archive
The S3 Standard storage class to any other storage class.
Any storage class to the S3 Glacier or S3 Glacier Deep Archive storage classes.
The S3 Standard-IA storage class to the S3 Intelligent-Tiering or S3 One Zone-IA storage classes.
The S3 Intelligent-Tiering storage class to the S3 One Zone-IA storage class.
The S3 Glacier storage class to the S3 Glacier Deep Archive storage class.
No storage class can be transitioned to the Reduced Redundancy Storage (RRS) class
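One way to remember the transition rules on this card is that every listed transition moves downward in the waterfall. A minimal sketch, assuming only the six classes named on this card; `WATERFALL` and `can_transition` are hypothetical names, and the strings are the storage class identifiers used by the S3 API.

```python
# Waterfall order from this card, hottest first.
WATERFALL = [
    "STANDARD",
    "STANDARD_IA",
    "INTELLIGENT_TIERING",
    "ONEZONE_IA",
    "GLACIER",
    "DEEP_ARCHIVE",
]


def can_transition(src: str, dst: str) -> bool:
    """A lifecycle transition is allowed only if dst sits below src in the waterfall."""
    # list.index raises ValueError for classes outside this card's list.
    return WATERFALL.index(dst) > WATERFALL.index(src)
```

For example, `can_transition("STANDARD_IA", "ONEZONE_IA")` is true, while a move from `GLACIER` back to `STANDARD` is not.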
S3 Glacier Deep Archive
Use for archiving data that rarely needs to be accessed. Minimum storage duration period of 180 days and a default retrieval time of 12 hours. If you delete, overwrite, or transition an object to a different storage class before the 180-day minimum, you are charged for 180 days.
S3 Glacier
Use for archives where portions of the data might need to be retrieved in minutes. Data stored in the S3 Glacier storage class has a minimum storage duration period of 90 days and can be accessed in as little as 1-5 minutes using expedited retrieval. If you delete, overwrite, or transition an object to a different storage class before the 90-day minimum, you are charged for 90 days.
- Expedited retrievals are typically made available within 1-5 minutes.
- Standard retrievals typically complete within 3-5 hours.
- Bulk retrievals typically complete within 5-12 hours.
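The minimum storage duration charge is simple arithmetic: you are billed for the larger of the days actually stored and the class minimum. A small sketch, with `MIN_STORAGE_DAYS` and `billable_days` as hypothetical names:

```python
# Minimum storage durations stated on the cards above.
MIN_STORAGE_DAYS = {
    "S3 Glacier": 90,
    "S3 Glacier Deep Archive": 180,
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days charged when an object is deleted, overwritten, or transitioned."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])
```

Deleting an S3 Glacier object after 30 days is therefore still billed as 90 days of storage.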
S3 IA, S3 One Zone IA, Typical Use Cases
S3 Standard-IA and S3 One Zone-IA storage classes are designed for long-lived and infrequently accessed data. (IA stands for infrequent access.) S3 Standard-IA and S3 One Zone-IA objects are available for millisecond access (similar to the S3 Standard storage class). Amazon S3 charges a retrieval fee for these objects, so they are most suitable for infrequently accessed data.
S3 Standard-IA — Use for your primary or only copy of data that can’t be re-created.
S3 One Zone-IA — Use if you can re-create the data if the Availability Zone fails, and for object replicas when setting S3 Cross-Region Replication (CRR).
S3 Reduced Redundancy
What happens if data is lost?
What is the annual expected loss percentage/decimal?
The Reduced Redundancy Storage (RRS) storage class is designed for noncritical, reproducible data that can be stored with less redundancy than the S3 Standard storage class.
For durability, RRS objects have an average annual expected loss of 0.01 percent of objects. If an RRS object is lost, when requests are made to that object, Amazon S3 returns a 405 error.
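To put the 0.01 percent figure in perspective, here is the expected-loss arithmetic as a short sketch. The names are hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# 0.01 percent of objects per year, written as an exact fraction (1 in 10,000).
RRS_ANNUAL_LOSS_RATE = Fraction(1, 10_000)


def expected_annual_loss(object_count: int) -> Fraction:
    """Average number of RRS objects expected to be lost in one year."""
    return object_count * RRS_ANNUAL_LOSS_RATE
```

Storing 10,000 objects in RRS corresponds to an average expected loss of one object per year.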
S3 Standard
The default storage class. If you don’t specify the storage class when you upload an object, Amazon S3 assigns the S3 Standard storage class.
S3 Intelligent-Tiering storage class
designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. It works by storing objects in two access tiers: one tier that is optimized for frequent access and another lower-cost tier that is optimized for infrequent access. The minimum storage duration charge is 30 days
AWS Snowmobile
Exabyte-scale data transfer using a tractor-trailer
Transfers up to 100 PB per Snowmobile
Enhanced security features - GPS tracking, alarm, 24/7 video surveillance, encryption etc.
AWS Snowball & AWS Snowball Edge Optimized
Petabyte-scale transport appliance that is shipped to you so you can attach it to your local network and transfer files directly to it.
The device is then shipped back to Amazon and the data is transferred into S3. The E Ink shipping label updates automatically, and the job can be tracked via Amazon SNS or the console.
Edge Optimized - ideal for transfer scenarios that require additional compute in remote, disconnected or harsh environments
AWS S3 Acceleration - what is it and when do you use it
Allows fast and easy data transfer into S3 by using CloudFront’s edge locations, where the data is routed to S3 over optimized network paths
Use it when you:
- Have customers all over the world uploading to a central bucket
- Transfer gigabytes or terabytes of data across continents
- Underutilize the available bandwidth when uploading to S3 over the internet
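Clients use Transfer Acceleration by sending requests to the bucket's accelerate endpoint instead of the regular one. A minimal sketch of building that endpoint; the helper name `accelerate_endpoint` is hypothetical. The period check reflects the S3 requirement that accelerated bucket names contain no periods.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Build the Transfer Acceleration endpoint URL for a bucket."""
    # Buckets used with Transfer Acceleration cannot have periods in their names.
    if "." in bucket:
        raise ValueError("bucket names with periods cannot use Transfer Acceleration")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"
```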
AWS Transfer for SFTP
Fully managed, highly available SFTP service that enables applications to transfer files over SFTP directly to S3
You create a server, set up user accounts, and associate the server with one or more S3 buckets
AWS DataSync
How is the service billed?
Online data transfer service that simplifies, automates, and accelerates copying large amounts of data between on-premises storage systems and AWS Storage services, as well as between AWS Storage services.
DataSync can copy data between Network File System (NFS), Server Message Block (SMB) file servers, Hadoop Distributed File Systems (HDFS), self-managed object storage, AWS Snowcone, Amazon Simple Storage Service (Amazon S3) buckets, Amazon Elastic File System (Amazon EFS) file systems, and Amazon FSx for Windows File Server file systems.
Only pay for the data you copy
AWS S3 CORS
How is it configured?
Allows configuration of your bucket to allow cross origin requests by defining:
- Origins that you allow to access bucket
- The HTTP methods supported for each origin
- Other operation-specific information (e.g. allowed headers, max age, etc.)
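A CORS configuration is a list of rules with exactly these fields. The sketch below shows the structure as a Python dict in the shape accepted by boto3's `put_bucket_cors(Bucket=..., CORSConfiguration=...)`. The origin, methods, and max age are placeholder values, and `is_request_allowed` is a hypothetical, simplified matcher (real S3 also supports wildcards in origins).

```python
# Placeholder rule: which origins, which methods, plus operation-specific settings.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET", "PUT"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }
    ]
}


def is_request_allowed(config: dict, origin: str, method: str) -> bool:
    """Simplified check: exact origin match and method listed in the same rule."""
    for rule in config["CORSRules"]:
        if origin in rule["AllowedOrigins"] and method in rule["AllowedMethods"]:
            return True
    return False
```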
S3 versioning
Allows you to recover objects from accidental deletion or overwrite
S3 Object Lock
S3 Object Lock for data retention or protection
Use a retention period to lock an object for a fixed time, or a legal hold to lock it until the hold is explicitly removed
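The difference between the two lock types can be sketched as a predicate: a retention period expires on its own, while a legal hold has no expiry. The function `is_locked` and its parameters are hypothetical and only model the rule described on this card.

```python
from datetime import datetime, timezone
from typing import Optional


def is_locked(now: datetime,
              retain_until: Optional[datetime],
              legal_hold: bool) -> bool:
    """True if the object version is still protected from deletion or overwrite."""
    # A legal hold applies until someone explicitly removes it.
    if legal_hold:
        return True
    # A retention period protects the object only until its end date.
    return retain_until is not None and now < retain_until
```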
S3 Block Public Access - What does this restrict
Enabled by default
- Block new public ACLs and uploading public objects
- Remove public access granted through public ACLs
- Block new public bucket policies
- Block public and cross-account access to buckets that have public policies
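The four restrictions above map one-to-one onto four boolean settings. The dict below uses the key names from the S3 API (the shape boto3's `put_public_access_block` takes as `PublicAccessBlockConfiguration`); the helper `fully_blocked` is a hypothetical convenience.

```python
# One flag per restriction, in the same order as the list above.
public_access_block = {
    "BlockPublicAcls": True,        # block new public ACLs and public object uploads
    "IgnorePublicAcls": True,       # ignore access granted through existing public ACLs
    "BlockPublicPolicy": True,      # block new public bucket policies
    "RestrictPublicBuckets": True,  # restrict public and cross-account access
}


def fully_blocked(config: dict) -> bool:
    """True only when all four Block Public Access settings are on."""
    return all(config.values())
```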
AWS S3 Default Permissions
By default, only the resource owner (the AWS account that created the resource) can access S3 resources: buckets, objects, and sub-resources
Amazon S3
Object-level storage - change in part of the file requires the whole file to be re-uploaded
Object size limit: 5TB
Stored redundantly across multiple facilities
Supports event notifications that can be sent to you or trigger other processes (e.g. Lambda)
What does EBS optimization, optimize?
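Dedicated network throughput between the EC2 instance and EBS, which minimizes contention between EBS I/O and other instance traffic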
List Amazon EBS SSD Volumes in order of performance
io2 Block Express
io1 and io2
gp3 and gp2
Which EBS volume types support Multi-Attach?
io2 and io1
What is the requirement for EBS volumes with respect to the instance(s) they are attached to?
Must be in the same AZ
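The attachment rules from the last two cards can be combined into one check. This is a simplified sketch with hypothetical names; it ignores other real-world constraints such as instance type support and attachment limits.

```python
# Volume types that support Multi-Attach, per the card above.
MULTI_ATTACH_TYPES = {"io1", "io2"}


def can_attach(volume_az: str, volume_type: str, instance_az: str,
               existing_attachments: int = 0) -> bool:
    """Check the same-AZ rule, and Multi-Attach support for already-attached volumes."""
    if volume_az != instance_az:
        return False
    if existing_attachments > 0 and volume_type not in MULTI_ATTACH_TYPES:
        return False
    return True
```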
Termination Protection (EBS)
Is this given by default?
What property governs if it is enabled or disabled?
keeps the volume/data when the instance is terminated
turned off by default
Modify ‘DeleteOnTermination’
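The `DeleteOnTermination` flag lives in the instance's block device mapping. Below is a sketch of one mapping entry in the shape the EC2 API uses (for example in boto3's `run_instances(BlockDeviceMappings=[...])`). The helper name and the device name used in the usage note are assumptions for illustration.

```python
def block_device_mapping(device_name: str, keep_on_termination: bool) -> dict:
    """Build one block device mapping entry that keeps or deletes the volume."""
    return {
        "DeviceName": device_name,
        # Keeping the volume means DeleteOnTermination must be False.
        "Ebs": {"DeleteOnTermination": not keep_on_termination},
    }
```

For example, `block_device_mapping("/dev/xvda", keep_on_termination=True)` yields an entry with `DeleteOnTermination` set to `False`, so the volume outlives the instance.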
What type of EBS Volumes can’t be a boot volume?
Throughput Optimized HDD (st1) and Cold HDD (sc1)
List the EBS volume families and the volume types in each
SSD
- General Purpose SSD - gp2, gp3
- Provisioned IOPS SSD - io1, io2, io2 Block Express
Throughput Optimized HDD - st1
Cold HDD - sc1
Previous Generation - standard
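The same catalog can be written as a lookup table, which is a handy form for quizzing yourself. `EBS_VOLUME_TYPES` and `family_of` are hypothetical names; the contents are taken from the list above.

```python
# Family name -> volume types, as listed on this card.
EBS_VOLUME_TYPES = {
    "General Purpose SSD": ["gp2", "gp3"],
    "Provisioned IOPS SSD": ["io1", "io2", "io2 Block Express"],
    "Throughput Optimized HDD": ["st1"],
    "Cold HDD": ["sc1"],
    "Previous Generation": ["standard"],
}


def family_of(volume_type: str) -> str:
    """Return the family a volume type belongs to."""
    for family, types in EBS_VOLUME_TYPES.items():
        if volume_type in types:
            return family
    raise KeyError(f"unknown EBS volume type: {volume_type}")
```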
How is EFS billed?
Pay only for resources used
SSE-S3
When you use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), each object is encrypted with a unique key. As an additional safeguard, Amazon S3 encrypts the key itself with a root key that it regularly rotates. SSE-S3 uses one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES-256), to encrypt your data
SSE-KMS
Server-Side Encryption with AWS KMS keys (SSE-KMS) is similar to SSE-S3, but with some additional benefits and charges for using this service. There are separate permissions for the use of a KMS key that provides added protection against unauthorized access of your objects in Amazon S3. SSE-KMS also provides you with an audit trail that shows when your KMS key was used and by whom. Additionally, you can create and manage customer managed keys or use AWS managed keys that are unique to you, your service, and your Region
SSE-C
Server-Side Encryption with Customer-Provided Keys (SSE-C): you manage the keys, and S3 manages encryption and decryption
What are the 4 options for encrypting objects in S3? (The three SSE options are mutually exclusive.)
SSE-S3
SSE-KMS
SSE-C
Client side encryption
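Each option translates into different upload parameters. The sketch below returns the extra keyword arguments one would pass to boto3's `put_object` for each choice. The function name `sse_put_object_args` is hypothetical, and the key values in any call are placeholders you would supply yourself.

```python
from typing import Optional


def sse_put_object_args(option: str,
                        kms_key_id: Optional[str] = None,
                        customer_key: Optional[bytes] = None) -> dict:
    """Extra put_object arguments for the chosen encryption option."""
    if option == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if option == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            args["SSEKMSKeyId"] = kms_key_id  # omit to use the AWS managed key
        return args
    if option == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    if option == "client-side":
        # Data is encrypted before upload, so no server-side arguments are needed.
        return {}
    raise ValueError(f"unknown encryption option: {option}")
```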
When should you use snowmobile vs snowball?
Use Snowball for less than 10 PB, or when the data is distributed across multiple locations; you will need multiple devices for more than 80 TB
Use Snowmobile for more than 10 PB in a single location
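The rule of thumb can be sketched as a small chooser. The names are hypothetical, the sketch assumes decimal units (1 PB = 1000 TB), and it treats exactly 10 PB as a Snowball job, which the card leaves unspecified.

```python
import math

SNOWBALL_CAPACITY_TB = 80       # per-device figure from this card
SNOWMOBILE_THRESHOLD_TB = 10_000  # 10 PB, assuming 1 PB = 1000 TB


def recommend_transfer(data_tb: float, single_location: bool = True) -> tuple:
    """Return (service, device_count) following the card's rule of thumb."""
    if data_tb > SNOWMOBILE_THRESHOLD_TB and single_location:
        return ("Snowmobile", 1)
    # Otherwise use Snowball, with one device per 80 TB (rounded up).
    devices = max(1, math.ceil(data_tb / SNOWBALL_CAPACITY_TB))
    return ("Snowball", devices)
```

A 200 TB migration, for instance, comes out as three Snowball devices.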
Amazon FSx for Windows File Server
Amazon FSx for Windows File Server provides fully managed, highly reliable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol.
It is built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restore, and Microsoft Active Directory (AD) integration. Amazon FSx supports the use of Microsoft’s Distributed File System (DFS) to organize shares into a single folder structure up to hundreds of PB in size.
Amazon FSx for Lustre
Typical use case
Amazon FSx for Lustre provides a high-performance file system optimized for fast processing of workloads such as machine learning, high-performance computing (HPC), video processing, financial modeling, and electronic design automation (EDA).
FSx for Lustre is compatible with the most popular Linux-based AMIs
FSx for Lustre file systems can also be linked to Amazon S3 buckets, allowing you to access and process data concurrently from both a high-performance file system and from the S3 API.
Amazon EFS
Amazon Elastic File System (Amazon EFS) automatically grows and shrinks as you add and remove files with no need for management or provisioning
provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources
What can mount EFS file systems?
EC2
ECS, EKS, Fargate
Lambda
On Prem Servers
Scope of EFS
Regional service
Amazon S3 Transfer Acceleration
enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket using Amazon CloudFront’s globally distributed edge locations
EC2 Instance Store
Ideal use case
provides temporary block-level storage for your instance located on disks that are physically attached to the host computer
ideal for temporary storage of information that changes frequently
Instance Store Limitations
specify instance store volumes for an instance only when you launch the EC2 instance
can’t detach an instance store volume from one instance and attach it to a different instance
data in an instance store persists only during the lifetime of its associated instance
What events render data in the instance store lost forever
The underlying disk drive fails
The instance stops
The instance hibernates
The instance terminates
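As a quick self-check, the events can be encoded in a lookup. The names are hypothetical. The `reboot` entry is not on this card; it is included because instance store data does persist across a reboot, which makes a useful contrast.

```python
# Events after which instance store data is gone, per this card.
DATA_LOST_EVENTS = {"disk_failure", "stop", "hibernate", "terminate"}
# Reboot is added for contrast (not on the card): data survives it.
KNOWN_EVENTS = DATA_LOST_EVENTS | {"reboot"}


def instance_store_data_survives(event: str) -> bool:
    """True if instance store data is still there after the event."""
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unknown event: {event}")
    return event not in DATA_LOST_EVENTS
```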
Amazon FSx
Deployment options
OS Support
cost effective high-performance file systems in the cloud
fully managed service, it handles hardware provisioning, patching, and backups
Single-AZ or Multi-AZ deployment options based on your high availability requirements
offer connectivity to Linux, Windows, and macOS users and applications
AWS Storage Gateway - Stored Volume Gateway mode
Primary data is stored locally and your entire dataset is available for low latency access on premises while also asynchronously getting backed up to Amazon S3
io2
EBS Provisioned IOPS SSD (io2)
newer generation of Provisioned IOPS SSD volumes, designed for higher durability and more IOPS per GiB than io1 at the same price
io2 Block Express
EBS Provisioned IOPS SSD (io2 Block Express)
offers the highest performance block storage in the cloud
higher throughput, IOPS, and capacity than io2 volumes, along with sub-millisecond latency
purpose-built to meet the performance and latency requirements of the most demanding applications
io1
EBS Provisioned IOPS SSD (io1)
backed by solid-state drives (SSDs) and is a high performance EBS storage option designed for critical, I/O intensive database and application workloads, as well as throughput-intensive database and data warehouse workloads
gp3
EBS General Purpose SSD (gp3)
ideal for a wide variety of applications that require high performance at low cost, including virtual desktops, medium sized single instance databases, low-latency interactive apps, dev & test, boot volumes
gp2
EBS General Purpose SSD (gp2)
default EBS volume type for Amazon EC2 instances
backed by solid-state drives (SSDs) and are suitable for a broad range of transactional workloads, including dev/test environments, low-latency interactive applications, and boot volumes
st1
Throughput Optimized HDD (st1)
backed by hard disk drives (HDDs) and is ideal for frequently accessed, throughput-intensive workloads with large datasets and large I/O sizes, such as MapReduce, Kafka, log processing, data warehouse, and ETL workloads
sc1
Cold HDD (sc1)
backed by hard disk drives (HDDs) and provides the lowest cost per GB of all EBS volume types
ideal for less frequently accessed workloads with large, cold datasets