Storage Flashcards
What is an AWS Hybrid Storage Service? What are some examples?
Hybrid Storage Services are services that connect on-prem storage, applications and workloads with the AWS Cloud. Some examples are:
- AWS Storage Gateway
- AWS Outposts
- Amazon File Cache
Describe the function of each of the following AWS Services:
- AWS Storage Gateway
- AWS Outposts
- Amazon File Cache
- AWS Storage Gateway: Connects on-prem storage with storage services in the cloud. Also provides connection management and caching
- AWS Outposts: Functions as an extension of the AWS Cloud inside your on-prem data center with connections to your on-prem network and systems
- Amazon File Cache: Provides a high-speed cache on AWS that serves as temporary, high-performance storage for data on premises or in AWS
Complete the following statements:
- __________ functions as an extension of the AWS Cloud inside your on-prem data center with connections to your on-prem network and systems
- ___________ provides a high-speed cache on AWS that serves as temporary, high-performance storage for data on premises or in AWS
- ____________ connects on-prem storage with storage services in the cloud. Also provides connection management and caching
- AWS Outposts
- Amazon File Cache
- AWS Storage Gateway
Describe the expected function of each of the following AWS Services:
- AWS Storage Gateway
- AWS Outposts
- Amazon File Cache
- On-premises gateway to AWS Cloud
- On-premises AWS Storage
- In-cloud caching of on-premises data
What are the expected use cases of AWS Storage Gateway?
- Move backups and archives to the cloud
- Reduce on-prem storage with cloud-backed file shares
- Provide on-prem applications low-latency access to data stored in S3
- Provide data lake access for pre-processing and post-processing workflows
What are the expected use cases of AWS Outposts?
- Low-latency, local data processing needs for on-prem locations, such as retail stores, branch offices, healthcare provider locations, financial institutions and factory floors
- Access to cloud-native services while fulfilling data residency requirements
What are the expected use cases of Amazon File Cache?
- Burst visual effects rendering and transcoding workloads to AWS to meet peak compute needs during media production
- Accelerate high-performance computing cloud-bursting workloads
- Speed up access to your on-premises and in-cloud datasets
- Run advanced analytics on petabytes of on-prem data
What are the different types of storage gateway offered by AWS? Describe each of them.
- S3 File Gateway: Allows file storage on S3 buckets through both SMB and NFS protocols, while also performing local caching. Can access all types of S3 data with the exception of S3 Glacier
- FSx File Gateway: Allows on-prem access to Amazon FSx for Windows File Server using SMB protocol. Useful for file systems in general.
- Volume Gateway: Allows storage on S3 backed by EBS snapshots through the iSCSI protocol. It operates either in cached mode, where it maintains a local data cache while moving the data to Amazon S3 as its primary location, or in stored mode, where the primary storage location is the gateway and it is backed up on a schedule to S3
- Tape Gateway: Allows on-prem tape backups, written over iSCSI, to be stored in S3 as virtual tapes in a Virtual Tape Library (VTL)
How can you integrate Storage Gateway with Active Directory?
Using the SMB protocol, since it integrates with AD for user authentication
Declare True or False for the following statements:
- Tape Gateway by default stores data on S3 IA
- Storage Gateway uses a read-through write-back cache to store your data
- All data in transit is encrypted
- Pricing for Storage Gateway is calculated only based on the amount of data stored and the amount of data transferred out of AWS.
- For Storage Gateway to work, it is necessary to set up the gateway as either a hardware appliance or a VM on-prem, or on EC2.
- Whenever you create a storage gateway, it works for any AWS Region
-False, it uses S3 Glacier or Glacier Deep Archive
-True
-True
-False, it is based on the gateway type, the AWS storage used and either the actual amount of storage or the allocated amount of storage that you use
-True
-False, it works only for the region where it was setup
What do NFS and SMB stand for?
Network File System and Server Message Block
What AWS services are compatible with AWS Outposts?
- S3
- EBS
- ElastiCache
- RDS
- EMR
- EC2
- ECS
- EKS
- Application Load Balancer
What is the Snow family of services on AWS? What services make it up?
The Snow family is a group of services offered by AWS to transfer data from locations with limited internet bandwidth or to serve as storage at edge locations. It is made up of 3 services: Snowcone, Snowball Edge and Snowmobile
What are the differences between Snowcone, Snowball Edge and Snowmobile?
- Their storage capacities are different: Snowcone offers 8 TB (HDD) or 14 TB (SSD), Snowball Edge 80 TB and Snowmobile up to 100 PB
- Their migration capacities are different: Snowcone handles up to 24 TB, Snowball Edge petabytes and Snowmobile exabytes (a better fit than Snowball Edge above 10 PB)
- Snowcone can perform either online or offline migrations, while the others support only offline
- Snowcone comes with the DataSync agent preinstalled (which is what enables online transfer)
True or False: The best Snow service to use on ML on Edge Locations is generally Snowcone
False, it is Snowball Edge (better compute, GPU support, can be clustered)
What is AWS DataSync?
It is a service that allows you to move large amounts of data between AWS, other clouds, edge locations and on-prem while maintaining high performance and security
How does AWS DataSync pricing work?
You pay a flat amount per GB transferred
True or False: When using DataSync, you need to configure the DataSync agent on a VM beforehand regardless of the transfer type
False, you only need to configure a DataSync agent if you are not transferring from AWS to AWS
What are the possible destinations inside AWS used by DataSync?
-S3
-FSx
-EFS
What protocols does the AWS Transfer Family offer?
- File Transfer Protocol (FTP): Data transfer over TCP/IP
- File Transfer Protocol Secure (FTPS): Data transfer over TCP/IP using SSL/TLS encryption
- SSH File Transfer Protocol (SFTP): File transfer over SSH
- Applicability Statement 2 (AS2): Uses HTTPS to transfer messages, especially Electronic Data Interchange (EDI) messages
What is the AWS Transfer Family? Where can it send data to?
It is a group of services that allow you to transfer files into and out of S3 or EFS using SFTP, FTPS, FTP or AS2
When should you use DataSync over the Transfer Family?
DataSync is recommended for larger workloads and migrations while Transfer Family is recommended for more conventional application transfers
What is the cache type of Storage Gateway?
Least Recently Used (LRU)
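To illustrate what LRU eviction means, here is a toy read-through cache in Python. It is only a sketch of the policy: the class name, the `backend` dict standing in for cloud storage and the capacity of 2 are all hypothetical, and it says nothing about how Storage Gateway is implemented internally.

```python
from collections import OrderedDict


class LRUReadThroughCache:
    """Toy read-through cache that evicts the least recently used entry."""

    def __init__(self, backend, capacity):
        self.backend = backend    # stand-in for the cloud store (hypothetical)
        self.capacity = capacity  # max number of entries kept locally
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: read through to the backend, then cache the value.
        value = self.backend[key]
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry (the oldest one).
            self.entries.popitem(last=False)
        return value


backend = {"a": 1, "b": 2, "c": 3}
cache = LRUReadThroughCache(backend, capacity=2)
cache.get("a")
cache.get("b")
cache.get("a")  # "a" becomes the most recently used entry
cache.get("c")  # cache is full, so "b" (least recently used) is evicted
cached_keys = list(cache.entries)
```

After these four reads the cache holds "a" and "c": "b" was the entry that had gone longest without being touched.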
True or False: The only type of FSx that Storage Gateway can connect to is FSx for Windows File Server
True
True or False: FSx File Gateway needs site-to-site VPN or Direct Connect to your FSx to work on-prem
True
What kind of Storage Gateway Endpoints are there?
- Public Endpoints: Connection with Storage Gateway over the internet
- VPC Endpoints: Connection through a VPC Endpoint over a private connection on AWS
- Federal Information Processing Standards (FIPS) 140-2 compliant endpoints: Connection to a public endpoint over the internet that complies with FIPS standards, further protecting sensitive information for regulated workloads in AWS GovCloud (US) Regions
Explain the differences between the control plane connection and the data plane connection on FSx File Gateway
- The control plane connects the on-prem environment to the FSx File Gateway service and is used to manage it
- The data plane connects the on-prem environment to FSx for Windows File Server, and is the connection through which the data travels
How long does the FSx File Gateway cache take to refresh?
You can configure between 5 min and 30 days
True or False: FSx File Gateway can connect to FSx on other accounts
False, it can only connect to FSx on the same account
What kinds of S3 object are compatible with S3 File Gateway?
-Standard Access
-Standard IA
-One Zone IA
-Intelligent Tiering
True or False: You can create S3 File Gateway read replicas by creating new gateways connected to S3 buckets already on Storage Gateway and performing only read operations on them
True (might need object lock on S3)
True or False: to configure the S3 file gateway on an EC2 instance it is recommended to pass the necessary File Gateway image through EC2 metadata
False, there are already preconfigured EC2 AMIs ready to be used as gateways for this task
What are the S3 data storage types?
-Standard
-Infrequent Access (IA)
-Intelligent Tiering
-One-zone Infrequent Access
-Glacier Instant Retrieval
-Glacier Flexible Retrieval (Former Glacier)
-Glacier Deep Archive
Complete with the minimum time an object has to be stored in each S3 Storage Class before being deleted or transitioned to another class (you can delete it earlier, but you are billed for the full minimum duration):
-Standard :___________
-IA :___________
-Intelligent Tiering:___________
-One-Zone IA:___________
-Glacier Instant Retrieval:___________
-Glacier Flexible Retrieval:___________
-Glacier Deep Archive:___________
-0 Days
-30 Days
-0 Days
-30 Days
-90 Days
-90 Days
-180 Days
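The billing rule on this card can be sketched in a few lines of Python. This is only an illustration: the dictionary keys reuse the storage class identifiers from the S3 API, the day counts are the ones from the answer above, and the function name `billed_days` is made up for this example.

```python
# Minimum storage duration in days, as listed in the answer above.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "INTELLIGENT_TIERING": 0,
    "ONEZONE_IA": 30,
    "GLACIER_IR": 90,      # Glacier Instant Retrieval
    "GLACIER": 90,         # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 180,
}


def billed_days(storage_class, days_stored):
    """Return how many days are billed for an object kept for days_stored days."""
    # Deleting early is allowed, but the minimum duration is still charged.
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])
```

For example, `billed_days("GLACIER", 10)` gives 90, while `billed_days("STANDARD", 10)` gives 10.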
True or False: All S3 Glacier and IA Storage Classes have a retrieval fee when their data is requested
True
Explain the difference between the S3 Glacier Storage Classes
-Glacier Instant Retrieval: Good for data accessed once a quarter, with retrieval in milliseconds
-Glacier Flexible Retrieval: Good for data accessed once a year, with retrieval ranging from minutes to hours
-Glacier Deep Archive: Good for data accessed less than once a year, with retrieval taking hours
Complete the following statement regarding S3:
It is possible to use _______________ to automatically change S3 objects’ storage classes based on their access patterns and time in storage
S3 Lifecycle configuration rules
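As a rough sketch, a lifecycle configuration could be built and applied from Python as shown below. The rule ID, the `logs/` prefix and the 30/90-day thresholds are illustrative assumptions, not values from these cards. The transitions in this sketch are driven by object age. `put_bucket_lifecycle_configuration` is the boto3 call that sends the rules to S3, and it needs boto3 plus valid AWS credentials.

```python
def build_lifecycle_configuration(prefix="logs/"):
    """Build a lifecycle configuration dict; prefix and day counts are illustrative."""
    return {
        "Rules": [
            {
                "ID": "tier-down-by-age",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # Move to Standard-IA after 30 days, then to Glacier after 90.
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket_name):
    """Send the configuration to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the dict can be built without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name,
        LifecycleConfiguration=build_lifecycle_configuration(),
    )
```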
How many tags can an S3 Object have?
10
S3 IA is recommended for files larger than 128 KB that will be stored for at least 30 days. What happens if you delete an object that is smaller than 128 KB and was stored for less than 30 days?
You are billed as if the object were 128 KB and had been stored for 30 days
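A tiny Python sketch of this rule, assuming only the two minimums stated on this card (the function name `standard_ia_billable` is hypothetical):

```python
MIN_BILLABLE_KB = 128    # minimum billable object size for S3 IA, per this card
MIN_BILLABLE_DAYS = 30   # minimum billable storage duration for S3 IA, per this card


def standard_ia_billable(size_kb, days_stored):
    """Return the (size in KB, days) that S3 IA would bill for an object."""
    return max(size_kb, MIN_BILLABLE_KB), max(days_stored, MIN_BILLABLE_DAYS)
```

A 50 KB object deleted after 10 days is billed as `(128, 30)`; a 500 KB object kept for 45 days is billed as `(500, 45)`.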
What options does S3 support for data security? Answer for data in transit and at rest.
SSL/TLS for data in transit and encryption for data at rest.
You are planning to store the backup of video thumbnail data that you can easily replicate if needed. This data has to occasionally be accessed. What is the best S3 Storage Class for this use case?
-One-Zone IA
What is the minimum object size for S3 Intelligent Tiering?
128 KB (smaller objects can still be stored, but they are not eligible for automatic tiering)
S3 Glacier Flexible Retrieval has 3 types of retrieval that you can configure. What are those types?
-Expedited (1–5 mins)
-Standard (3–5 hours)
-Bulk (5–12 hours), free of charge
True or False: Both Glacier Flexible Retrieval and Glacier Deep Archive accept only files of at least 128 KB
False, the minimum file size accepted is 40KB
S3 Glacier Deep Archive has 2 types of retrieval that you can configure. What are those types?
-Standard (within 12 hours)
-Bulk (within 48 hours)
What are the encryption types accepted by S3?
-SSE-S3: AWS responsible for managing key and encrypting data.
-SSE-KMS: Uses KMS to encrypt data.
-SSE-C: Encryption keys managed by the customer.
-Glacier: All data is AES-256 encrypted, key is under AWS Control
Additionally, you can encrypt data on your side (client-side encryption), without depending on S3 to do it.
True or False: Whenever an action is performed on S3, an event can be triggered on SNS, SQS or Lambda
True
What are the S3 resource based security features?
- Bucket Policies: Bucket-wide rules for the objects in the bucket
- Object Access Control List (ACL): Finer-grained control
- Bucket Access Control List (ACL): Less common
What are some uses of S3 Bucket Policies?
-Grant public access
-Grant cross-account access
-Force objects to be encrypted at upload
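For the last use case, one commonly seen pattern is a Deny statement that rejects any upload missing the server-side encryption header. The sketch below builds such a policy as a Python dict and serializes it to JSON. The bucket name `example-bucket`, the `Sid` and the function name are placeholders chosen for illustration.

```python
import json


def deny_unencrypted_uploads_policy(bucket_name):
    """Build a bucket policy that rejects PutObject calls without SSE headers."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # The Null condition is true when the encryption header is absent.
                "Condition": {
                    "Null": {"s3:x-amz-server-side-encryption": "true"}
                },
            }
        ],
    }


policy_json = json.dumps(deny_unencrypted_uploads_policy("example-bucket"), indent=2)
```

The resulting JSON string is what you would paste into the bucket's policy editor or pass to an API call that sets the bucket policy.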
What is an S3 pre-signed URL?
It is a URL that grants time-limited access, using the credentials of whoever generated it, to anyone who holds it, allowing the download or upload of private S3 objects
True or False: A pre-signed URL’s default TTL is 3600 seconds
True
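A minimal sketch of generating a pre-signed download URL with boto3 follows. The function name `presign_download` and its arguments are illustrative. The call to `generate_presigned_url` needs boto3 and valid AWS credentials, so the import is kept inside the function.

```python
def presign_download(bucket, key, expires_in=3600):
    """Return a time-limited download URL (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily; not needed just to define the function

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

Passing a smaller `expires_in` shortens the window during which the URL works.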
What is S3 Object Lock? What is it useful for?
It is a feature that blocks object deletion for a specific amount of time after the object is written. Useful for enabling a WORM (write once, read many) model
What is Glacier Vault Lock?
It is an S3 Object Lock that lasts indefinitely. Files written to it cannot be edited anymore. It is useful for auditing purposes.
What are S3 Access Points?
It is a feature that allows fine-grained access control to an S3 bucket. Each access point can be scoped to one specific bucket prefix, and each has its own DNS name and access point policy
What is S3 Object Lambda?
It is a feature you can set up so that, before being returned from an S3 bucket, an object is first processed by an AWS Lambda function. Commonly used alongside S3 Access Points.
True or False: The only S3 Storage Class available in AWS Outposts is Standard
False, Outposts has its own Storage Class called S3 Outposts
True or False: S3 can replicate its files to S3 Buckets in any region
True, S3 supports both Same-Region Replication (SRR) and Cross-Region Replication (CRR)
An application hits S3 very frequently, reaching the limit of 3500 PUTs and 5500 GETs per second per prefix. How can this problem be solved?
Use more prefixes.
What is the number of GETs and PUTs an S3 prefix can accept per second?
3500 PUTs, 5500 GETs
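To make the "more prefixes" idea concrete, here is a small Python sketch that estimates how many prefixes a given request rate needs and spreads keys across them with a hash. The per-prefix limits come from this card. The helper names and the hash-based naming scheme are assumptions for illustration, not an AWS-prescribed layout.

```python
import hashlib
import math

PUTS_PER_PREFIX = 3500  # per-second limit per prefix, from this card
GETS_PER_PREFIX = 5500  # per-second limit per prefix, from this card


def prefixes_needed(puts_per_second, gets_per_second):
    """Return the minimum number of prefixes that keeps each one under its limits."""
    return max(
        1,
        math.ceil(puts_per_second / PUTS_PER_PREFIX),
        math.ceil(gets_per_second / GETS_PER_PREFIX),
    )


def sharded_key(key, num_prefixes):
    """Prepend a stable, hash-derived prefix so keys spread across prefixes."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % num_prefixes
    return f"{shard:02d}/{key}"
```

For 10000 PUTs and 11000 GETs per second, `prefixes_needed` returns 3, since the PUT rate is the tighter constraint.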
S3 Multi-part upload is useful for optimizing uploads of large objects to S3. From what file size is Multi-part upload recommended and from what file size is Multi-part upload obligatory?
-Recommended: >100MB
-Obligatory: >5GB
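The two thresholds can be expressed as a simple decision helper. This is a sketch: the function name and the returned labels are made up, and binary units (MiB/GiB) are assumed for the size limits.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def upload_strategy(size_bytes):
    """Classify an upload by the size thresholds from this card."""
    if size_bytes > 5 * GB:
        return "multipart required"
    if size_bytes > 100 * MB:
        return "multipart recommended"
    return "single PUT is fine"
```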
What is the max file size accepted by S3?
5 TB
Explain S3 Transfer Acceleration
With S3 Transfer Acceleration, you increase the upload speed of a file to S3 by sending it to an AWS Edge Location first, which then forwards the file to the correct AWS Region. This strategy takes advantage of the increased speed inside the AWS network.
S3 Byte Range fetches are commonly used to increase S3 performance. Explain how they work.
S3 Byte Range Fetches download only the contents of a file inside a predefined byte range (e.g., bytes 100-200). This is helpful because it allows you to get only the part of the file you need and to parallelize the GET of an S3 object by performing multiple Byte-Range fetches at once.
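The splitting step behind a parallel download can be sketched as follows. The helper name `byte_ranges` is hypothetical. Each string it returns is a standard HTTP `Range` header value, which could for instance be passed as the `Range` argument of boto3's `get_object` from separate worker threads.

```python
def byte_ranges(object_size, chunk_size):
    """Return HTTP Range header values that together cover object_size bytes."""
    ranges = []
    for start in range(0, object_size, chunk_size):
        # Range ends are inclusive, so the last byte of a chunk is start + size - 1.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges
```

For a 250-byte object split into 100-byte chunks this yields `bytes=0-99`, `bytes=100-199` and `bytes=200-249`.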
What is S3 Select?
S3 Select allows you to filter the data returned by your GET by running SQL expressions on the object before the data is sent to you.