10 - Storage Flashcards
EBS Volume Types
1) EBS Volumes come in 6 types
* gp2 / gp3 (SSD): General purpose SSD volume that balances price and performance for a wide variety of workloads
* io1 / io2 (SSD): Highest-performance SSD volume for mission-critical low-latency or high-throughput workloads
* st1 (HDD): Low-cost HDD volume designed for frequently accessed, throughput-intensive workloads
* sc1 (HDD): Lowest-cost HDD volume designed for less frequently accessed workloads
2) Only gp2/gp3 and io1/io2 can be used as boot volumes
EBS Volume Types Summary
| Volume Type | Names | Size | Max IOPS | Max Throughput |
|---|---|---|---|---|
| General Purpose SSD | gp3 / gp2 | 1 GiB - 16 TiB | 16,000 | 250 MiB/s (gp2) - 1,000 MiB/s (gp3) |
| Provisioned IOPS SSD | io2 / io1 | 4 GiB - 16 TiB | 32,000 - 64,000 (Nitro) | 500 - 1,000 (Nitro) MiB/s |
| Throughput Optimized HDD | st1 | 125 GiB - 16 TiB | 500 | 500 MiB/s |
| Cold HDD | sc1 | 125 GiB - 16 TiB | 250 | 250 MiB/s |
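For context only, a minimal boto3 (Python) sketch of provisioning a gp3 volume with extra IOPS and throughput; the region, AZ, and sizes are illustrative placeholders, not values from the card:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is a placeholder

resp = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # placeholder AZ
    VolumeType="gp3",
    Size=100,        # GiB; gp3 allows 1 GiB - 16 TiB
    Iops=4000,       # above the 3,000 gp3 baseline, up to 16,000
    Throughput=500,  # MiB/s; gp3 baseline is 125, max 1,000
)
print(resp["VolumeId"])
```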
Amazon S3 Overview
1) Objects (files) have a Key
2) The key is the FULL path:
• s3://my-bucket/my_file.txt
• s3://my-bucket/my_folder1/another_folder/my_file.txt
3) The key is composed of prefix + object name
• s3://my-bucket/my_folder1/another_folder/my_file.txt
4) Object values are the content of the body:
• Max Object Size is 5TB (5000GB)
• If uploading more than 5GB, must use “multi-part upload”
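A minimal boto3 sketch of a multi-part upload; boto3's TransferConfig handles the part splitting, and the file, bucket, and key names are placeholders:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Split anything over 100 MB into parts and upload them in parallel
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=8,
)

s3.upload_file("big_file.bin", "my-bucket", "my_folder1/big_file.bin", Config=config)
```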
S3 Encryption for Objects
• SSE-S3, SSE-KMS, SSE-C, Client Side Encryption
SSE-S3
• SSE-S3: encryption using keys handled & managed by Amazon S3
• Object is encrypted server side
• AES-256 encryption type
• Must set header: “x-amz-server-side-encryption”: “AES256”
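A short boto3 sketch (bucket and key are placeholders); the ServerSideEncryption parameter is how boto3 sets the header above:

```python
import boto3

s3 = boto3.client("s3")

# boto3 translates ServerSideEncryption="AES256" into the
# "x-amz-server-side-encryption: AES256" request header
s3.put_object(
    Bucket="my-bucket",
    Key="my_file.txt",
    Body=b"hello",
    ServerSideEncryption="AES256",
)
```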
SSE-KMS
• SSE-KMS: encryption using keys handled & managed by KMS
• KMS Advantages: user control + audit trail
• Object is encrypted server side
• Must set header: “x-amz-server-side-encryption”: “aws:kms”
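The same sketch with KMS; the key alias is a placeholder:

```python
import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="my-bucket",
    Key="my_file.txt",
    Body=b"hello",
    ServerSideEncryption="aws:kms",  # sets the aws:kms header shown above
    SSEKMSKeyId="alias/my-key",      # placeholder; omit to use the default aws/s3 key
)
```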
SSE-C
• SSE-C: server-side encryption using data keys fully managed by the customer outside of AWS
• Amazon S3 does not store the encryption key you provide
• HTTPS must be used
• Encryption key must be provided in HTTP headers, on every HTTP request made
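A boto3 sketch of SSE-C (bucket/key are placeholders); note the same customer key must accompany both the write and every later read:

```python
import os
import boto3

s3 = boto3.client("s3")  # SSE-C requires HTTPS, which boto3 uses by default

key = os.urandom(32)  # 256-bit key managed entirely by you; S3 never stores it

# boto3 sends the key via the x-amz-server-side-encryption-customer-* headers
s3.put_object(
    Bucket="my-bucket",
    Key="my_file.txt",
    Body=b"hello",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=key,
)

# The same key must be supplied again on every read
obj = s3.get_object(
    Bucket="my-bucket",
    Key="my_file.txt",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=key,
)
print(obj["Body"].read())
```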
Client Side Encryption
• Client library such as the Amazon S3 Encryption Client
• Clients must encrypt data themselves before sending to S3
• Clients must decrypt data themselves when retrieving from S3
• Customer fully manages the keys and encryption cycle
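As a sketch of the pattern only (not the official Amazon S3 Encryption Client), here is client-side encryption using the third-party cryptography package; bucket/key names are placeholders:

```python
import boto3
from cryptography.fernet import Fernet  # pip install cryptography

s3 = boto3.client("s3")

key = Fernet.generate_key()  # you own and store this key; S3 never sees it
fernet = Fernet(key)

# Encrypt locally, upload only ciphertext
s3.put_object(Bucket="my-bucket", Key="secret.bin",
              Body=fernet.encrypt(b"sensitive data"))

# Download ciphertext, decrypt locally
obj = s3.get_object(Bucket="my-bucket", Key="secret.bin")
plaintext = fernet.decrypt(obj["Body"].read())
```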
S3 Security
1) User based
• IAM policies - define which API calls are allowed for a specific IAM user
2) Resource Based
• Bucket Policies - bucket wide rules from the S3 console - allows cross account
• Object Access Control List (ACL) – finer grain
• Bucket Access Control List (ACL) – less common
3) Note: an IAM principal can access an S3 object if
• the user IAM permissions allow it OR the resource policy ALLOWS it
• AND there’s no explicit DENY
S3 Bucket Policies
1) JSON based policies:
• Resources: buckets and objects
• Actions: set of API calls to Allow or Deny
• Effect: Allow / Deny
• Principal: the account or user the policy applies to
2) Use an S3 bucket policy to:
• Grant public access to the bucket
• Force objects to be encrypted at upload (see the policy sketch below)
• Grant access to another account (Cross Account)
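A boto3 sketch of one common pattern, denying uploads that do not request SSE-S3 encryption; the bucket name is a placeholder:

```python
import json
import boto3

s3 = boto3.client("s3")

# Deny any PutObject that does not request SSE-S3 encryption
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnencryptedUploads",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::my-bucket/*",
        "Condition": {
            "StringNotEquals": {"s3:x-amz-server-side-encryption": "AES256"}
        },
    }],
}

s3.put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))
```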
S3 Bucket settings for Block Public Access
1) Block public access to buckets and objects granted through
• new access control lists (ACLs)
• any access control lists (ACLs)
• new public bucket or access point policies
2) Block public and cross-account access to buckets and objects through any public bucket or access point policies
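These four settings map directly onto the API; a minimal boto3 sketch (bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")

# Turn on all four Block Public Access settings for one bucket
s3.put_public_access_block(
    Bucket="my-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,        # block new public ACLs
        "IgnorePublicAcls": True,       # ignore any existing public ACLs
        "BlockPublicPolicy": True,      # block new public bucket policies
        "RestrictPublicBuckets": True,  # restrict cross-account access via public policies
    },
)
```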
S3 CORS
• CORS means Cross-Origin Resource Sharing
• The requests won’t be fulfilled unless the other origin allows them, using CORS headers (ex: Access-Control-Allow-Origin)
- If a client makes a cross-origin request to our S3 bucket, we need to enable the correct CORS headers
- It’s a popular exam question
- You can allow for a specific origin or for * (all origins)
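A boto3 sketch of a CORS rule for one specific origin; the bucket and origin are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Allow cross-origin GETs from one specific origin ("*" would allow all origins)
s3.put_bucket_cors(
    Bucket="my-bucket",
    CORSConfiguration={
        "CORSRules": [{
            "AllowedOrigins": ["https://www.example.com"],
            "AllowedMethods": ["GET"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }]
    },
)
```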
S3 Replication (CRR & SRR)
• Must enable versioning in source and destination
• Cross Region Replication (CRR)
• Same Region Replication (SRR)
- Buckets can be in different accounts
- Copying is asynchronous
- Must give proper IAM permissions to S3
- CRR - Use cases: compliance, lower latency access, replication across accounts
- SRR – Use cases: log aggregation, live replication between production and test accounts
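A hedged boto3 sketch of a replication rule; the bucket names and role ARN are placeholders, and both buckets must already have versioning enabled:

```python
import boto3

s3 = boto3.client("s3")

# Versioning must already be enabled on both buckets
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [{
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate every object
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
        }],
    },
)
```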
S3 Pre-Signed URLs
• Can generate pre-signed URLs using SDK or CLI
• For downloads (easy, can use the CLI)
• For uploads (harder, must use the SDK)
1) Valid for a default of 3600 seconds; can change the timeout with the --expires-in [TIME_BY_SECONDS] argument
2) Users given a pre-signed URL inherit the permissions of the person who generated the URL for GET / PUT
3) Examples :
• Allow only logged-in users to download a premium video on your S3 bucket
• Allow an ever changing list of users to download files by generating URLs dynamically
• Allow temporarily a user to upload a file to a precise location in our bucket
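A minimal boto3 sketch of the download case; bucket and key are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Download URL valid for one hour (the 3600-second default made explicit);
# for uploads, swap "get_object" for "put_object"
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "premium_video.mp4"},
    ExpiresIn=3600,
)
print(url)  # anyone holding this URL acts with the generator's permissions
```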
Amazon Glacier & Glacier Deep Archive
1) Amazon Glacier – 3 retrieval options:
• Expedited (1 to 5 minutes)
• Standard (3 to 5 hours)
• Bulk (5 to 12 hours)
• Minimum storage duration of 90 days
2) Amazon Glacier Deep Archive – for long term storage – cheaper:
• Standard (12 hours)
• Bulk (48 hours)
• Minimum storage duration of 180 days
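A boto3 sketch of requesting a retrieval at a chosen tier; bucket, key, and duration are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Request a temporary copy of an archived object using the Expedited tier
s3.restore_object(
    Bucket="my-bucket",
    Key="archive/old_report.pdf",
    RestoreRequest={
        "Days": 7,  # how long the restored copy stays available
        "GlacierJobParameters": {"Tier": "Expedited"},  # or "Standard" / "Bulk"
    },
)
```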
S3 Performance
Multi-Part upload:
• recommended for files > 100MB, must be used for files > 5GB
• Can help parallelise uploads (speed up transfers)
S3 Transfer Acceleration
• Increase transfer speed by transferring file to an AWS edge location which will forward the data to the S3 bucket in the target region
• Compatible with multi-part upload
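A hedged boto3 sketch of enabling acceleration and routing a transfer through it; the bucket and file names are placeholders:

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# One-time: enable acceleration on the bucket
s3.put_bucket_accelerate_configuration(
    Bucket="my-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Then route transfers through the edge-location endpoint
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("big_file.bin", "my-bucket", "big_file.bin")
```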
S3 Byte-Range Fetches
• Parallelise GETs by requesting specific byte ranges
• Better resilience in case of failures
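A minimal sketch of parallel byte-range GETs with boto3; the bucket, key, and part size are placeholders:

```python
import boto3
from concurrent.futures import ThreadPoolExecutor

s3 = boto3.client("s3")
BUCKET, KEY = "my-bucket", "big_file.bin"  # placeholders
PART = 8 * 1024 * 1024                     # 8 MiB per range

def fetch(start: int, end: int) -> bytes:
    # Standard HTTP Range header, e.g. "bytes=0-8388607" (end is inclusive)
    resp = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={start}-{end}")
    return resp["Body"].read()

size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
ranges = [(i, min(i + PART, size) - 1) for i in range(0, size, PART)]

with ThreadPoolExecutor(max_workers=8) as pool:
    data = b"".join(pool.map(lambda r: fetch(*r), ranges))
```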
S3 Select & Glacier Select
- Retrieve less data using SQL by performing server side filtering
- Can filter by rows & columns (simple SQL statements)
- Less network transfer, less CPU cost client-side
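A boto3 sketch of S3 Select on a CSV object; the bucket, key, and SQL are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Filter rows and columns server-side instead of downloading the whole CSV
resp = s3.select_object_content(
    Bucket="my-bucket",
    Key="data.csv",
    ExpressionType="SQL",
    Expression="SELECT s.name, s.city FROM S3Object s WHERE s.country = 'US'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)

# The response is an event stream; records arrive in chunks
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode(), end="")
```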
Amazon FSx for Windows (File Server)
• FSx for Windows is a fully managed Windows file system that you mount as a shared network drive
- Supports SMB protocol & Windows NTFS
- Microsoft Active Directory integration, ACLs, user quotas
- Built on SSD, scale up to 10s of GB/s, millions of IOPS, 100s PB of data
- Can be accessed from your on-premises infrastructure
- Can be configured to be Multi-AZ (high availability)
- Data is backed-up daily to S3
Amazon FSx for Lustre
• Lustre is a type of parallel distributed file system, for large-scale computing
• The name Lustre is derived from “Linux” and “cluster”
• Machine Learning, High Performance Computing (HPC)
• Video Processing, Financial Modeling, Electronic Design Automation
• Scales up to 100s GB/s, millions of IOPS, sub-ms latencies
• Seamless integration with S3 (see the sketch after this card)
- Can “read S3” as a file system (through FSx)
- Can write the output of the computations back to S3 (through FSx)
• Can be used from on-premises servers
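A hedged boto3 sketch of the S3 integration: creating a Lustre file system linked to a bucket. The subnet ID, bucket, capacity, and deployment type are illustrative placeholders:

```python
import boto3

fsx = boto3.client("fsx")

# Link the file system to an S3 bucket: objects appear as files, and
# computation output can be exported back to the bucket
fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,  # GiB; minimum for SCRATCH_2
    SubnetIds=["subnet-0123456789abcdef0"],  # placeholder
    LustreConfiguration={
        "DeploymentType": "SCRATCH_2",
        "ImportPath": "s3://my-bucket",          # "read S3" as a file system
        "ExportPath": "s3://my-bucket/results",  # write results back to S3
    },
)
```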