AWS Storage Extras Flashcards
AWS Snow Family
Highly Secure portable devices.
They collect and process data at the edge, or migrate data into and out of AWS
Data Migration Snow devices
Snowcone
Snowball Edge
Snowmobile
Edge computing
Snowcone
Snowball Edge
Why would you want to use AWS Snow Family?
Because transferring huge amounts of data over a network can take weeks or more. Rule of thumb: if the network transfer would take more than a week, use a Snow device.
Perfect for data migration
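A rough way to see why shipping a device can beat the network is to estimate the transfer time. The sketch below is illustrative only: the function name and the `utilisation` parameter are assumptions, and it uses decimal units (1 TB = 10^12 bytes) and ignores protocol overhead.

```python
def transfer_days(data_tb: float, link_mbps: float, utilisation: float = 1.0) -> float:
    """Estimate the days needed to send data_tb terabytes over a link of link_mbps megabits/s.

    utilisation is the (assumed) fraction of the link actually available, in (0, 1].
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilisation <= 1:
        raise ValueError("data_tb must be >= 0, link_mbps > 0, utilisation in (0, 1]")
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / 86_400                        # seconds -> days


# 100 TB over a fully used 100 Mbps link is roughly three months.
print(round(transfer_days(100, 100), 1))
```

Anything that comes out well above a week is a candidate for a Snow device rather than a network transfer.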
How does AWS Snow Family work?
You order an AWS Snowball device online. It gets shipped to you. You load your data onto it and ship it back to AWS, which imports the data into (or exports it from) your Amazon S3 bucket.
Snowball Edge (data transfers)
Move data in or out of AWS
Pay per data transfer job
Block storage and Amazon S3-compatible object storage
What are the 2 flavours of Snowball Edge?
Storage Optimised - 80 TB of HDD
Compute Optimised - 42 TB of HDD or 28 TB of NVMe
Snowball Edge use cases
Large data cloud migrations
Decommission Data Centre
Disaster Recovery
AWS Snowcone & Snowcone SSD
Small
Portable computing
rugged & secure
for harsh environments
Snowcone use cases?
edge computing, storage, data transfer
Snowcone size
8 TB of HDD
Snowcone SSD
14 TB of SSD
What type of constraints would make you use Snowcone instead of Snowball?
Space constraints; Snowcone is tiny
What do you need to provide yourself for snowcone?
Battery / cables
How do you transfer data from snowcone to AWS?
By sending it offline (shipping)
OR
Connect to internet and use AWS DataSync to send data
AWS Snowmobile
For transferring exabytes
1 EB = 1000 PB = 1,000,000 TBs
What is the capacity of AWS Snowmobile?
100 PB (can use multiple in parallel)
How is a snowmobile protected?
Temperature-controlled truck
GPS
24/7 video surveillance
When would you choose Snowmobile over Snowball?
When you need to transfer more than 10 PB
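The capacities on these cards can be turned into a small device picker. This is a sketch, not AWS guidance: the function name is hypothetical, the "fits on one Snowcone" rule is an assumption (the cards give space constraints as the real reason to pick Snowcone), and it uses the raw capacities listed above rather than usable capacity.

```python
import math

SNOWCONE_TB = 8                    # Snowcone HDD capacity from the card
SNOWBALL_EDGE_TB = 80              # Snowball Edge Storage Optimised
SNOWMOBILE_TB = 100_000            # 100 PB per Snowmobile
SNOWMOBILE_THRESHOLD_TB = 10_000   # "more than 10 PB" rule from the card


def pick_snow_device(data_tb: float) -> tuple[str, int]:
    """Return (device name, number of devices) for a migration of data_tb terabytes."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    if data_tb > SNOWMOBILE_THRESHOLD_TB:
        return "Snowmobile", math.ceil(data_tb / SNOWMOBILE_TB)
    if data_tb <= SNOWCONE_TB:
        # Assumption: anything that fits on one Snowcone goes on a Snowcone.
        return "Snowcone", 1
    return "Snowball Edge (Storage Optimised)", math.ceil(data_tb / SNOWBALL_EDGE_TB)


print(pick_snow_device(200))        # a few Snowball Edge devices
print(pick_snow_device(1_000_000))  # 1 EB needs several Snowmobiles in parallel
```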
How does data migration work with snowball devices?
Request snowball from AWS console
install snowball client or AWS OpsHub on your servers
connect to servers and copy using client
ship back device to AWS facility
data are loaded to S3 bucket
snowball is wiped completely
How can you use snowball devices for Edge Computing?
Process data as it is created at an edge location
Snowball edge / snowcone can do this
Perfect for limited / no internet access locations
Edge Computing use cases
Preprocess data
Machine learning at the edge
Transcoding media streams
What can you do with the device after edge processing is done?
Ship it back to AWS to transfer the data
When would you use edge computing?
If you are at a location that has no / limited internet access
limited / no computing power
What are some Edge Computing examples?
Truck on road, ship on sea, mining station underground
What are the long-term deployment options for edge computing?
1- or 3-year commitments at discounted pricing
AWS OpsHub
Software to install on PC or laptop, to manage snow family devices. Graphical interface
How do you transfer data from Snowball into Glacier?
Order a Snowball, load the data, import it into Amazon S3, then create an S3 lifecycle policy that transitions the data into Glacier
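The lifecycle policy in the last step could look like the sketch below. The helper name, rule ID, prefix and bucket name are all hypothetical; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, but the call itself is only shown in a comment because it needs boto3 and AWS credentials.

```python
def glacier_lifecycle_config(prefix: str = "", days: int = 1) -> dict:
    """Build an S3 lifecycle configuration that transitions objects to Glacier.

    prefix limits the rule to keys under that prefix ("" means the whole bucket);
    days is the object age at which the transition happens.
    """
    if days < 0:
        raise ValueError("days must be >= 0")
    return {
        "Rules": [
            {
                "ID": "snowball-import-to-glacier",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


config = glacier_lifecycle_config(prefix="snowball-import/", days=1)
print(config["Rules"][0]["Transitions"])

# With boto3 installed and credentials configured, the config could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-import-bucket", LifecycleConfiguration=config)
```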
What is Amazon FSx?
Lets you launch third-party high-performance file systems on AWS
As a fully managed service
What are the 4 FSx types?
For Lustre
For Windows File Server
For NetApp ONTAP
For OpenZFS
FSx for Windows (File Server)
Fully managed Windows file system share drive
Supports the SMB protocol & Windows NTFS
Microsoft AD integration, ACLs, user quotas
Can also be mounted on Linux EC2 instances
Supports Microsoft's Distributed File System (DFS) Namespaces (group files across multiple file systems)
Benefits of FSx for Windows?
Scales up to 10s of GB/s, millions of IOPS, 100s of PB of data
SSD & HDD
Can be accessed from on-prem infra (VPN or Direct Connect)
Can be multi-AZ (HA)
Data is backed up to S3 daily
Amazon FSx for Lustre
Name derived from "Linux" and "cluster"
Type of parallel distributed file system for large-scale computing
Used for ML, High Performance Computing (HPC)
Video processing, financial modeling, electronic design automation
SSD & HDD
Seamless Integration with S3
Can read S3 as a file system (through FSx)
Can write the output of computations back to S3 (through FSx)
Can be used from on-prem servers (VPN or Direct Connect)
2 FSx File System Deployment options (Lustre)
Scratch File System
Persistent File System
Scratch File System (Lustre)
Temporary storage
Data not replicated (no persistence)
High burst
Usage: short-term processing, cost optimisation
Persistent File System (Lustre)
Long-term storage
Data is replicated within same AZ
Replace failed files within minutes
Usage: Long-term processing, sensitive data
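The scratch versus persistent choice on the two cards above boils down to a simple rule. A minimal sketch, with a hypothetical function name and return values that are plain labels rather than FSx API deployment-type constants:

```python
def lustre_deployment(long_term: bool, sensitive_data: bool) -> str:
    """Pick an FSx for Lustre deployment option from the two flashcard criteria.

    Scratch: temporary, not replicated, cheaper, high burst.
    Persistent: replicated within the same AZ, for long-term or sensitive data.
    """
    if long_term or sensitive_data:
        return "persistent"
    return "scratch"


print(lustre_deployment(long_term=False, sensitive_data=False))
print(lustre_deployment(long_term=True, sensitive_data=False))
```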
FSx for NetApp ONTAP
Managed NetApp ONTAP on AWS
File system compatible with the NFS, SMB and iSCSI protocols
Move workloads running on ONTAP or NAS to AWS
Works with
Linux
Win
MacOS
VMware Cloud on AWS
Amazon Workspaces & AppStream 2.0
Amazon EC2, ECS, and EKS
Storage shrinks or grows automatically (auto scaling)
Snapshots, replication, low-cost, compression and data de-duplication
Point-in-time instantaneous cloning (helpful for testing new workloads)
Amazon FSx for OpenZFS
Managed OpenZFS file system on AWS
File system compatible with NFS (v3, v4, v4.1, v4.2)
Move workloads running on ZFS to AWS
Works with
Linux
Win
MacOS
VMware Cloud on AWS
Workspaces & Appstream 2.0
Amazon EC2, ECS, EKS
Up to 1,000,000 IOPS with <0.5ms latency
Snapshots, compression and low-cost
Point-in-time instantaneous cloning
Hybrid Cloud Storage
On-prem & Cloud infrastructure
How do you expose S3 to On-Prem?
AWS Storage Gateway
What are the 2 block storage options in AWS?
Amazon EBS
EC2 Instance Store
2 File Storage in AWS
Amazon EFS
Amazon FSx
2 Object Storage in AWS
Amazon S3
Amazon Glacier
What is AWS Storage Gateway?
Bridge between on-prem data and cloud data
Use cases for AWS Storage Gateway?
disaster recovery
backup & restore
tiered storage
on-premises cache & low-latency file access
Different Kinds of Storage Gateways
S3 File Gateway
FSx File Gateway
Volume Gateway
Tape Gateway
S3 File Gateway architecture
Corporate data centre & AWS Cloud
You have your app server on-prem and you add an S3 File Gateway.
In the cloud you have your bucket (S3 Standard, Standard-IA, One Zone-IA, Intelligent-Tiering) and a lifecycle policy for S3 Glacier.
The gateway then transfers data between on-prem and the cloud over HTTPS.
Protocol used is NFS or SMB
SMB protocol has integration with AD for user auth
Frequently accessed data are cached
FSx File Gateway
Native access to Amazon FSx for Windows File Server
Local Cache for frequently accessed data
Native compatibility with SMB, NTFS, AD
useful for group file shares and home directories
FSx File Gateway Architecture
SMB client and FSx File Gateway on-prem. Amazon FSx for Windows File Server in AWS Cloud.
Main reason to use FSx File Gateway?
Local Cache for frequently accessed data
Volume Gateway
Block storage using iSCSI protocol backed by S3
What does Volume Gateway do?
Volumes are backed up as EBS snapshots, which can be used to restore on-prem volumes
Cached Volumes - low latency access to most recent data
Stored Volumes - entire dataset is on prem, scheduled for backup to S3
Volume Gateway architecture
The app server sits in the on-prem data centre, next to a Volume Gateway; the two are connected over an iSCSI interface.
The gateway connects via HTTPS to the S3 bucket in the cloud, from which Amazon EBS snapshots are created (all in the cloud).
Why do you want to use Volume Gateway
To backup your on-prem volumes
Tape Gateway
Backup for physical tapes in the cloud
What is Tape Gateway using?
Virtual Tape Library (VTL) backed by S3 and Glacier
Tape Gateway Architecture
On-prem: the backup server talks to the Tape Gateway over an iSCSI interface (media changer and tape drive). The Tape Gateway then connects via HTTPS to the cloud.
Virtual tapes are stored in S3, and archived tapes are stored in Amazon Glacier.
What is Storage Gateway - Hardware appliance?
Essentially a mini server that can be installed on prem and act as a file, volume, tape gateway.
Can be ordered if you have no on-prem virtualisation to run a gateway on
When is a Storage Gateway - Hardware Appliance useful?
For daily NFS backups in small data centres
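The gateway types covered in the cards above can be summarised as a lookup by client protocol. This is an illustrative sketch built only from the flashcards; the dictionary layout and function name are assumptions, not an AWS API.

```python
# Protocols and cloud backends per gateway type, as stated on the cards.
GATEWAYS = {
    "S3 File Gateway": {"protocols": {"NFS", "SMB"}, "backend": "Amazon S3"},
    "FSx File Gateway": {"protocols": {"SMB"}, "backend": "Amazon FSx for Windows File Server"},
    "Volume Gateway": {"protocols": {"iSCSI"}, "backend": "Amazon S3 with EBS snapshots"},
    "Tape Gateway": {"protocols": {"iSCSI"}, "backend": "Amazon S3 and Glacier (virtual tapes)"},
}


def gateways_for(protocol: str) -> list[str]:
    """Return the gateway types (sorted by name) that expose the given protocol."""
    wanted = protocol.strip().lower()
    return sorted(
        name
        for name, info in GATEWAYS.items()
        if wanted in {p.lower() for p in info["protocols"]}
    )


print(gateways_for("SMB"))
print(gateways_for("iSCSI"))
```

A file-share workload narrows to the two file gateways, while anything speaking iSCSI is either block volumes or tape backups.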
AWS Transfer Family
File transfers into and out of Amazon S3 or EFS using the FTP family of protocols
AWS Transfer Family protocols
FTP
SFTP
FTPS
Benefits of Transfer Family
Managed infrastructure
Scalable
Reliable
HA (multi az)
Pay per endpoint per hour
Store and manage users' credentials within the service
Transfer Family use cases
sharing files
public datasets
CRM
ERP
AWS Transfer Family Architecture
A user with an FTP client connects, optionally through Route 53, to an SFTP, FTPS or FTP (VPC only) endpoint; the endpoint uses an IAM role to access S3 or EFS.
Users can also be authenticated against existing systems such as LDAP or Microsoft AD.
AWS DataSync
Move large amounts of data to and from
- on prem or other cloud to AWS - needs agent
- AWS to AWS (no agent needed)
Can synchronise - S3, EFS, FSx (win, lustre, NetApp, OpenZFS)
Scheduled Replication (hourly, daily, weekly)
File permissions and metadata are preserved (NFS POSIX, SMB)
One agent task can use up to 10 Gbps; a bandwidth limit can be set
DataSync Architecture
An NFS or SMB server on-prem connects to the DataSync Agent, which connects over TLS to AWS DataSync in the cloud. From there you can transfer/sync data to any AWS storage resource. You can also transfer and sync data from AWS back to on-prem.
DataSync - what happens if you want to use it but do not have the bandwidth?
You can use AWS Snowcone because it has the DataSync agent pre-installed. Upload the data, run the agent, ship it to AWS and the Sync will happen there.
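Whether the network is "enough" can be estimated before choosing between the two routes. The sketch below is an assumption-laden illustration: the function name and the deadline parameter are hypothetical, the 10 Gbps cap comes from the DataSync card, and overhead is ignored. It also assumes the data fits on a Snowcone.

```python
AGENT_MAX_MBPS = 10_000  # one agent task can use up to 10 Gbps (from the card)


def datasync_plan(data_tb, link_mbps, deadline_days, bandwidth_limit_mbps=None):
    """Return (suggested method, estimated network days) for a DataSync transfer."""
    effective_mbps = min(link_mbps, AGENT_MAX_MBPS)
    if bandwidth_limit_mbps is not None:
        effective_mbps = min(effective_mbps, bandwidth_limit_mbps)
    if data_tb < 0 or effective_mbps <= 0:
        raise ValueError("data_tb must be >= 0 and bandwidth must be positive")
    # terabytes -> bits, divided by bits per second, converted to days
    days = data_tb * 8e12 / (effective_mbps * 1e6) / 86_400
    if days <= deadline_days:
        return "DataSync over the network", days
    return "Snowcone with pre-installed DataSync agent", days


print(datasync_plan(5, 1000, deadline_days=7))   # fast link: network is fine
print(datasync_plan(5, 10, deadline_days=7))     # slow link: ship a Snowcone
```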
Can you use AWS DataSync for transferring between AWS storage services?
Yes, you can transfer directly between S3, EFS and FSx, and the metadata is kept
What can you use to preserve metadata and file permissions when you do a transfer between storage services?
AWS DataSync
What do you need to run to be able to connect and transfer data to NFS or SMB server?
DataSync Agent
S3
Object Storage
S3 Glacier
Object Archival
EBS Volume
Network Storage for one EC2 instance at a time
Instance Storage
Physical storage for EC2 instance (high IOPS)
EFS
Network File System for Linux instances, POSIX filesystem
FSx for Windows
Network File System for Windows Servers
FSx for Lustre
High performance Computing Linux file system
FSx for NetApp ONTAP
High OS compatibility
FSx for OpenZFS
Managed ZFS file system
Storage Gateway
S3 & FSx File Gateway, Volume Gateway (cache & stored), Tape Gateway
Transfer Family
FTP, FTPS, SFTP interface on top of Amazon S3 or Amazon EFS
DataSync
Schedule data sync from on-premises to AWS, or AWS to AWS
Snowcone / Snowball / Snowmobile
To move large amounts of data to the cloud, physically
Database
for specific workloads, usually with indexing and querying