Snowball, Storage Gateway, FSx Flashcards

1
Q

Snowball

A

a huge box that allows you to physically transport data into and out of AWS.

an alternative to moving data over the network

offline transfer and storage

a service that provides secure, rugged devices, so you can bring AWS computing and storage capabilities to your edge environments, and transfer data into and out of AWS. Those rugged devices are commonly referred to as AWS Snowball or AWS Snowball Edge devices.

2
Q

What is Snowball used for?

A

when you need to run compute workloads in rugged, austere, mobile, or disconnected (or intermittently connected) environments.

Also for large-scale data transfers and migrations when there is not enough bandwidth to use a high-speed online transfer service such as AWS DataSync.

It is also useful when you want to pre-process the data on the device while it is in transit.

3
Q

With Snowball, you pay

A

per data transfer job.

4
Q

Snowball Edge

A

an edge computing and data transfer device provided by the AWS Snowball service. It has on-board storage and compute power that provides select AWS services for use in edge locations.

5
Q

How does Snowball Edge work?

A

You request one or more devices in the AWS Management Console.

The buckets, data, Amazon EC2 AMIs, and Lambda functions you select are automatically configured, encrypted, and preinstalled on your devices before they are shipped to you.

Once a device arrives, you connect it to your local network and set the IP address either manually or automatically with DHCP.

Then use the Snowball Edge client software, job manifest, and unlock code to verify the integrity of the Snowball Edge device or cluster, and unlock it for use. The manifest and unlock code are uniquely generated and cryptographically bound to your account and the Snowball Edge shipped to you, and cannot be used with any other devices. Data copied to Snowball Edge is automatically encrypted and stored in the buckets you specify.

All logistics and shipping are handled by Amazon, so when copying is complete and the device is ready to be returned, the E Ink shipping label automatically updates the return address, ensuring that the Snowball Edge device is delivered to the correct AWS facility. Once the device ships, you can receive tracking status via messages sent by Amazon Simple Notification Service (Amazon SNS), generated texts and emails, or directly from the console.

6
Q

Snowmobile

A

a truck used to transfer exabytes of data

One exabyte equals 1,000 petabytes, or 1 million terabytes.

7
Q

Snowball and Glacier

A

Snowball cannot import data into Amazon Glacier directly. You have to import it into Amazon S3 first, and then use an S3 lifecycle policy to transition that data into Glacier immediately.
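As a rough illustration of the lifecycle policy mentioned above, the sketch below builds a lifecycle configuration in the dictionary shape that the S3 API accepts. The rule ID and the bucket name in the comment are hypothetical, and the use of `Days: 0` to request an immediate transition is an assumption about how you would want the rule set up, not something stated on the card.

```python
import json


def glacier_transition_rule(prefix: str = "") -> dict:
    """Build a lifecycle configuration that moves objects to Glacier right away."""
    return {
        "Rules": [
            {
                "ID": "snowball-import-to-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},        # "" matches every object
                "Status": "Enabled",
                # Days=0 asks S3 to transition objects as soon as possible.
                "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
            }
        ]
    }


config = glacier_transition_rule()
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-import-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```

The policy lives on the bucket that the Snowball job imports into, so the data lands in S3 first and is then transitioned by S3 itself.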

8
Q

Hybrid Cloud

A

part of your infrastructure is in the cloud, on AWS

part of your infrastructure stays on premises.

This can be for many reasons: a long cloud migration, security requirements, or compliance requirements.

9
Q

idea behind Storage Gateway

A

S3, for example, is a proprietary storage technology. It’s not like NFS, which is standardized. So how do we expose S3 data to on-premises servers or computers?

Storage Gateway gives us access to S3 through a gateway that exposes standard protocols (NFS, SMB, iSCSI).

10
Q

how storage works in AWS today, the cloud-native way

A

we have

  1. Block storage, which is EBS or EC2 Instance Store; these are basically our volumes.
  2. File storage, which is EFS, where we store files on a network file system.
  3. Object storage, where we store files and objects directly in S3 and Glacier.

Storage Gateway bridges on-premises environments to these solutions.

11
Q

three types of Storage Gateway

A

  1. File Gateway
  2. Volume Gateway
  3. Tape Gateway

12
Q

Storage Gateway use cases

A

when we want to bring on-premises data into S3, or bridge the two: disaster recovery, backup and restore, or tiered storage.

when you have S3 buckets and you want them to be accessible over the NFS (Network File System) protocol or the SMB protocol

13
Q

File Gateway

A

allows us to view files on our local on-premises file system, while they are actually backed by S3 and Glacier

it sits between the application server and S3 / S3 IA / Glacier; it talks to the server over NFS and to S3 / S3 IA / Glacier over HTTPS

we have to set up a File Gateway on premises.

So from our application’s perspective, it seems like we’re talking to a local network file system, but behind the scenes the File Gateway translates those file operations into requests to S3 or Glacier.

14
Q

Volume Gateway

A

when you want block storage over the iSCSI protocol, backed by S3

we have to set up a Volume Gateway on premises.

it sits between the application server and an S3 bucket holding EBS snapshots; it talks to the server over iSCSI and to S3 over HTTPS

the idea is that EBS snapshots are taken from time to time and stored in S3. This helps us restore on-premises volumes if we need to.

our application server mounts a volume from the Volume Gateway; on premises it looks like just a local volume, but the Volume Gateway stores it as Amazon EBS snapshots backed by S3.

15
Q

Tape Gateway

A

Some companies still have backup processes that use physical tapes.

For this, you build a VTL, or Virtual Tape Library, backed by Amazon S3 and Glacier. Tape Gateway is meant for backups.

The backup software connects directly to the Tape Gateway over iSCSI, and the gateway creates a Virtual Tape Library stored in S3 or Glacier. The gateway talks to S3 or Glacier over HTTPS.

16
Q

File Gateway supports

A

S3 Standard, S3 IA, and S3 One Zone-IA.

each bucket needs to be accessible by the File Gateway, and each will have its own IAM role.

17
Q

File Gateway cache

A

The most recently used data will be cached into the File Gateway. File Gateway will take our most active S3 objects and cache them locally.
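To make the caching idea concrete, here is a toy model of a local cache in front of a slower backing store. It is only a sketch: the `LocalCache` class, the dictionary standing in for the S3 bucket, and the least-recently-used eviction rule are all assumptions made for illustration, not a description of how File Gateway is implemented.

```python
from collections import OrderedDict


class LocalCache:
    """Toy least-recently-used cache standing in for the gateway's local disk."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store  # dict standing in for the S3 bucket
        self.entries = OrderedDict()

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key], "cache"
        value = self.backing_store[key]       # slow path: fetch from "S3"
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return value, "s3"


bucket = {"a.txt": b"A", "b.txt": b"B", "c.txt": b"C"}
cache = LocalCache(capacity=2, backing_store=bucket)
sources = [cache.read(k)[1] for k in ["a.txt", "b.txt", "a.txt", "c.txt", "b.txt"]]
print(sources)
```

Only the repeated read of `a.txt` is served locally; everything else goes to the backing store, which is the behavior the card describes for active versus inactive objects.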

18
Q

we have two options for Volume Gateway

A

  1. Cached volumes
  2. Stored volumes

19
Q

Cached volumes

A

low-latency access to the most recently used data on your volumes

20
Q

Stored volume

A

the entire dataset stays on premises, with scheduled backups to S3. So from time to time the data is copied to S3.

21
Q

File gateway hardware appliance

A

to use the File Gateway, you normally need virtualization, but there is an alternative.

You can also install a File Gateway hardware appliance directly on premises to synchronize your files from on premises into AWS.

22
Q

File gateway hardware appliance how

A

you actually have to go on amazon.com and buy it.
You can then plug it into your own small data center and have it transfer your files from your on-premises environment into AWS over NFS.

23
Q

File gateway hardware appliance use case

A

if you have a small data center with no virtualization capability and you still need to perform a daily NFS backup,

then the answer is to buy a hardware appliance and plug it into your small data center. Using the hardware appliance, you can do your daily NFS backup.

24
Q

FSx for Windows

A

EFS is a shared POSIX file system, usable only from Linux EC2 instances or on-premises Linux machines.

Therefore, you cannot use EFS with your Windows servers. So how do you share storage between your Windows servers? Amazon came up with FSx for Windows.

a fully managed Windows file system share drive.

anytime you need shared storage or a distributed file system for your Windows instances, think Amazon FSx for Windows.

25
Q

FSx for Windows supports 3 important things

A

SMB protocol

Windows NTFS

Active Directory integration

26
Q

FSx for Windows scale

A

it has massive scale: it can scale to tens of gigabytes per second, millions of IOPS, and hundreds of petabytes of data.

27
Q

FSx for Windows availability

A

It can also be accessed from your on-premises infrastructure, and it can be configured as Multi-AZ for high availability.

data is backed up daily to Amazon S3, so you can always recover your file system directly from S3.

28
Q

Amazon FSx for Lustre

A

a type of parallel distributed file system for large-scale computing.
The name Lustre comes from "Linux" and "cluster": it is for Linux instances and is meant for large-scale computing.

29
Q

what do we use Lustre for?

A

Lustre for Machine Learning,

High Performance Computing or HPC.

we can also do video processing, financial modeling, electronic design automation, anything that requires a high level of distribution for your file system and your computation.

30
Q

Amazon FSx for Lustre scaling

A

it scales up to hundreds of gigabytes per second, millions of IOPS, and has sub-millisecond latencies, so it is really meant for High Performance Computing.

31
Q

Amazon FSx for Lustre and S3

A

has seamless integration with S3. That means you can read your S3 data as if it were a file system through FSx for Lustre, and you can write the output of whatever computation you’re doing back to S3

a way to expose your S3 buckets as a file system to your Linux instances.

can also be used from on-premise servers

32
Q

S3 - Glacier - EFS - FSx for Windows - FSx for Lustre - EBS volumes - Instance Storage - Storage Gateway - Snowball/Snowmobile - Database

A

S3: object storage, serverless, no need to provision capacity ahead of time, deep integration with many AWS services

Glacier: object archival, store objects for a long period of time, retrieve them rarely, and retrieval takes a lot of time

EFS: a network file system for Linux instances, POSIX file system, accessible from all your EC2 instances at once, shared and across AZ.

FSx for Windows: same thing as EFS, but for Windows.

FSx for Lustre: the name comes from "Linux" and "cluster"; a Linux file system for High Performance Computing, with very high IOPS, very large capacity, and integration with S3

EBS volumes: network storage for one EC2 instance at a time only, bound to the Availability Zone it was created in. To change the AZ, you need to create a snapshot and create a new volume from it in the target AZ.

Instance Storage: physical storage for your EC2 instance. Because it is attached to the host hardware, it has much higher IOPS than EBS. The risk is that if your EC2 instance goes down, you lose that storage permanently.

Storage Gateway: transports files from on premises to AWS.

Snowball/Snowmobile: physically moves large amounts of data to the cloud, into S3.

Database: a way of storing data for more specific workloads, usually combined with indexing and querying.
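The summary above can be condensed into a small lookup, sketched here as a study aid. The requirement phrases used as keys and the `recommend` helper are invented for illustration; only the service-to-use-case pairings come from the card.

```python
# Requirement phrases (hypothetical wording) mapped to the service the
# summary above associates with them.
STORAGE_BY_NEED = {
    "object storage": "S3",
    "object archival": "Glacier",
    "shared linux file system": "EFS",
    "shared windows file system": "FSx for Windows",
    "hpc file system": "FSx for Lustre",
    "block volume for one instance": "EBS",
    "highest-iops ephemeral disk": "Instance Storage",
    "on-premises bridge to aws storage": "Storage Gateway",
    "physical bulk transfer": "Snowball/Snowmobile",
}


def recommend(need: str) -> str:
    """Return the matching service, or a fallback when the phrase is unknown."""
    return STORAGE_BY_NEED.get(need.strip().lower(), "no match in this summary")


print(recommend("Shared Windows file system"))
print(recommend("HPC file system"))
```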

33
Q

You need to move hundreds of Terabytes into the cloud in S3, and after that pre-process it using many EC2 instances in order to clean the data. You have a 1 Gbit/s broadband and would like to optimize the process of moving the data and pre-processing it, in order to save time. What do you recommend?

A

Snowball Edge is the right answer: it comes with computing capabilities and allows us to pre-process the data while it is being moved on the device, so we save time on the pre-processing side as well.
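A back-of-the-envelope calculation shows why the network is not attractive here. The sketch below assumes decimal units (1 TB = 10^12 bytes), a fully dedicated link, and no protocol overhead; the `utilization` parameter and the sample sizes are assumptions added for illustration.

```python
def transfer_days(terabytes: float, link_gbit_per_s: float, utilization: float = 1.0) -> float:
    """Days needed to push `terabytes` (decimal TB) over a link at the given utilization."""
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9 * utilization)
    return seconds / 86_400


for tb in (100, 300, 500):
    print(f"{tb} TB over 1 Gbit/s: {transfer_days(tb, 1):.1f} days")
```

Even in this best case, 100 TB takes a little over 9 days on a 1 Gbit/s link, and "hundreds of terabytes" stretches into weeks, before any pre-processing starts.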

34
Q

You want to expose virtually infinite storage for your tape backups. You want to keep the same software as today and want an iSCSI-compatible interface. What do you use?

A

Tape Gateway

35
Q

Your EC2 Windows servers need to share some data by mounting a network file system that respects Windows security mechanisms and integrates with Active Directory. What do you recommend putting in place as the network file system?

A

FSx for Windows

36
Q

You would like to have a distributed, POSIX-compliant file system that will allow you to maximize IOPS in order to perform some HPC and genomics computational research. That file system will have to scale easily to millions of IOPS. What do you recommend?

A

FSx for Lustre