Storage Flashcards

1
Q

What is the difference between EBS, EFS, and object storage?

A
  • EBS: a volume attaches to one EC2 instance at a time (io1/io2 Multi-Attach is the exception).
  • EFS: can be shared across multiple EC2 instances.
  • EBS: mountable and bootable.
  • EFS: mountable, not bootable.
  • Object storage: a collection of objects; not mountable, not bootable.
2
Q

What is the consistency of S3 storage?

A

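S3 offers strong read-after-write consistency for all operations: creates (new PUTs), updates (overwrite PUTs), deletes, and listings. Before December 2020, updates and deletes were only eventually consistent.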

3
Q

Types of EBS storage?

A
  • General Purpose SSD ($): gp2, gp3
  • Provisioned IOPS SSD: io1, io2
  • Throughput Optimized HDD ($$): st1
  • Cold HDD ($): sc1
4
Q

What are the 3 ways of data egress from Glacier?

A

Expedited (1-5 minutes), Standard (3-5 hours), and Bulk (5-12 hours).

5
Q

What is the difference between HA and Fault Tolerant in the context of deploying to AZs?

A

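HA just meets the SLA in normal operation, with no spare capacity if an AZ fails. Fault tolerant still fully meets the SLA after an AZ fails. If a minimum of 4 servers is required for the SLA, 4 servers across 2 AZs is HA (only 2 survive an AZ failure), whereas 8 servers across 2 AZs is fault tolerant (4 survive).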
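
The arithmetic behind this card can be sketched as a small check. This is only an illustration: the function names are made up, and it assumes servers are spread as evenly as possible and that the AZ holding the most servers is the one that fails.

```python
def servers_after_az_loss(total_servers: int, az_count: int) -> int:
    """Servers left when the fullest AZ fails, assuming an even spread."""
    largest_az = -(-total_servers // az_count)  # ceiling division
    return total_servers - largest_az


def classify(total_servers: int, az_count: int, required: int) -> str:
    """Label a deployment using the HA / fault-tolerant distinction above."""
    surviving = servers_after_az_loss(total_servers, az_count)
    if surviving >= required:
        return "fault tolerant"    # SLA capacity still met after an AZ failure
    if surviving > 0:
        return "highly available"  # still up, but below SLA capacity
    return "neither"


print(classify(4, 2, 4))  # highly available
print(classify(8, 2, 4))  # fault tolerant
```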

6
Q

If you take a snapshot every day and it takes you 10 minutes to recover an instance on failure, what can the best backup RTO and RPO be?

A

RTO of 10 minutes and RPO of 1 day.

7
Q

Relation between block size and IOPS?

A

block size x IOPS = throughput
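
A quick worked example of the formula, as a sketch. The 16 KiB block size and 3,000 IOPS figures are illustrative inputs, not limits of any particular volume type.

```python
def throughput_mib_per_s(block_size_kib: float, iops: float) -> float:
    """throughput = block size x IOPS, converted from KiB/s to MiB/s."""
    return block_size_kib * iops / 1024


print(throughput_mib_per_s(16, 3000))  # 46.875
```

The same IOPS figure with a larger block size gives proportionally more throughput, which is why throughput-oriented workloads favor large sequential I/O.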

8
Q

Talk about EBS

A
  • Allocates block storage (Volumes) to instances
  • 1 volume = 1 AZ, but highly available within that AZ: all data is replicated within the AZ
  • Different storage types: magnetic (HDD), SSD, etc.
  • Billed per GB-month
9
Q

Dominant Performance Attribute of gp2/io1/st1/sc1?

A

gp2 and io1 - IOPS oriented

st1 and sc1 - throughput oriented

10
Q

Can HDD storage be used as a boot volume?

A

No. The st1 and sc1 HDD volume types cannot be boot volumes; use an SSD type (gp2, gp3, io1, io2).

11
Q

Why would you choose SSD over HDD?

A
  • SSD is better suited to random I/O, e.g. databases
  • HDD is better suited to streaming large amounts of data sequentially, e.g. log files and big data use cases

12
Q

What is burst performance?

A

To understand burst mode, you must be aware that every gp2 volume regardless of size starts with 5.4 million I/O credits at 3000 IOPS. This means that even for very small volumes, you start with a high-performing volume. This is ideal for “bursty” workloads, such as daily reporting and recurring extract, transform, and load (ETL) jobs. It is also good for workloads that don’t require high-sustained IOPS.

How does this work? As stated earlier, gp2 volumes start with I/O credits that, if fully used, work out to 3,000 IOPS for 30 minutes. Burst credits are continuously replenished at the rate of 3 I/O credits per GiB of volume size per second. Consider a daily ETL workload that uses a lot of I/O. For the daily job, gp2 can burst, and during downtime, burst credits can be replenished for the next day’s run. Now let’s consider a workload that never consumes more IOPS than the burst. Such a workload will continue to see very good IOPS as long as credits are replenished faster than they are consumed.
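
The 30-minute figure can be checked with a minimal sketch of the credit bucket. The function name and the `refill_per_second` parameter are assumptions for illustration; the model simply drains the initial credits at the burst rate minus whatever is being replenished.

```python
INITIAL_CREDITS = 5_400_000  # starting I/O credits of every gp2 volume
BURST_IOPS = 3_000           # gp2 burst rate


def burst_minutes(refill_per_second: float = 0.0) -> float:
    """Minutes of sustained burst before the credit bucket is empty."""
    net_drain = BURST_IOPS - refill_per_second
    if net_drain <= 0:
        return float("inf")  # credits refill at least as fast as they drain
    return INITIAL_CREDITS / net_drain / 60


print(burst_minutes())                # 30.0 (no refill)
print(round(burst_minutes(300), 1))   # 33.3 (100 GiB volume earning 300 credits/s)
```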

13
Q

What are EBS Snapshots?

A
  • Backups of an EBS volume, stored in S3
  • The first snapshot is a full copy of the data
  • Later snapshots are incremental
  • A volume can be restored from a snapshot
15
Q

What is FSR?

A

Fast Snapshot Restore: immediately populates a volume from a snapshot; otherwise blocks are populated lazily on demand.

  • Up to 50 FSRs per region (50 snapshot-to-AZ pairs)

16
Q

EBS Encryption

A
  • Uses KMS: a KMS key (CMK) protects the data encryption key (DEK), which is stored encrypted with the volume
  • Accounts can be set to encrypt by default
  • Each volume uses 1 DEK
  • Encrypted volumes cannot be changed to unencrypted
  • Curveball question: the OS is not aware of the encryption and there is no performance loss; encryption happens between the host and the EBS volume
17
Q

Object versioning: what are the versioning states of an S3 bucket?

A

  • “Disabled” (the default)
  • “Enabled”: once enabled, versioning cannot be disabled again, only suspended
  • “Suspended”: can be re-enabled later
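
The three states can be sketched as a small transition table. This encodes only the transitions listed on this card; the dictionary and function names are illustrative, not part of any S3 API.

```python
# Allowed versioning transitions, as described on this card.
ALLOWED_TRANSITIONS = {
    "disabled": {"enabled"},
    "enabled": {"suspended"},   # never back to "disabled"
    "suspended": {"enabled"},
}


def can_transition(current: str, target: str) -> bool:
    """True if a bucket in state `current` may move to state `target`."""
    return target in ALLOWED_TRANSITIONS[current]


print(can_transition("enabled", "disabled"))   # False
print(can_transition("suspended", "enabled"))  # True
```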

18
Q

What are the 3 types of S3 encryption?

A
  • SSE-C: server-side encryption with customer-provided keys
  • SSE-S3: server-side encryption with Amazon S3 managed keys
  • SSE-KMS: server-side encryption with customer master keys (CMKs) stored in KMS
19
Q

How does SSE-C work?

A
  • The customer is responsible for the encryption keys
  • When storing an object, you must provide the key along with the data
  • A one-way hash of the key is taken and attached to the object
  • When requesting the object, you provide the same encryption key; its hash is compared with the stored one and, if they match, the data is decrypted and returned
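
The key-hash check in the last two steps can be sketched with a toy in-memory store. This is not how S3 is implemented: the class is hypothetical, plain SHA-256 stands in for whatever hash the service uses, and the actual encryption of the data is left out to keep the focus on the key check.

```python
import hashlib
import hmac


class ToyKeyCheckedStore:
    """Keeps a one-way hash of the caller's key next to each object."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, name: str, data: bytes, key: bytes) -> None:
        # A real service would also encrypt `data` with `key` and then drop the key.
        self._objects[name] = (hashlib.sha256(key).digest(), data)

    def get(self, name: str, key: bytes) -> bytes:
        stored_hash, data = self._objects[name]
        supplied_hash = hashlib.sha256(key).digest()
        if not hmac.compare_digest(stored_hash, supplied_hash):
            raise PermissionError("supplied key does not match the stored hash")
        return data


store = ToyKeyCheckedStore()
store.put("report.csv", b"a,b,c", key=b"customer-key-1")
print(store.get("report.csv", key=b"customer-key-1"))  # b'a,b,c'
```

Because only the hash is kept, the store can verify a key but cannot recover it, which is why losing the key means losing the object.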
20
Q

How does SSE-S3 work?

A
  • S3 generates a master key used to encrypt all objects
  • You cannot influence this master key or change any options; it is invisible to you and auto-rotated by S3
  • For every object stored in the encrypted bucket, S3 generates an encryption key, encrypts the data with it, encrypts that key with the master key, stores the encrypted key with the encrypted data, and discards the unencrypted key
  • No role-based separation: an S3 admin can view unencrypted data because S3 manages the keys. This can be unacceptable in regulated industries where separation of roles is required (use SSE-KMS)
21
Q

How does SSE-KMS work?

A
  • S3 generates a CMK and stores it in KMS, or you can ask it to use a customer-managed CMK that you created in KMS
  • For every object, a DEK is generated under the CMK; the data is encrypted with the DEK, and the encrypted DEK and encrypted data are stored together
  • For decryption, the CMK is used to decrypt the DEK, which is then used to decrypt the data
  • The decrypted DEK is then discarded
  • If you as a user do not have access to the CMK in KMS, or to KMS itself, you cannot decrypt the object, so role separation is achieved
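
The DEK-under-CMK flow above is envelope encryption, and it can be sketched with the standard library alone. Everything here is illustrative: the function names are made up, the CMK is just a local byte string rather than a key held in KMS, and the XOR keystream is a toy stand-in for a real cipher that must not be used to protect data.

```python
import hashlib
import secrets


def _toy_xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256-derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def encrypt_object(cmk: bytes, plaintext: bytes):
    dek = secrets.token_bytes(32)             # fresh data key for this object
    ciphertext = _toy_xor_cipher(dek, plaintext)
    wrapped_dek = _toy_xor_cipher(cmk, dek)   # DEK encrypted under the CMK
    return wrapped_dek, ciphertext            # the plaintext DEK is not kept


def decrypt_object(cmk: bytes, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = _toy_xor_cipher(cmk, wrapped_dek)   # needs access to the CMK
    return _toy_xor_cipher(dek, ciphertext)


cmk = secrets.token_bytes(32)
wrapped, blob = encrypt_object(cmk, b"quarterly numbers")
print(decrypt_object(cmk, wrapped, blob))  # b'quarterly numbers'
```

Without the CMK, the stored pair of wrapped DEK and ciphertext is useless, which is the role separation the card describes.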
22
Q

What are the S3 storage classes?

A
  • S3 Standard: replicated across at least 3 AZs; an HTTP/1.1 200 OK response indicates the object has been stored durably; billed per GB-month; used for frequently accessed data
  • S3 Standard-IA (Infrequent Access): 3 AZs, same 11 9s durability as Standard (availability is slightly lower), cheaper to store; per-request and data-out charges are the same as Standard. Trade-offs: a retrieval fee for every GB, a minimum duration charge of 30 days, and a minimum billable size of 128 KB per object. Use for long-lived data, not for temporary data
  • S3 One Zone-IA: same as Standard-IA but cheaper; data is stored in only one AZ (replicated within it, not across AZs), so it can be lost if that AZ fails; the 11 9s durability figure assumes the AZ does not fail. Use for long-lived, infrequently accessed, non-critical data that can be easily replaced (think: replica copies across regions, intermediate data that you can afford to lose)
  • S3 Glacier: 11 9s, 3 AZs, 1/5th of the cost; retrieval options are “expedited” (1-5 minutes), “standard” (3-5 hours), and “bulk” (5-12 hours); 40 KB minimum charge; 90-day minimum billable duration
  • S3 Glacier Deep Archive: even more restrictions; 3 AZs; 180-day minimum billable duration; 12 to 48 hours restore time
23
Q

What is S3 Select or Glacier Select?

A
  • Uses SQL-like SELECT statements to retrieve only part of an S3 object, which reduces the bandwidth used on a huge object
  • Works on formats such as CSV, JSON, Parquet, BZIP, etc.
24
Q

What is Lifecycle Configuration?

A
  • A set of rules on a bucket or a group of objects that trigger Transition or Expiration actions
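
As a sketch, a rule with both kinds of action might look like the dictionary below, written in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, prefix, and day counts are hypothetical values chosen for illustration.

```python
import json

# Hypothetical rule: the ID, prefix, and day counts are illustrative only.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",
            "Filter": {"Prefix": "logs/"},      # the group of objects the rule covers
            "Status": "Enabled",
            "Transitions": [                    # Transition actions
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},        # Expiration action
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```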
25
Q

What does baseline performance of gp2 mean?

A

Every gp2 volume has a baseline IOPS level based on its size, with a minimum:

  • The minimum is 100 I/O credits per second regardless of volume size
  • Credits accrue at 3 I/O credits per second per GB, so a volume under 33.33 GB earns nothing beyond the minimum
  • Every volume starts with 5.4 million credits

gp2 can burst up to 3,000 IOPS; that is the burst rate.
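
The baseline rule can be written as a one-line formula. This is a minimal sketch: the function names are made up, the upper limit that applies to very large volumes is ignored, and the refill estimate assumes the volume is completely idle while it earns credits.

```python
INITIAL_CREDITS = 5_400_000  # size of the credit bucket


def baseline_iops(size_gb: float) -> float:
    """3 I/O credits per second per GB, with a floor of 100."""
    return max(100, 3 * size_gb)


def refill_hours(size_gb: float) -> float:
    """Hours for an idle volume to refill an empty credit bucket."""
    return INITIAL_CREDITS / baseline_iops(size_gb) / 3600


print(baseline_iops(20))   # 100  (below 33.33 GB, the floor applies)
print(baseline_iops(500))  # 1500
print(refill_hours(100))   # 5.0
```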

26
Q

What is EFS?

A

  • Elastic File System is an implementation of NFSv4
  • Can be mounted on many EC2 instances at once
  • Can be mounted on Linux only (exam point)
  • Isolated to the VPC that it is provisioned into, but can be mounted across VPCs with some special steps
  • Access is via mount targets
  • Performance modes: General Purpose (default) and Max I/O (scales to higher throughput, at higher latency)
  • Throughput modes: Bursting and Provisioned
  • Storage classes: Standard and IA (Infrequent Access)
  • Lifecycle policies can be used with the classes

27
Q

What are IOPS?

A

Input/Output Operations Per Second

  • One I/O operation is counted as a chunk of up to 16 KB; IOPS is the number of such operations per second
  • Transferring 160 KB of data in one second therefore represents 10 IOPS