Storage Flashcards

1
Q

EBS

A
  • Network drive you attach to ONE instance only.
  • Tied to a specific AZ; the only way to move a volume is to snapshot it and restore it in the target AZ
  • Volumes can be resized
  • Best performance when EBS and instance type are well matched
2
Q

EBS Volume Types

A
  • gp2: General Purpose SSD (cheap)
    • 3 IOPS/GB, min 100 IOPS, burst to 3000 IOPS, max 16000 IOPS
    • 1GB - 16TB; each additional 1TB adds 3000 IOPS
  • io1: Provisioned IOPS SSD (expensive)
    • Min 100 IOPS, max 64000 IOPS (Nitro) or 32000 (other)
    • 4GB - 16TB; volume size and IOPS are set independently
    • For databases
  • st1: Throughput Optimized HDD
    • 500GB - 16TB, up to 500 MB/s throughput
    • For data analytics
  • sc1: Cold HDD
    • 250GB - 16TB, up to 250 MB/s throughput
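The gp2 rule on this card (3 IOPS/GB, floor of 100, cap of 16000) can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `gp2_baseline_iops` is made up for illustration and is not an AWS API.

```python
def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a gp2 volume using the numbers on this card."""
    if not 1 <= size_gb <= 16 * 1024:
        raise ValueError("gp2 volumes range from 1GB to 16TB")
    # 3 IOPS per GB, never below 100 and never above 16000
    return max(100, min(16_000, 3 * size_gb))


for size_gb in (10, 100, 1024, 6000):
    print(size_gb, "GB ->", gp2_baseline_iops(size_gb), "IOPS")
```

Small volumes sit on the 100 IOPS floor, and anything above roughly 5.3TB hits the 16000 cap, so extra size past that point buys capacity but no more IOPS.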
3
Q

EBS RAID Configurations

A
  • RAID 0, striped - faster, but if one volume fails the data on the whole array is lost
  • RAID 1, mirrored - same speed; if one volume fails there is no data loss
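To see why a RAID 0 failure is worse than it first sounds, here is a toy model that stripes or mirrors a byte string across two in-memory "volumes". The helper names `stripe` and `mirror` and the 4-byte chunk size are assumptions chosen for illustration; real RAID works on disk blocks.

```python
def stripe(data: bytes, volumes: int = 2, chunk: int = 4) -> list[bytes]:
    """RAID 0: deal fixed-size chunks round-robin across the volumes."""
    parts = [bytearray() for _ in range(volumes)]
    for offset in range(0, len(data), chunk):
        parts[(offset // chunk) % volumes] += data[offset:offset + chunk]
    return [bytes(p) for p in parts]


def mirror(data: bytes, volumes: int = 2) -> list[bytes]:
    """RAID 1: every volume holds a full copy."""
    return [data] * volumes


data = b"ABCDEFGHIJKLMNOP"
raid0 = stripe(data)
raid1 = mirror(data)

# Pretend volume 0 failed and look at what the survivor holds.
print("RAID 0 survivor:", raid0[1])
print("RAID 1 survivor:", raid1[1])
```

The RAID 0 survivor holds only every other chunk, so files that span chunks are unreadable, while the RAID 1 survivor still holds everything.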
4
Q

EBS Snapshots

A
  • Incremental
  • Use I/O, so avoid taking them while the application is handling heavy traffic
  • Stored in S3 (not directly visible)
  • Not necessary to detach volume to snapshot, but recommended
  • Can copy across region for DR
  • Can create AMI from snapshot
  • EBS volumes restored from snapshots need to be pre-warmed (use fio or dd to read the entire volume)
  • Can be automated using Amazon Data Lifecycle Manager
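Pre-warming means touching every block once so it is pulled down from the snapshot. The card names fio or dd for this; the sketch below does the same sequential read in Python purely to show the idea. The function name `read_every_block` and the device path in the comment are assumptions, and reading a real block device needs root.

```python
def read_every_block(path: str, block_size: int = 1024 * 1024) -> int:
    """Read a file or block device from start to end and return the bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:  # empty read means end of device
                break
            total += len(chunk)
    return total


# Hypothetical usage on an instance (device name will differ):
#   read_every_block("/dev/xvdf")
```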
5
Q

Local EC2 Instance Store

A
- Physical disk, very high IOPS
- Up to 7.5TB; striped, can reach 30TB
- Block storage
- Cannot be increased in size
- Risk of data loss if hardware fails
- Ephemeral: storage is lost when the EC2 instance is stopped or terminated
- Survives reboots
- Good for buffer, cache, scratch data
- Backups are manual
6
Q

EFS

A
  • Linux-based only, POSIX, NFSv4
  • Good for data sharing, CMS
  • Control access using SGs
  • Encryption at rest with KMS
  • Only one VPC, but can create one mount target per AZ for redundancy
7
Q

EFS Scale

A
  • 1000s of concurrent NFS clients, 10GB+/s throughput
  • Grows to petabyte scale

8
Q

EFS Performance

A
  • General Purpose
  • Max I/O
  • Set at EFS creation time
9
Q

EFS Throughput

A

Bursting: throughput is linked to file system size

Provisioned Throughput: expensive, for a high throughput-to-size ratio

10
Q

EFS VPC peering

A

EC2 instances in another VPC can reach the file system over VPC peering

11
Q

EFS on-prem

A
  • Can be connected using Direct Connect and/or VPN
  • Accessed using the mount target's IPv4 address; the hostname is not supported

12
Q

S3 vs DynamoDB

A

S3 has no indexing facility, so keep an index in DynamoDB:

  1. Use an S3 event to notify Lambda
  2. Lambda fetches the object from S3 and inserts its metadata and indexed data into DynamoDB
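The two steps above can be sketched as a Lambda handler. This is a hedged outline, not a reference implementation: the table name `s3-object-index`, the item attribute names, and the extra `table` parameter (added so the handler can be exercised without AWS) are all assumptions. It indexes only the metadata carried in the event; fetching the object body to extract more fields is left out.

```python
from urllib.parse import unquote_plus


def items_from_s3_event(event: dict) -> list[dict]:
    """Turn an S3 event notification into one index item per object."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        items.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in the event payload
            "key": unquote_plus(s3["object"]["key"]),
            "size": s3["object"].get("size", 0),
            "event_time": record.get("eventTime", ""),
        })
    return items


def handler(event, context, table=None):
    if table is None:
        import boto3  # available in the Lambda runtime
        # Hypothetical table name, replace with your own
        table = boto3.resource("dynamodb").Table("s3-object-index")
    items = items_from_s3_event(event)
    for item in items:
        table.put_item(Item=item)
    return len(items)
```

Queries such as "all objects larger than N" or "objects uploaded on a given day" then run against DynamoDB instead of listing the bucket.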
13
Q

S3 vs EFS

A

S3 is not suited to POSIX access or file locking; use EFS instead

14
Q

S3 Replication

A
  • For latency, for DR, for security
  • Cross Region
  • Same Region
  • Can combine with lifecycle policies
  • Versioning must be enabled on both the source and destination buckets
15
Q

S3 Event Notifications

A
  • Typically delivered in seconds, but can take a minute or longer
  • If two writes hit the same non-versioned object at the same time, only one event may be sent
  • To get an event for every successful write, enable versioning
16
Q

S3 CloudTrail

A
  • When CloudTrail is enabled, it records all bucket-level API calls by default
  • Object-level calls can be logged by enabling CloudTrail data events for the bucket
17
Q

S3 Baseline Performance

A
  • 3500 PUT/COPY/POST/DELETE requests per second per prefix
  • 5500 GET/HEAD requests per second per prefix
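Because the limits are per prefix, spreading objects across more prefixes raises the aggregate rate. A small worked calculation, with the function name `max_request_rates` invented for illustration:

```python
WRITES_PER_PREFIX = 3500  # PUT/COPY/POST/DELETE per second
READS_PER_PREFIX = 5500   # GET/HEAD per second


def max_request_rates(prefixes: int) -> dict:
    """Aggregate baseline request rates when load is spread evenly over prefixes."""
    return {
        "writes_per_second": prefixes * WRITES_PER_PREFIX,
        "reads_per_second": prefixes * READS_PER_PREFIX,
    }


print(max_request_rates(10))
```

With 10 prefixes read in parallel, the baseline works out to 55000 GET/HEAD requests per second.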

18
Q

S3 Performance Optimizations Upload

A
  • Multi-part upload, parallel uploads
    • Recommended for > 100MB
    • Must use for > 5GB
    • A failed part is retried, not the whole upload
  • Transfer Acceleration
    • Compatible with multi-part upload
    • Uploads go to an edge location, then travel over the fast AWS network to the bucket's Region
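The size thresholds and the "retry one part" idea can be sketched without touching AWS. Both helper names below and the 100MB default part size are assumptions for illustration; an SDK's transfer manager normally makes these choices for you.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def upload_strategy(size_bytes: int) -> str:
    """Apply the thresholds from this card to pick an upload approach."""
    if size_bytes > 5 * GB:
        return "multipart required"
    if size_bytes > 100 * MB:
        return "multipart recommended"
    return "single PUT is fine"


def part_ranges(size_bytes: int, part_size: int = 100 * MB) -> list[tuple[int, int, int]]:
    """Return (part_number, start, end_exclusive) per part; part numbers start at 1."""
    return [
        (number, start, min(start + part_size, size_bytes))
        for number, start in enumerate(range(0, size_bytes, part_size), start=1)
    ]


print(upload_strategy(250 * MB))
print(part_ranges(250 * MB))
```

Each tuple is an independent unit of work, so parts can be uploaded in parallel and a failed part can be sent again on its own.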
19
Q

S3 Performance Optimizations Download

A
  • S3 Byte-Range Fetches
    • Parallelise GETs by requesting specific byte ranges
    • A failed range is re-requested on its own, not the whole object; better resilience
    • Can be used to retrieve only part of a file, e.g. its head
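A byte-range fetch is an ordinary GET with an HTTP `Range` header, whose end offset is inclusive. The sketch below only builds the header values; the function names are made up, and each value would be passed as the range of a separate GET request.

```python
def range_headers(size_bytes: int, chunk_size: int) -> list[str]:
    """Split an object of size_bytes into Range header values of chunk_size each."""
    return [
        f"bytes={start}-{min(start + chunk_size, size_bytes) - 1}"
        for start in range(0, size_bytes, chunk_size)
    ]


def head_range(n_bytes: int) -> str:
    """Range header value for only the first n_bytes of an object."""
    return f"bytes=0-{n_bytes - 1}"


print(range_headers(250, 100))
print(head_range(16))
```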
20
Q

S3 Select & Glacier Select

A
  • Optimise data transfer size by doing server side filtering using SQL
  • Less network transfer, less CPU cost client-side
  • Cheaper
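As an illustration of server-side filtering, the sketch below builds the arguments for boto3's `select_object_content` call against a CSV object with a header row. The bucket, key, column names, and the helper name `select_request` are hypothetical, and the call itself is left as a comment because it needs AWS credentials.

```python
def select_request(bucket: str, key: str, expression: str) -> dict:
    """Build keyword arguments for an S3 Select query over a CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as column names usable in the SQL
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


params = select_request(
    "example-logs-bucket",  # hypothetical bucket
    "2024/requests.csv",    # hypothetical key
    "SELECT s.path FROM S3Object s WHERE s.status = '500'",
)

# Hypothetical usage:
#   import boto3
#   response = boto3.client("s3").select_object_content(**params)
```

Only the matching rows cross the network, which is where the transfer and client-side CPU savings come from.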
21
Q

S3 CloudFront

A
  • Require that your users access your private content by using special CloudFront signed URLs or signed cookies.
  • Require that your users access your content by using CloudFront URLs, not URLs that access content directly on the origin server (for example, Amazon S3 or a private HTTP server). Requiring CloudFront URLs isn’t necessary, but we recommend it to prevent users from bypassing the restrictions that you specify in signed URLs or signed cookies.