S3 - Simple Storage Service Flashcards

1
Q

What is S3?

A

A place to store your files.

  • Object based storage
  • File size can range from 0 bytes to 5TB
  • Unlimited Storage
  • Buckets

Not suitable for installing an operating system on.

Newly created buckets are private by default, but you can set up access control via

  • Bucket Policies
  • Access Control Lists
2
Q

Do S3 bucket names have to be unique?

A

Yes. Bucket names share a universal (global) namespace, because each bucket name becomes part of a web address. ex: https://britzer.s3.amazon……
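
Because the bucket name ends up inside a DNS host name, it has to follow DNS-style naming rules. A minimal sketch of a validator, assuming the commonly documented rules for general purpose buckets (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starts and ends with a letter or digit; not shaped like an IP address). `is_valid_bucket_name` is a hypothetical helper, and it can't check global uniqueness, which only S3 itself knows:

```python
import re

# Hypothetical helper: checks basic naming rules only, not global uniqueness.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic S3 bucket naming rules."""
    if not _NAME_RE.fullmatch(name):
        return False          # wrong length or illegal characters
    if ".." in name:
        return False          # adjacent dots are not allowed
    if _IP_RE.fullmatch(name):
        return False          # must not look like an IP address
    return True
```

For example, `is_valid_bucket_name("britzer")` passes, while `"Britzer"` (uppercase) and `"192.168.1.1"` (IP-shaped) do not.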

3
Q

S3 Objects Look like:

A
  • Key
  • Value
  • Version ID
  • Metadata
  • Subresources
    • Access Control Lists
    • Torrent

A successful upload returns an HTTP 200 status code.
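
The parts listed above can be pictured as one record. This is only an illustrative model, not an AWS SDK type; the class name `S3ObjectSketch` and the sample values are assumptions:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class S3ObjectSketch:
    """Illustrative model only; not an AWS SDK type."""
    key: str                              # object name, e.g. "photos/cat.jpg"
    value: bytes                          # the data itself
    version_id: Optional[str] = None      # set when versioning is enabled
    metadata: Dict[str, str] = field(default_factory=dict)      # data about the data
    subresources: Dict[str, str] = field(default_factory=dict)  # e.g. ACL, torrent


obj = S3ObjectSketch(
    key="photos/cat.jpg",
    value=b"\xff\xd8...",
    metadata={"Content-Type": "image/jpeg"},
    subresources={"acl": "private"},
)
```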

4
Q

How does data consistency work for S3?

A
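  • Strong read-after-write consistency for PUTs of new objects -> immediate access to whatever was uploaded
  • Since December 2020, overwrite PUTs and DELETEs are strongly consistent as well; before that they were eventually consistent -> changes took time to propagate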
5
Q

S3 Guarantee for lost data

A

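99.999999999% durability (11 9s). The 99.99% figure is the availability that S3 Standard is designed for, not its durability.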
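
To get a feel for what 11 nines of durability means, here is a small worked calculation. The workload of 10 million objects is an assumed example, and the figure is treated as an annual per-object probability:

```python
from fractions import Fraction

# Designed durability of 99.999999999% (11 nines), written exactly.
durability = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - durability           # 1 in 100 billion per object

objects_stored = 10_000_000                        # assumed example workload
expected_losses_per_year = objects_stored * annual_loss_probability
years_per_lost_object = 1 / expected_losses_per_year

print(years_per_lost_object)  # 10000
```

Under these assumptions you would expect to lose roughly one object every 10,000 years.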

6
Q

S3 Features

A
  • Tiered Storage
  • Lifecycle Management - Lifecycle Policies
    • Automates moving your objects between different storage tiers
    • Can be used with versioning
    • Can be applied to current and previous versions
  • Versioning
  • Encryption
  • MFA Delete
  • Secure Data via Access Control Lists & Bucket Policies
7
Q

S3 Storage Classes/Tiers

A
  1. S3 Standard
    • 99.99% available
    • 99.999999999% durable (11 9’s)
  2. S3 - IA (Infrequently Accessed)
    • For data that is accessed less frequently but requires rapid access when needed
    • Lower fee than Standard but with a retrieval fee
  3. S3 - One Zone - IA
    • Lower cost
    • Data is stored in a single Availability Zone, so availability is lower
  4. S3 - Intelligent Tiering
    • Optimizes costs by automatically moving data to the most cost-effective tier
  5. S3 - Glacier
    • Low cost
    • Retrieval times are configurable, from minutes to hours
  6. S3 - Glacier Deep Archive
    • Lowest cost
    • Retrieval within 12 hours
8
Q

How are you charged for S3?

A
  • Storage
  • Number of requests
  • Storage management pricing
  • Data transfer
  • Transfer Acceleration
  • Cross-region replication
9
Q

S3 Costs

A
10
Q

S3 - Encryption

A

You can encrypt at the object level or the bucket level.

  • In Transit
    • HTTPS (SSL/TLS)
  • At Rest
    • Encrypts the data being stored
    • Server Side
      • S3 Managed Keys - Amazon managed (SSE-S3)
      • AWS Key Management Service (SSE-KMS)
      • SSE with Customer Provided Keys (SSE-C)
    • Client Side
      • Encrypt the data yourself, then upload it
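
On the wire, the three server-side options are selected with request headers. A minimal sketch that builds those headers without calling AWS; `sse_headers` is a hypothetical helper, and the SSE-KMS case leaves out the optional key ID header for brevity:

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str, customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Build the S3 request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        return {"x-amz-server-side-encryption": "aws:kms"}
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        prefix = "x-amz-server-side-encryption-customer-"
        return {
            prefix + "algorithm": "AES256",
            # The key and its MD5 digest are both sent base64-encoded.
            prefix + "key": base64.b64encode(customer_key).decode(),
            prefix + "key-MD5": base64.b64encode(
                hashlib.md5(customer_key).digest()
            ).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")
```

With SSE-C the key travels with every request, which is why that mode is only usable over HTTPS.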
11
Q

S3 Versioning

A

Stores all versions of an object

Once enabled, versioning can’t be disabled, only suspended

Versioning supports MFA Delete
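
A toy in-memory model can make the behaviour concrete: every PUT adds a version, and a plain DELETE only stacks a delete marker on top. This is a sketch of the idea, not the S3 API; the class name and the `v1`, `v2` style IDs are made up:

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = None  # stands in for S3's delete marker


class VersionedBucketSketch:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, Optional[bytes]]]] = {}
        self._ids = itertools.count(1)

    def put(self, key: str, data: bytes) -> str:
        """Store a new version; older versions are kept."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key: str) -> str:
        """A plain delete only adds a delete marker on top."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, DELETE_MARKER))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        """Return the latest version, or a specific one if version_id is given."""
        history = self._versions.get(key, [])
        if version_id is None:
            return history[-1][1] if history else None
        return next((d for v, d in history if v == version_id), None)


bucket = VersionedBucketSketch()
first = bucket.put("notes.txt", b"draft")
bucket.put("notes.txt", b"final")
bucket.delete("notes.txt")
```

After these calls `bucket.get("notes.txt")` finds nothing, yet `bucket.get("notes.txt", first)` still returns the original draft.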

12
Q

S3 Lifecycle Management

A
  • Automates moving objects between different storage tiers
  • Can be used in conjunction with versioning
  • Can be applied to both current and previous versions
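
A lifecycle policy is just a set of rules. The dictionary below follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument; the rule ID, prefix, and day counts are assumed example values, and `storage_class_for_age` is a hypothetical helper that only evaluates the rule locally:

```python
# Rule ID, prefix, and day counts are assumed example values.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Applies to the current version of each object.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Applies to previous (noncurrent) versions.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_for_age(rule: dict, age_days: int) -> str:
    """Hypothetical helper: which tier a current version would be in."""
    current = "STANDARD"
    for step in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current
```

With this rule, a 45-day-old log file would sit in `STANDARD_IA` and a 120-day-old one in `GLACIER`.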
13
Q

S3 Object Lock/Glacier Lock

A

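Governance mode - Users can’t overwrite or delete an object version, or alter its lock settings, unless they have special permissions (S3)

Compliance mode - A protected object version can’t be overwritten or deleted by any user, including the root user, for the duration of its retention period (S3)

S3 Object Lock and Glacier Vault Lock both provide WORM (write once, read many) storage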

14
Q

S3 Performance

A

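S3 Prefix

In mybucketname/folder1/subfolder/myfile.jpg, the prefix is the part between the bucket name and the object name: /folder1/subfolder

More prefixes give better performance, because request rate limits apply per prefix

Multipart uploads -> splitting uploads into parts that are sent in parallel

Byte-range fetches -> splitting downloads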
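
Byte-range fetches rely on the standard HTTP `Range` header. A small sketch of how a download could be split into ranges that are then fetched in parallel; `byte_ranges` is a hypothetical helper and the sizes are example values:

```python
from typing import List


def byte_ranges(object_size: int, part_size: int) -> List[str]:
    """Split an object into HTTP Range header values of at most part_size bytes."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1   # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

If one range fails, only that range needs to be retried rather than the whole download.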

15
Q

S3 Select / Glacier Select

A

Enables applications to retrieve only a subset of data from an object using simple SQL expressions.

Up to 400% performance increase, and it saves money because less data is transferred.

16
Q

AWS Organizations

A

Organizations

Allows you to have multiple AWS accounts and manage them centrally.

Enable or disable AWS services using Service Control Policies (SCPs)

Consolidated Billing

One consolidated bill covering all AWS accounts in the organization

Easy to track

Volume pricing discounts

17
Q

Sharing S3 Buckets

A
  • Bucket Policies & IAM (apply to the entire bucket). Programmatic access only.
  • Bucket ACLs & IAM (apply to individual objects). Programmatic access only.
  • Cross-account IAM Roles. Programmatic AND AWS Console access.
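
As an illustration of the first option, a bucket policy that lets another account read a bucket could look like the sketch below. The account ID `111122223333` and the bucket name `example-bucket` are placeholders, and the chosen actions are only one plausible set:

```python
import json

# All identifiers below are placeholders made up for illustration.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountRead",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",      # needed for s3:ListBucket
                "arn:aws:s3:::example-bucket/*",    # needed for s3:GetObject
            ],
        }
    ],
}

policy_json = json.dumps(bucket_policy, indent=2)  # policies are attached as JSON text
```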
18
Q

S3 Cross region Replication (ex: US –> Japan)

A
  • Versioning must be enabled for both the source and destination bucket
  • Existing files aren’t replicated automatically, but…
  • All new files are replicated automatically
  • Delete markers are not replicated
  • Deleting individual versions is not replicated
19
Q

S3 Transfer Acceleration

A

Utilizes the CloudFront Edge Network to accelerate uploads to S3.

Instead of uploading directly to a bucket, you use a distinct URL to upload to an edge location, which then transfers the file to your S3 bucket.
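
The distinct URL is the bucket's accelerate endpoint. A small sketch of how that host name is formed; `accelerate_endpoint` is a hypothetical helper, and it rejects names with dots because Transfer Acceleration does not support them:

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration host name for a bucket."""
    if "." in bucket:
        raise ValueError("Transfer Acceleration needs a bucket name without dots")
    return f"{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("britzer"))  # britzer.s3-accelerate.amazonaws.com
```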

20
Q

What is AWS DataSync?

A

Allows you to move large amounts of data from on-premises storage into AWS

Used with NFS and SMB compatible file systems

Replication can be done hourly, daily, weekly

21
Q

CloudFront

A

A Content Delivery Network (CDN): a system of distributed servers that delivers web pages and other web content to a user based on the user’s geographic location.

Can be used to deliver your entire website, including dynamic, static, streaming, and interactive content, using a global network of Edge Locations.

Requests for your content are automatically routed to the nearest Edge Location, so content is delivered with the best possible performance.

Side Notes:

  • Edge locations are not just READ only - you can write to them too (i.e. put an object on them)
  • Objects are cached for the life of the TTL (Time To Live)
  • You can clear cached objects by invalidating them, but you’ll be charged

Key Terms:

Edge Location: location where content can be cached - separate from an AWS Region/AZ

Origin: This is the origin of all the files that the CDN will distribute. Can be an S3 Bucket, an EC2 Instance, an Elastic Load Balancer, or Route53

Distribution: This is the name given to the CDN, which consists of a collection of Edge Locations
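
The TTL and invalidation behaviour from the side notes can be sketched as a toy cache for a single edge location. This is a simplified model, not how CloudFront is implemented; the class name, the `origin` function, and the 60-second TTL are assumptions:

```python
from typing import Callable, Dict, List, Tuple


class EdgeCacheSketch:
    """Toy model of one edge location: cache objects until their TTL expires."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], ttl_seconds: float):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry)

    def get(self, path: str, now: float) -> bytes:
        cached = self._cache.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]                      # cache hit
        body = self._fetch(path)                  # miss or expired: go to origin
        self._cache[path] = (body, now + self._ttl)
        return body

    def invalidate(self, path: str) -> None:
        """Drop a cached object before its TTL runs out."""
        self._cache.pop(path, None)


origin_hits: List[str] = []


def origin(path: str) -> bytes:
    origin_hits.append(path)                      # record every trip to the origin
    return f"content of {path}".encode()


edge = EdgeCacheSketch(origin, ttl_seconds=60)
edge.get("/index.html", now=0)     # miss: fetched from origin
edge.get("/index.html", now=30)    # hit: served from the edge cache
edge.get("/index.html", now=61)    # TTL expired: fetched again
```

Three requests cause only two trips to the origin; calling `edge.invalidate("/index.html")` forces the next request back to the origin even though the TTL has not expired.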

22
Q

CloudFront Distribution Types

A
  • Web Distribution - Websites
  • RTMP - Media streaming (AWS discontinued RTMP distributions at the end of 2020)
23
Q

CloudFront Signed URLS & Cookies

A

Use signed URLs or signed cookies when you want to restrict content to authorized users

URLs VS Cookies

  • Signed URL is for individual files. 1 File = 1 URL
  • Signed Cookie is for multiple files. 1 Cookie = multiple files

CloudFront Signed URL

  • Can have different origins. Doesn’t have to be EC2 (but recommended)
  • Can utilize caching features and filter requests
  • Uses a key pair

S3 signed URL

  • Issues the request as the IAM user who created the URL (grants the same permissions)
  • Limited lifetime
  • Good for a small number of files
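
The common idea behind both kinds of signed URL is "a URL plus an expiry time plus a signature that proves neither was altered". The sketch below illustrates that idea with a plain HMAC over a shared secret. It is deliberately NOT the real CloudFront scheme (which signs a policy with a key pair) nor S3's presigned URL format; `SECRET`, `sign_url`, and `is_valid` are all hypothetical:

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # assumption: a shared secret, for illustration only


def sign_url(url: str, expires_at: int) -> str:
    """Append an expiry time and an HMAC signature (url must have no query)."""
    message = f"{url}?expires={expires_at}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{url}?{urlencode({'expires': expires_at, 'signature': signature})}"


def is_valid(signed_url: str, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parts = urlparse(signed_url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = hmac.new(SECRET, f"{base}?expires={expires_at}".encode(),
                        hashlib.sha256).hexdigest()
    return now < expires_at and hmac.compare_digest(signature, expected)
```

A URL signed with `expires_at=1000` is accepted at time 999, rejected at time 1000, and rejected at any time if someone edits the `expires` value, because the signature no longer matches.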
24
Q

Snowball | Snowball Edge

A

Snowball

Petabyte-scale data transport solution that physically moves data into and out of AWS, for a fraction of the cost of transferring it over a high-speed network.

  • 50/80 TB options
  • Super secure
  • Storage
  • Import/Export to/from S3

Snowball Edge

  • 100 TB
  • Same as Snowball
  • Also allows for compute functions
  • Portable AWS Suite

Snowmobile

  • 100PB
  • Truck-sized mobile data transfer
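
A quick back-of-the-envelope calculation shows why shipping a device can beat the network. The link speed is an assumed example, and the estimate ignores protocol overhead and assumes the link is fully used the whole time:

```python
def network_transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `terabytes` over a link, assuming full utilization."""
    bits = terabytes * 1e12 * 8                 # decimal terabytes -> bits
    seconds = bits / (megabits_per_second * 1e6)
    return seconds / 86_400                     # seconds per day


days = network_transfer_days(terabytes=100, megabits_per_second=1000)
print(round(days, 1))  # 9.3
```

Even on a dedicated 1 Gbps link, 100 TB takes more than nine days, and a slower or shared link stretches that to weeks or months.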
25
Q

Storage Gateway

A

Replicates data from your data center into the AWS Cloud

  • File Gateway -> Flat files stored on S3
  • Volume Gateway -> Cached Volumes/Stored Volumes
26
Q

Athena

A

Interactive query service which enables you to analyze and query data located in S3 using standard SQL.

  • Serverless. Pay per query
  • No infrastructure to set up
  • Works directly with data stored in S3

Can be used to

  • Query logs
  • Generate business reports
  • Analyze costs
  • Run queries on click stream data
27
Q

Macie

A

Security service that uses Machine Learning and NLP (Natural Language Processing) to discover, classify, and protect sensitive data stored in S3

  • Uses AI to recognize S3 objects that might contain PII
  • Dashboard/alerts/reporting
  • Can analyze CloudTrail logs

Notes

PII (Personally Identifiable Information) - personal data
